Team Status Report For 9.30.23

The major risk we currently face is transitioning our project’s goals and requirements without losing a large amount of progress. As described in more detail below, we are moving from a website for playing Mancala, combined with an engine, to a physical Go board that displays engine recommendations and stores game histories locally for further analysis. This adds a hardware (physical board) requirement to our project. Fortunately, a large amount of our research is applicable to the new project, and some of the software we had already begun to write can be adapted to meet our new needs. Nevertheless, we need to make sure we are not over-committing, and we need a plan to make up the small amount of time now lost. To do so, we have defined a new MVP, adjusted our human-computer interaction requirements, made plans to adapt as much of our pre-existing work as possible, and adjusted our schedules (condensing some earlier steps) so that, while we will have to commit to extra work for a few weeks, we will not be in a crunch at the end.

As mentioned above, we have made monumental changes to our project and its design. After receiving helpful feedback from TAs and students regarding our proposal, we realized that our use-case was not strong enough and our project did not have the requisite breadth. These two major flaws led us to change our project focus to Go instead. Because of the competitive gaming community, there is much more demand for a Go training product, and the Go equivalent of a chess DGT board (which is one of the services our project will provide) has not been created. This switch will also incorporate a hardware component that records players’ games and allows our website component to show analysis of these already-played games. These changes have forced us to rearrange our schedule a bit (as seen below), but that, combined with the other mitigating actions described above, will allow us to stay on target.

New Schedule:

Hang’s Status Report For 9.30.23

After our proposal, we decided to switch our project so that we could cover 3 ECE areas instead of 2, in case the machine learning component of our project fails. With our new project, my role changes slightly. I’m still working on the site (now solely, since Israel will work on the Go board, which is our embedded system), but we decided that we don’t need to host our site externally. Instead, we will just host it locally.

With the change in our project, we won’t need a dedicated backend server, so we won’t need AWS, and most of the development will just be with React and JavaScript. I got started on setting up this React web-app by installing Node and then creating the web-app. Besides setting up the initial site, this week was spent going over the different project ideas after we got our feedback from the proposal, and then working on the design presentation once we finalized our project idea.

The schedule is on track, and by next week, I expect saving game history to be finished, and I’ll start to work on the visualization of saved game history.

My role on this project is entirely software, but most of the software classes I took here are lower-level systems courses, and this project sits at a higher level of abstraction. One class that would be relevant is 15-122, since I may use some data structures for the site design. For example, for the visualization of the game history, I want to show the board state for a specific move, so I’ll store the game states of that specific game in a hash table, with the move number as the key and the game state as the value.
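The move-number-to-game-state table described above can be sketched as follows. This is only an illustration in Python (the site itself is planned in React and JavaScript); the names `Board`, `empty_board`, `place_stone`, and `record_game` are hypothetical, the 9x9 size is an arbitrary choice, and the sketch ignores captures and other Go rules.

```python
from typing import Dict, Iterable, Tuple

# Assumption: a board snapshot is an immutable tuple of rows,
# with "." for empty, "B" for black, and "W" for white.
Board = Tuple[Tuple[str, ...], ...]
Move = Tuple[int, int, str]  # (row, col, color)


def empty_board(size: int = 9) -> Board:
    """Return an empty size x size board."""
    return tuple(tuple("." for _ in range(size)) for _ in range(size))


def place_stone(board: Board, row: int, col: int, color: str) -> Board:
    """Return a new snapshot with one stone added (no capture logic)."""
    rows = [list(r) for r in board]
    rows[row][col] = color
    return tuple(tuple(r) for r in rows)


def record_game(moves: Iterable[Move], size: int = 9) -> Dict[int, Board]:
    """Build a hash table mapping move number -> board state after that move."""
    board = empty_board(size)
    history: Dict[int, Board] = {0: board}  # move 0 is the empty board
    for move_number, (row, col, color) in enumerate(moves, start=1):
        board = place_stone(board, row, col, color)
        history[move_number] = board
    return history


history = record_game([(2, 2, "B"), (6, 6, "W")])
print(history[1][2][2])  # state after move 1 has the black stone
```

Because each snapshot is immutable, looking up any move number returns the board exactly as it stood at that point, at the cost of storing one full board per move.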

Nathan’s Status Report For 9.30.23

As is mentioned in our team status report for the week, we have transitioned our project away from Mancala and into the game Go instead. Fortunately, the division of labor remains relatively similar and I am still broadly responsible for the training of a reinforcement-learning engine that will eventually be used to give move suggestions and positional evaluations to our users.

Accordingly, almost all of the research I did last week is still applicable, as the same self-play techniques can be used and, in fact, have been proven to work in the cases of AlphaZero and MuZero. After making the transition to Go this week, I had to do a quick catch-up on the rules and gameplay, but after that, along with the two above-linked papers, the research phase of my project has come to a close.

Of course, with the design presentations coming up next week, a good amount of my time this week was devoted to preparing for that as well, and the rest was spent building the platform for the reinforcement learning. The current consensus for optimal Go engine creation is a combination of deep learning and Monte Carlo Tree Search (MCTS). MCTS works by using self-play to simulate many game paths from a given position and choosing the move that provides the best overall outcome. I have started work on creating the framework to perform these simulations as quickly as possible (holding game state, allowing the candidate engine to make moves against itself and returning the new board, etc.).
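To make the simulate-and-choose loop concrete, here is a minimal sketch of plain MCTS (UCT selection with random rollouts). It is not the framework described above: to stay short it plays a toy Nim game (take 1 to 3 stones, whoever takes the last stone wins) instead of Go, and it has no neural network guiding the search. All names (`Node`, `rollout_result`, `mcts_best_move`) and the exploration constant are assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MAX_TAKE = 3  # toy Nim rule: take 1-3 stones; taking the last stone wins


def legal_moves(stones: int) -> List[int]:
    return list(range(1, min(MAX_TAKE, stones) + 1))


@dataclass
class Node:
    stones: int                        # stones left in this position
    parent: Optional["Node"] = None
    children: Dict[int, "Node"] = field(default_factory=dict)
    visits: int = 0
    wins: float = 0.0                  # wins for the player who moved INTO this node


def select_child(node: Node, c: float) -> Node:
    """UCB1: balance average win rate against an exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout_result(stones: int, rng: random.Random) -> float:
    """Random playout; 1.0 if the player who just moved into this position wins."""
    just_moved_wins = True  # with 0 stones left, the last mover has already won
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        just_moved_wins = not just_moved_wins
    return 1.0 if just_moved_wins else 0.0


def mcts_best_move(stones: int, iterations: int = 500, c: float = 1.4, seed: int = 0) -> int:
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while node.stones > 0 and len(node.children) == len(legal_moves(node.stones)):
            node = select_child(node, c)
        # 2. Expansion: add one untried move, if any.
        if node.stones > 0:
            untried = [m for m in legal_moves(node.stones) if m not in node.children]
            move = rng.choice(untried)
            node.children[move] = Node(node.stones - move, parent=node)
            node = node.children[move]
        # 3. Simulation: random self-play to the end of the game.
        result = rollout_result(node.stones, rng)
        # 4. Backpropagation: flip the perspective at every level.
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # The most-visited move is the recommendation.
    return max(root.children, key=lambda m: root.children[m].visits)


print(mcts_best_move(3))  # taking all 3 stones wins immediately
```

For Go, the same four phases apply, but the game-state functions would be replaced by the real board logic, which is why a fast simulation framework matters.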

With regard to classwork helping me prepare for this project, I think the two ECE classes that helped the most are 18213 and 18344. I have not taken any classes in reinforcement learning or machine learning in general, but the research I did in CyLab with ECE Prof. Vyas Sekar certainly helped me a huge amount, both in the subject matter of the research (deep learning) and in the experience of reading scholarly papers to fully understand techniques I am considering using. What 18213 and 18344 provided was the “correct” way of thinking about setting up my framework. I need my simulations to be as efficient as possible while also maintaining accuracy, and I need my system to be as robust as possible, as I will need to make frequent changes, tune parameters, etc. These classes, combined with the research papers I read last week and the two papers above, are what influenced my portion of the design the most.

Finally, in the next week I plan to finish the Go simulation framework, and begin work on setting up the reinforcement learning architecture, to begin training in the week after. MCTS simulation is quite efficient, but with the distinct limit on computational resources I have, allocating proper time is vital.

Israel’s Status Report for 9.23.2023

Tasks accomplished

I have started ramping up on JavaScript and React with instructional videos, including two from Mosh Hamedani: an older but more thorough React tutorial (1) and a newer React tutorial (2). I followed along with these videos to practice using React in preparation. I have also looked through Mozilla’s documentation for JavaScript itself, which was helpful for function usage.

In addition, I looked for some UI libraries and packages to use with React that might be helpful. One I am focusing on is Blueprint, because of its very well-made documentation and its CSS-based customization, which might prove beneficial given my prior experience with CSS. Others of interest that might be used are as follows:

Progress status

Finished ramping up on JavaScript and React usage for this week’s plan.

Tasks to complete

I plan to do a quick overview of CSS, as well as HTML, just to be more familiar with the format in case it proves useful in the future.

WebSocket familiarity is my number one priority for interfacing with the backend. I plan to ramp up much more on WebSocket usage and knowledge.

In addition, I plan to start designing the basic pages and components of the Mancala frontend, beginning with an interface and a blueprint of the planned functions and pages. I also plan to use the framework from my learning ramp-up for my codebase.

If everything goes well and plans don’t stray, I plan to have a codebase, with all my TODOs and file locations organized, so that implementation can start smoothly.

Team Status Report for 9.23.2023

One of our biggest risks currently is the possibility that the planned minimax strategy will not prove effective as an initial opponent for the self-play RL model (be it too strong or too weak). If it is too weak, the platform that we are building it on can be extended to look more than 2-ply into the future. While this will increase training time (as the calculations will take longer to compute), it will provide a stronger opponent. On the other hand, if it proves too strong, we have other, even more basic strategies on standby, such as 1-ply maximization (just maximize the number of stones captured in one move, ignoring the possible responses) or even a random agent.
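The two fallback opponents mentioned above are simple enough to sketch without any game logic. In this illustration, `captured_by` is a hypothetical callback that reports how many stones a given move would capture; in a real platform it would be computed from the board. The function names are likewise assumptions.

```python
import random
from typing import Callable, Sequence


def random_agent(legal_moves: Sequence[int], rng: random.Random) -> int:
    """Weakest fallback: pick any legal move uniformly at random."""
    return rng.choice(list(legal_moves))


def one_ply_agent(legal_moves: Sequence[int], captured_by: Callable[[int], int]) -> int:
    """1-ply maximization: pick the move that captures the most stones,
    ignoring the opponent's possible responses. Ties go to the first move."""
    return max(legal_moves, key=captured_by)


# Hypothetical capture counts for three legal pits.
captures = {0: 1, 2: 4, 5: 4}
print(one_ply_agent([0, 2, 5], captures.get))  # 2 (first of the tied best moves)
```

Keeping every agent behind the same "legal moves in, one move out" shape would let the training loop swap opponents of different strengths without other changes.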

With regard to changes, a possible problem pointed out during the presentation was that some variants of Mancala are solved. While we had always accounted for this, we had not made clear that the version we were building for our website was an unsolved variant (the seven stone variant of Kalah Mancala). Some players use other ways to get around the solved aspect of the game, such as switching positions after the first move, but those add unnecessary complication to the game, raising the barrier to entry, especially for younger players. This will not cause any increase in price or changes to the system itself, but it does specify the requirements a bit better. Other than that, there have been no changes to the system or structure of the project.

For right now, everyone is on schedule, so no changes are necessary there.

The effects our project will have on public safety, the economy, or the environment are relatively minimal. Of course, we are using a small amount of computational power on training the RL model and maintaining our servers, but in the grand scheme of things it is next to nothing. That being said, our project certainly has a non-trivial effect socially, and could possibly improve mental health for some users as well. Multiplayer games are inherently social, and an online platform for them provides an outlet for users to connect with other like-minded individuals. The fact that there is no major website dedicated to Mancala makes it all the more important. Beyond meeting new people and possible friends, our project would also allow friends to play each other directly. For friendships where it is difficult for the participants to see each other (long-distance, etc.), this can help strengthen them. Finally, this may only be relevant a tiny percentage of the time, but the small amount of social interaction from online gaming can make a significant difference in mental health. It is all too easy to shut yourself away and not interact with anyone, and as this effect compounds it becomes harder and harder to break out of it. Online social interaction can be a small step in the right direction, and our platform could provide that.

Nathan’s Status Report for 9.23.2023

This week I did a combination of research on reinforcement learning, and opponent/platform setup to enable the RL model training.

With regard to research, I want to understand as much as possible about reinforcement learning before I start the process of actually building a Mancala RL model. Through preliminary research our group decided that a self-play method of training would be best, so I read a number of papers and tutorials on both the theory of self-play RL and the logistics of putting it into practice in Python. A few of the resources I used are shown below:

OpenAI Self-Play RL

HuggingFace DeepRL

Provable Self-Play (PMLR)

Towards Data Science

Python Q-Learning

In order to train the self-play RL model, I must have a competent opponent for the model to start off playing against, before it can train against previous iterations of itself. If I choose too strong a starting opponent, the model will not get enough positive reinforcement (as it will almost never win), and if I choose too weak a one, the reverse is true. As such, we will start with a relatively simple minimax strategy that looks two “ply” (single player turns) into the future. However, to build this strategy, I need a platform for the game to be played on (so the RL model can play the minimax opponent). This week I started building this platform, programming all game rules and actions, and a framework where two separate players can interact on the same board. I then implemented unit tests to make sure all game actions were functioning as they should. With this now in place, I have begun programming the minimax strategy itself. This means I am on schedule, and hopefully will have the minimax available to start training within the week.
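As a rough illustration of how a game platform and a depth-limited minimax opponent fit together, here is a minimal sketch. It is not the platform described above. It assumes the common Kalah rules (skip the opponent's store when sowing, extra turn for ending in your own store, capture when the last stone lands in your own empty pit opposite a non-empty pit, and the game ending with a sweep when either side is empty), a 14-slot list layout, a store-difference evaluation, and a depth that counts every single move, including extra turns. All function names are hypothetical.

```python
from typing import List, Tuple

PITS = 6
STORE = {0: 6, 1: 13}  # assumed layout: pits 0-5, store 6, pits 7-12, store 13


def new_board(stones_per_pit: int = 7) -> List[int]:
    board = [stones_per_pit] * 14
    board[STORE[0]] = board[STORE[1]] = 0
    return board


def pits_of(player: int) -> range:
    start = 0 if player == 0 else 7
    return range(start, start + PITS)


def legal_moves(board: List[int], player: int) -> List[int]:
    return [i for i in pits_of(player) if board[i] > 0]


def apply_move(board: List[int], player: int, pit: int) -> Tuple[List[int], int]:
    """Return (new board, next player) without mutating the input board."""
    board = board[:]
    stones, board[pit] = board[pit], 0
    pos = pit
    while stones > 0:
        pos = (pos + 1) % 14
        if pos == STORE[1 - player]:
            continue  # never sow into the opponent's store
        board[pos] += 1
        stones -= 1
    # Capture: last stone in an own, previously empty pit opposite a non-empty pit.
    if pos in pits_of(player) and board[pos] == 1 and board[12 - pos] > 0:
        board[STORE[player]] += board[12 - pos] + 1
        board[pos] = board[12 - pos] = 0
    next_player = player if pos == STORE[player] else 1 - player
    # Game end: if one side is empty, the other side's stones go to its own store.
    for side in (0, 1):
        if not legal_moves(board, side):
            for i in pits_of(1 - side):
                board[STORE[1 - side]] += board[i]
                board[i] = 0
            break
    return board, next_player


def minimax(board: List[int], player: int, depth: int, me: int) -> int:
    """Store difference from `me`'s point of view after looking `depth` moves ahead."""
    moves = legal_moves(board, player)
    if depth == 0 or not moves:
        return board[STORE[me]] - board[STORE[1 - me]]
    values = [minimax(*apply_move(board, player, pit), depth - 1, me) for pit in moves]
    return max(values) if player == me else min(values)


def best_move(board: List[int], player: int, depth: int = 2) -> int:
    def value(pit: int) -> int:
        return minimax(*apply_move(board, player, pit), depth - 1, player)
    return max(legal_moves(board, player), key=value)


print(best_move(new_board(7), player=0, depth=2))
```

Raising `depth` gives the stronger, slower opponent discussed in the team report, while `depth=1` reduces this to a greedy one-move lookahead on the store difference.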

Hang’s Status Report for 9.23.2023

Since I was the presenter for the proposal of our project, the first half of the week was dedicated to practicing the script.

After our project was presented, I put my focus into finding the tutorials that would be necessary for setting up the infrastructure of our project: namely, how to set up AWS Lambda as our compute platform and how to set up a DynamoDB database. I came across this tutorial: Building a serverless multi-player game that scales | AWS Compute Blog (amazon.com), which seems perfect for our use-case. This tutorial builds a trivia game using Lambda functions with both HTTP endpoints and WebSocket endpoints, and uses DynamoDB tables as its database. It uses Vue.js as its frontend, but we should be able to easily switch to React.

I can use this tutorial to set up the necessary infrastructure for our project and test out the endpoints that they have created. Once I get an understanding of how their code works, I can start working on our project’s backend, starting with the game logic.

The progress is currently on schedule. By the end of next week, the infrastructure should be set up, and I should be able to test communication between the demo’s frontend and backend. I should also start writing game logic for our Mancala gameplay.