Nathan’s Status Report For 10.28.23

In my previous weekly report, I mentioned that my goal for this week was to finish my expand() function and to find a dataset to train the initial values for my value network. I am happy to say that I have accomplished both, and have also finished the rest of the code required for MCTS. The dataset I am going to use is located here and contains over 60,000 games from professional Go matches played in Japan. For curious readers, all the code required for MCTS can be found on GitHub.

With the aforementioned dataset, I plan to begin training the value network as soon as possible (hopefully by Monday). The dataset is stored in the Smart Game Format (SGF), which records the sequence of moves rather than the board states those moves produce. As I need the board states themselves for training, I am currently working on a script to automatically process all 60,000 of these files, generating each board state and tagging it with the game result. These tagged states are the training data I require. Once this is finished, I can begin training, which involves not only the training runs over the dataset themselves but also some design-space exploration of the network architecture (number of nodes, types and number of layers, etc.). This will allow me to find a closer-to-ideal balance of accuracy and processing time (efficient simulations are helpful in training, but vital in actual use).
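As a rough illustration of the processing step described above, a minimal sketch might look like the following. It is not the actual script: the function names, the +1/-1 labelling of the result, and the regex-based parsing are all assumptions made for illustration. It only reads main-line ;B[..] and ;W[..] moves, skips games without a clear winner in the RE property, ignores old-style "tt" passes, and does not check for illegal moves such as suicide or ko.

```python
import re

BOARD_SIZE = 19
EMPTY, BLACK, WHITE = 0, 1, 2

# Matches move properties such as ";B[pd]" or ";W[dp]"; an empty "[]" is a pass.
MOVE_RE = re.compile(r";\s*([BW])\[([a-s]{2})?\]")
# Matches the result property, e.g. "RE[B+R]" or "RE[W+2.5]".
RESULT_RE = re.compile(r"RE\[([^\]]*)\]")


def neighbors(r, c):
    """Yield the on-board points orthogonally adjacent to (r, c)."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < BOARD_SIZE and 0 <= nc < BOARD_SIZE:
            yield nr, nc


def group_and_liberties(board, r, c):
    """Flood-fill the group containing (r, c); return its stones and liberties."""
    color = board[r][c]
    group, liberties, stack = {(r, c)}, set(), [(r, c)]
    while stack:
        cr, cc = stack.pop()
        for nr, nc in neighbors(cr, cc):
            if board[nr][nc] == EMPTY:
                liberties.add((nr, nc))
            elif board[nr][nc] == color and (nr, nc) not in group:
                group.add((nr, nc))
                stack.append((nr, nc))
    return group, liberties


def play(board, color, r, c):
    """Place a stone and remove any opposing groups left without liberties."""
    board[r][c] = color
    opponent = BLACK if color == WHITE else WHITE
    for nr, nc in neighbors(r, c):
        if board[nr][nc] == opponent:
            group, liberties = group_and_liberties(board, nr, nc)
            if not liberties:
                for gr, gc in group:
                    board[gr][gc] = EMPTY


def sgf_to_examples(sgf_text):
    """Return a list of (board_snapshot, label) pairs, one per move.

    The label is +1 if Black won and -1 if White won (an assumed convention).
    """
    result = RESULT_RE.search(sgf_text)
    if result is None or result.group(1)[:1] not in ("B", "W"):
        return []  # no usable result: skip this game
    label = 1 if result.group(1)[0] == "B" else -1
    board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    examples = []
    for color_char, coord in MOVE_RE.findall(sgf_text):
        if coord:  # SGF coordinates are column letter, then row letter
            col = ord(coord[0]) - ord("a")
            row = ord(coord[1]) - ord("a")
            play(board, BLACK if color_char == "B" else WHITE, row, col)
        examples.append(([line[:] for line in board], label))
    return examples
```

Each snapshot is a copy of the board, so later moves do not alter earlier examples. A production script would more likely use a dedicated SGF parser than a regex.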

This design-space exploration will also prove helpful for the policy network, as it will provide a baseline for the allowable complexity. (A higher number of nodes and layers will generally perform better, barring overfitting, but will take more time. Because the two networks can evaluate positions in parallel, the value network exploration will give me an estimate of the number of layers I can use in the policy network.)

Then once I have the parameters (architectural mainly) for the networks set, and the initial weights for the value network trained, I can immediately begin running simulations, as all of that infrastructure is complete. I should be running these simulations by the time of my report next week, and will aim to do my first training run on both networks with the data the simulations generate.

 

Hang’s Status Report For 10.28.23

This week, I was working on displaying the saved game states. To do this, I needed a component that opens a file picker, and it turns out the standard <input/> element (with type="file") does this in React. After the user selects a file, I have to read its contents into an array. The file contents follow this format: tile_type, move_number, repeated 361 times (19×19), with tile_type being empty, black, or white, and move_number being either the move number of the tile or -1 for an empty tile. Since the entire content is read in as a string, I used split to convert the string into an array. I also spent some time looking into React Router so that the site can have different pages, as I wanted to separate the page for live gameplay from the page for displaying saved games.
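To make the file format concrete, here is a small sketch of the split-and-pair step. The site itself is written in React, so this Python version is only an illustration of the logic. The comma delimiter, the "E"/"B"/"W" codes (borrowed from the 10.21.23 report), and the function name parse_saved_board are assumptions, not details confirmed by the actual file.

```python
def parse_saved_board(text, size=19):
    """Turn 'tile_type,move_number,...' text into a list of (tile_type, move_number).

    Assumes a comma-separated file and the tile codes "E", "B", and "W".
    """
    tokens = [token.strip() for token in text.split(",") if token.strip()]
    expected = 2 * size * size  # two fields per board point
    if len(tokens) != expected:
        raise ValueError(f"expected {expected} fields, found {len(tokens)}")

    tiles = []
    for i in range(0, len(tokens), 2):
        tile_type, move_number = tokens[i], int(tokens[i + 1])
        if tile_type not in ("E", "B", "W"):
            raise ValueError(f"unknown tile type: {tile_type!r}")
        tiles.append((tile_type, move_number))
    return tiles
```

Checking the field count up front means a truncated or corrupted file fails loudly instead of producing a partially filled board.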

I’m a little bit behind schedule since I wasn’t able to display the saved game states yet. This is because I had a pretty large project due for one of my classes this week, so I spent most of my time focusing on that project. I’ll spend a little more time this following week to finish up displaying the game states.

My deliverables for next week will be displaying the saved game states and getting multiple pages working for the site.

Team Status Report For 10.21.23

First of all, to address the major risk from our last report: we have made significant progress on the physical board, but we are not out of the woods yet. We have built the AutoCAD design and prepped for laser cutting our new 100 cm × 100 cm wooden plank. In addition, we have started assembling circuitry for testing purposes before making a final order for all our electrical components. We have not yet finished these tasks or started testing our software on this circuitry, but we plan to do so in the coming week. The risk here has evolved a little as well: we realized we need fire and short-circuit protection, so we have developed a plan to provide it with a combination of insulation and strategic spacing of components, but we still need to actually add that to the board.

 

A secondary risk from the reinforcement learning side is the inability to find an open-source database of expert Go games. We felt confident that such a database existed, but should it turn out not to exist (or the data is unclean, or it is in an unusable state, etc.), the value network for MCTS could not be trained to non-random weights before training commences. Fortunately, the mitigation for that comes in the form of MCTS itself, as the search generates more value network training data: every board state, plus whatever result the game it came from reached. However, this would harm the efficacy of training at first (the network would improve more slowly), and while we have enough slack for that possible time setback, it would be ideal to avoid it.

Other than the aforementioned insulation and spacing, there are no changes to the design, and everything is running roughly on schedule. The development of the website is on track: the feature for saving game history is finished, and we have started on displaying the game history. Our hardware development is on track as well, and the reinforcement learning model is almost ready for training.

Israel’s Status Report For 10.21.23

ABET

Some new tools I have already learned include laser cutting for making the wooden board. I also planned the AutoCAD design, which is a skill I had to refresh. I plan on learning more about Arduino coding with COM port communication, as I have previously worked more with STM32 processors and UART.

Tasks completed

I have received supplies for our team and have started laser cutting the board for our project. In addition, I have worked on prototyping the hardware: wiring up a subset of the hardware sensors to observe how their values change and to test the circuitry.

Progress status

I have made good progress on the embedded software, but I still need to test everything and ensure it is ready to communicate with the PC soon. I am on track but would like to be further ahead, given the initial hiccup with making our hardware board.

Tasks to complete

I will need to make the final order of electronics on Monday and start putting all the electronic components together. In addition, I will need to add electrical insulation to the board, along with a way of holding the circuit boards in place inside it, so that moving the board will not shift the circuit boards and cause errors in the future.

Hang’s Status Report For 10.21.23

This week, I focused on saving game states. Originally, I wanted to save the board for each move along with the recommended moves from the Go engine. However, I’ve decided against this since the text file would be unnecessarily large. Instead, I’ll only save “one board” flattened into an array of 361 elements (19×19). Each element in the array will be a tuple of (“E”, -1), (“W”, some move number), or (“B”, some move number), where “E” corresponds to empty, “W” to white, and “B” to black. I won’t be saving the recommended moves; instead, I will reconstruct the board for each move selected for display and then feed that board state to the Go engine to get the recommended moves. While this will take a little more processing time when displaying game history, it is worth it to avoid the memory cost of storing large games (games with many moves).
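The reconstruction step could be sketched roughly as below. The site itself is written in React, so this Python version only illustrates the logic. The function name board_at_move, the row-major flattening, and move numbers counting up from the first move are all assumptions. Because the saved array holds a single entry per point, this sketch also assumes that no stone on the saved board was captured or replayed at an earlier move.

```python
def board_at_move(tiles, move, size=19):
    """Rebuild the size x size board as it stood after the given move number.

    tiles is the flattened list of (tile_type, move_number) tuples, assumed to
    be in row-major order. Stones played after `move` are left off the board.
    """
    board = [["E"] * size for _ in range(size)]
    for index, (tile_type, move_number) in enumerate(tiles):
        if tile_type != "E" and 0 <= move_number <= move:
            board[index // size][index % size] = tile_type
    return board


# Usage on a tiny 2x2 example: only moves 1 and 2 are shown.
example = [("B", 1), ("W", 2), ("E", -1), ("B", 3)]
print(board_at_move(example, 2, size=2))
```

The resulting 2D board is what would then be passed to the Go engine to obtain recommended moves for that point in the game.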

My schedule is on track as I’ve finished saving game history. By next week, I should have the feature for displaying game history done.

While I have some frontend/UI experience as I have done a full stack internship before, most of my work was in backend. For this project, almost all of my work will be with React/frontend, and I’ll be creating new components with CSS which I haven’t done before. I’ll also be looking into how to do serial communication between the web page I’m creating and the Arduino board for getting the game states with Web Serial API.

Nathan’s Status Report for 10.21.23

As I mentioned in my previous status report, my goal for this week was to finish custom loss function implementation, then work towards getting the MCTS training data generation working, given that the framework was already there.

With regard to the former of the two goals, that work is all complete. I had actually misunderstood my needs: I don’t need a custom loss function, as the two networks are trained on mean-squared error (MSE) and binary cross-entropy (BCE) loss. Nevertheless, I have built the framework for training the two networks (policy and value), in addition to the data generation section of the MCTS code (where the board states and MCTS visit counts are stored as npz files).
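For reference, the two standard losses mentioned above can be written out in a few lines. This is a plain-Python sketch of the textbook definitions, not the project's training code; in PyTorch the equivalents are provided as torch.nn.MSELoss and torch.nn.BCELoss. The function names and the eps clamp are choices made for this illustration.

```python
import math


def mse_loss(predictions, targets):
    """Mean-squared error: the average of (prediction - target)^2."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


def bce_loss(probabilities, targets, eps=1e-12):
    """Binary cross-entropy: the average of -[t*log(p) + (1 - t)*log(1 - p)].

    Probabilities are clamped to [eps, 1 - eps] so log(0) is never evaluated.
    """
    total = 0.0
    for p, t in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probabilities)
```

Since both are built into the framework, no custom loss class is needed; the training loop only has to pass network outputs and targets to the chosen loss.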

With regard to the latter, there was a bit more involved than I originally anticipated. While I have built all the Go gameplay functionality needed for training, I have not yet finished the code for the MCTS expansion phase, where the leaf to expand is determined and its children are generated. That being said, I have built out all other functionality, and am working to finish the expand() function by Monday in order to stay ahead of schedule. A brief schematic of the training structure is shown below.
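For readers unfamiliar with the expansion phase, a generic sketch of it is given here. This is not the project's expand() function: the Node class, the PUCT-style scoring rule, the c_puct constant, and the legal_moves and apply_move callbacks are all assumptions chosen to show the general idea of walking down to a leaf and then generating its children. It also glosses over details such as flipping the value's sign between players.

```python
import math


class Node:
    """One position in the search tree."""

    def __init__(self, state, parent=None, prior=1.0):
        self.state = state
        self.parent = parent
        self.prior = prior      # prior probability of the move leading here
        self.children = {}      # move -> Node
        self.visits = 0
        self.value_sum = 0.0

    def is_leaf(self):
        return not self.children

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent, child, c_puct=1.5):
    """Mean value plus an exploration bonus that shrinks as the child is visited."""
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + exploration


def select_leaf(root):
    """Follow the highest-scoring child at each level until reaching a leaf."""
    node = root
    while not node.is_leaf():
        node = max(node.children.values(), key=lambda ch: puct_score(node, ch))
    return node


def expand(leaf, legal_moves, apply_move, priors=None):
    """Create one child per legal move from the leaf's state.

    legal_moves(state) and apply_move(state, move) are hypothetical callbacks
    supplied by the game code. priors maps move -> probability; when omitted,
    a uniform prior is used.
    """
    moves = list(legal_moves(leaf.state))
    for move in moves:
        prior = priors[move] if priors else 1.0 / len(moves)
        leaf.children[move] = Node(apply_move(leaf.state, move), parent=leaf, prior=prior)
    return leaf
```

In a full search loop, the expanded leaf would next be evaluated, and the result backed up through the parent links to update visits and value_sum.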

Once the expand() function is finished, the next step is finding a dataset of expert Go matches to use as training data for the value network before simulation begins. While this is not strictly necessary, and self-play could be used instead, giving the value network a strong foundation with expert-generated data improves the quality of the initial training data much faster. My goal is to have expand() finished and this dataset found by the end of the week. If that goes according to plan, I will be able to commence network training and MCTS generation immediately afterwards.

To accomplish my tasks, I have had to, and will have to, learn a few new tools. While not really a tool, Monte Carlo Tree Search was something I had to fully understand in order to program it correctly. More pertinently, I have never used PyTorch before, or worked with convolutional neural networks. While I have worked with similar technologies (fully connected deep neural networks and TensorFlow), I have had to do a lot of research into how to use them most effectively.

Nathan’s Status Report For 10.7.23

Now that we have fully settled on our transition to Go and how it will be implemented, I have been able to fully focus on my part of the project. Since our team presented our design on Monday, and I concluded my research into AlphaZero and MuZero last week, I had the rest of the week to focus solely on implementation.

Almost all of my time this week was spent on the Go simulation framework, as well as beginning to figure out how to set up the reinforcement learning architecture. With regard to the former, I worked together with Hang to make sure we have a clear plan for passing board information from the Arduino to my backend engine. From there, I implemented a backend representation of the board, along with functions that allow an outside controller (in this case the engine) to make a move for simulation purposes, as well as functions to update the position based on the information conveyed by the physical board. This is effectively all I need (along with basic rule checking, such as determining whether the game is over, which I have also implemented) to move on to the reinforcement learning architecture. The real challenge here is the custom loss functions defined in our design proposal (expected result optimization and policy vector normalization). I have never worked with custom loss functions in Python before, so I’ve done a huge amount of research into different ways to accomplish this. I decided to settle on PyTorch, as it is not only widely considered a leading deep learning framework in industry, but also extremely well supported in Python. I have started, but not completed, actually scripting these loss functions. I am taking my time to make sure they are not only correct but also efficient, since in conjunction with the MCTS simulation, training times could balloon rapidly with an inefficient implementation of either.

In the next week, I plan to finish these custom loss functions, then work on getting the in-training MCTS simulations to work. With the simulation framework already built, this shouldn’t require too much time.

Team Status Report For 10.7.23

The major risk that we are taking at the moment is spending more time on the implementation of the board than expected. With less support from Techspark than initially expected, we have had to focus on building a board of our own. To ensure this does not put us behind schedule, we plan to dedicate more time next week to catching up. If the board implementation does require more time to build, which is neither planned nor expected, we will look for external resources that may help us build the physical board, or failing that, for possible substitutes for a custom board.

We currently have no changes in our design other than our shift of projects from the earlier week.

Schedule update:

One principle of engineering that we used is modularity. One specific example in the site design is the code for visualizing the Go boards. Go boards are typically 19×19, but the code is modular such that it can visualize any N×N board size. We attempted to make our code as modular as possible because modularity typically reduces complexity and makes parts more reusable.

In addition, we have used skills from electrical engineering and principles of electromagnetic physics when deciding which components to use, as many components have limitations. The principles applied include Ohm’s law and Kirchhoff’s current law (KCL) for our simple circuitry design, as well as knowledge of component characteristics.

With the custom board development, one of the concerns we dealt with as engineers was the dimension requirements and constraints of the box that would hold our electronics. This board development required skills in product engineering to make sure the board was both easy and safe for users to use.

Hang’s Status Report For 10.7.23

Instead of focusing on game history, I decided to flip the order and focus on visualization of the game board first. This ended up being a little trickier than I thought because I originally planned to draw the board with divs as tiles, but while I was working through it, I remembered that the pieces fall on the intersections of the lines, not in between the lines, and there was no good way of drawing pieces slightly off of a div without messing up the entire grid (React expects children to fall inside their parent containers). Instead, I had each “tile” fall under one of 9 categories: top left corner, bottom left corner, bottom right corner, top right corner, top, left, right, bottom, and middle. This way, I can place the pieces inside each tile, instead of attempting to put the pieces at the intersection of 4 tiles.
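The nine-way classification can be expressed as a small function of the tile's row, column, and the board size. The site itself is written in React, so this Python version is only a sketch of the logic. The function name tile_category and the category strings are assumptions, and the board is assumed to be at least 2×2.

```python
def tile_category(row, col, size=19):
    """Classify a tile by where it sits on a size x size board.

    Row 0 is the top edge and column 0 is the left edge.
    """
    top, bottom = row == 0, row == size - 1
    left, right = col == 0, col == size - 1

    if top and left:
        return "top_left_corner"
    if top and right:
        return "top_right_corner"
    if bottom and left:
        return "bottom_left_corner"
    if bottom and right:
        return "bottom_right_corner"
    if top:
        return "top"
    if bottom:
        return "bottom"
    if left:
        return "left"
    if right:
        return "right"
    return "middle"
```

Each category then decides which grid-line segments to draw inside the tile (for example, a top-left corner tile only draws lines going right and down from its center), so a stone placed at the center of the tile lands on an intersection. Because size is a parameter, the same function works for any N×N board.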

My progress is currently on schedule; I’ve just flipped the order, working on the visualization of the board before game saving.

By next week, I expect to have game saving done, and also start to work on game loading.

Israel’s Status Report 10.7.23

Tasks accomplished

This week, I made less progress than I would have liked on the embedded software side of things. I have ordered half of the parts, but due to some setbacks with the wood shop, I prioritized the physical board implementation over everything else.

I made AutoCAD and SolidWorks designs with the help of a Techspark technician for laser cutting and marking the top of the board as required. I have also made an overall model of the board itself for documentation purposes and to help when it comes time to build the board.

(Initial AutoCAD design in DWG format)

Progress status

Because I prioritized board building, I am slightly behind on the embedded software for data retrieval trials. I am still on track, as front-loading these initial physical board tasks should leave me time for that work next week.

Tasks to complete

I will need to prioritize the data retrieval trial code next week. In addition, if I can get the parts early, it would be ideal to build the board during class time to ensure we have it ready when we return from fall break. This is an additional, unplanned task, as we expected Techspark to provide more support with building the board. Again, this should not be a challenging task given our CAD designs and the designs required for laser cutting.