As I mentioned in my previous status report, my goal for this week was to finish the custom loss function implementation and then work toward getting the MCTS training data generation working, given that the framework was already in place.
With regard to the former goal, that work is complete. I had actually misunderstood my needs: I do not need a custom loss function, as the two networks are trained with mean-squared error (MSE) and binary cross-entropy (BCE) loss. Nevertheless, I have built the framework for training the two networks (Policy and Value), in addition to the data generation section of the MCTS code (where the board states and MCTS visit counts are stored as .npz files).
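To make this concrete, below is a minimal sketch of how that data generation step might store board states and visit counts in an .npz archive, and how the two standard losses are set up in PyTorch. The file name, tensor shapes, and board encoding are placeholder assumptions for illustration, not the project's actual code.

```python
import numpy as np
import torch
import torch.nn as nn

# Assumed shapes: a batch of 128 encoded 19x19 board states, and visit
# counts over all 361 board points plus one entry for "pass".
board_states = np.zeros((128, 19, 19), dtype=np.float32)
visit_counts = np.zeros((128, 19 * 19 + 1), dtype=np.float32)

# Store one batch of generated data as a compressed .npz archive.
np.savez_compressed("selfplay_batch_000.npz",
                    states=board_states, visits=visit_counts)

# Reload later for training.
data = np.load("selfplay_batch_000.npz")
states = torch.from_numpy(data["states"])
visits = torch.from_numpy(data["visits"])

# The two standard losses described above: MSE for the value network,
# binary cross-entropy for the policy network. Note that BCELoss expects
# targets in [0, 1], so raw visit counts would first be normalized.
value_loss_fn = nn.MSELoss()
policy_loss_fn = nn.BCELoss()
```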
With regard to the latter, there was more involved than I originally anticipated. While I have built all the Go gameplay functionality needed for training, I have not yet finished the code for the MCTS expansion phase, where the leaf to expand is selected and its children are generated. That said, I have built out all other functionality and am working to finish the expand() function by Monday in order to stay ahead of schedule. A brief schematic of the training structure is shown below.
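For reference, here is a hedged sketch of what an expansion step of this kind typically looks like: once a leaf is selected, one child is created per legal move, each seeded with a prior from the policy network. The Node fields and the helper parameters (legal_moves, priors, apply_move) are illustrative assumptions, not my actual API.

```python
class Node:
    def __init__(self, state, prior=1.0, parent=None):
        self.state = state          # board position at this node
        self.prior = prior          # prior probability from the policy net
        self.parent = parent
        self.children = {}          # move -> Node
        self.visit_count = 0
        self.value_sum = 0.0

    def is_leaf(self):
        return not self.children

def expand(leaf, legal_moves, priors, apply_move):
    """Attach one child per legal move to the selected leaf.

    legal_moves: iterable of moves playable from leaf.state
    priors:      dict mapping move -> prior probability
    apply_move:  function (state, move) -> resulting state
    """
    for move in legal_moves:
        next_state = apply_move(leaf.state, move)
        leaf.children[move] = Node(next_state,
                                   prior=priors.get(move, 0.0),
                                   parent=leaf)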
Once the expand() function is finished, the next step is finding a dataset of expert Go matches to use as pre-training data for the value network. While this is not strictly necessary, since self-play alone could serve the same purpose, pre-training the value network on expert-generated games gives it a strong foundation and should raise the quality of the initial training data much faster than self-play alone. My goal is to have expand() finished and this dataset found by the end of the week. If that goes according to plan, I will be able to commence network training and MCTS data generation immediately afterwards.
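A rough sketch of what this pre-training step could look like follows, assuming the expert games have already been converted into (board state, game outcome) pairs; the function name, tensor shapes, and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

def pretrain_value_network(value_net, states, outcomes,
                           epochs=10, lr=1e-3):
    """Fit the value network to expert game outcomes using MSE.

    states:   float tensor of encoded board positions
    outcomes: float tensor of final game results, e.g. in [-1, 1]
    """
    optimizer = torch.optim.Adam(value_net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        predictions = value_net(states).squeeze(-1)
        loss = loss_fn(predictions, outcomes)
        loss.backward()
        optimizer.step()
```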
To accomplish my task, I have had to learn, and will continue learning, a few new tools. While not really a tool, I had to fully understand Monte Carlo Tree Search in order to program it correctly. More pertinently, I had never used PyTorch before or worked with convolutional neural networks (CNNs). While I have worked with similar technologies (fully connected deep neural networks and TensorFlow), I have had to do a lot of research into how to use them most effectively.
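As an example of what I have been studying, here is a minimal PyTorch CNN over a Go board; the channel counts and layer depth are placeholder assumptions and not my final architecture.

```python
import torch
import torch.nn as nn

class TinyBoardNet(nn.Module):
    """Minimal illustrative CNN over a single-plane 19x19 board."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32 * 19 * 19, 1)

    def forward(self, x):
        x = self.conv(x)
        # tanh squashes the output to [-1, 1], a common value-head range
        return torch.tanh(self.head(x.flatten(1)))

# Example forward pass on a dummy batch of 4 board states.
net = TinyBoardNet()
out = net(torch.zeros(4, 1, 19, 19))
print(out.shape)  # torch.Size([4, 1])
```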