I’m happy to say that I accomplished all of the goals I set last week, and am in fact ahead of schedule at the moment. I was able to port my data over to the ECE machines and finish training there (with about a 13x speedup). I solidified both the weights and the architecture for the value network, then used the design space exploration from the value network to solidify the architecture for the policy network. Finally, I began testing the MCTS process locally; once I am sure it fully works, I will port it over to continue on the ECE machines as well.
Starting off with the relocation to the ECE machines: I was able to move 13 GB of training data (more than 3.5 million data points) over to my ECE AFS space so I could train the value network remotely. This had the added advantage of cutting training time by a factor of about 13, which gave me more freedom with the network architecture. The architecture I settled on took about 13 minutes per epoch on the ECE machine, meaning it would have taken ~170 minutes per epoch locally. That would have been impractical, as even a lower bound of 50 epochs would have taken about a week.
Secondly, my finalized architecture is described below.
As you can see, there are three parallel convolution towers, with kernel sizes of 3, 5, and 7, which help the network derive trends in differently sized subsections of the board. Each tower then has a flattening layer and a fully-connected dense layer. The three towers are concatenated together, giving a single data stream that passes through successive dense and dropout layers (the dropout to prevent overfitting), culminating in a single sigmoid output node, which provides the positional evaluation. This network was trained on 3.5 million data points, pulled evenly from over 60,000 expert-level Go matches. After training, the network was able to identify the winner of a game from a position 94.98% of the time, with a binary cross-entropy loss of 0.0886. This exceeded my expectations, especially considering that many of the data points come from the openings of matches, where it is considerably harder to predict the winner because few stones have been placed.
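For concreteness, here is a minimal Keras sketch of a network with this shape. The layer widths, dropout rates, and input encoding (the number of 19x19 feature planes) are placeholder assumptions on my part, not the finalized values:

```python
# Sketch only: widths, dropout rates, and the input planes are assumptions.
from tensorflow.keras import layers, models

def build_towers(inp):
    """Three parallel convolution towers (kernel sizes 3, 5, 7), each
    flattened and passed through a dense layer, then concatenated."""
    towers = []
    for k in (3, 5, 7):
        t = layers.Conv2D(32, kernel_size=k, padding="same",
                          activation="relu")(inp)
        t = layers.Flatten()(t)
        t = layers.Dense(128, activation="relu")(t)
        towers.append(t)
    return layers.Concatenate()(towers)

def build_value_net(planes=8):
    inp = layers.Input(shape=(19, 19, planes))
    x = build_towers(inp)
    # Successive dense and dropout layers to limit overfitting.
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    # Single sigmoid node: the positional evaluation (win probability).
    out = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```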
Using my design space exploration for the value network, I was able to solidify an initial architecture for the policy network. It will have the same convolutional towers, differing only in the number of nodes in the post-concatenation dense layers and in its output: a length-362 vector (one entry for each of the 361 board points, plus one for passing).
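Under the same assumptions as the sketch above, the policy network would reuse `build_towers` and swap only the head (again, the widths here are placeholders):

```python
def build_policy_net(planes=8):
    # Same towers as the value network; only the post-concatenation
    # head differs. The dense width is an assumption, not the tuned value.
    inp = layers.Input(shape=(19, 19, planes))
    x = build_towers(inp)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    # Length-362 softmax: one probability per board point (19x19 = 361)
    # plus one for passing.
    out = layers.Dense(362, activation="softmax")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model
```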
I have started testing MCTS locally, with success; once I am convinced everything works as expected, I will port it over to the ECE machines to continue generating training data for the policy network, in addition to tuning the value network. Fortunately, since in the first iteration of MCTS the policy network essentially evaluates all moves equally (a uniform prior), the training data it generates will remain valid for further training even if the policy network's architecture needs to change.
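To make the "evaluates all moves equally" point concrete, here is a stripped-down sketch of an AlphaGo-style PUCT selection and expansion step; the class and helper names are mine, and the exact selection rule is an assumption about my setup. With a uniform prior, the prior term is identical for every move, so search is driven entirely by the value network through Q and the visit counts, and the visit-count targets it produces stay usable even if the policy architecture changes:

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior      # P(s, a), from the policy network
        self.visits = 0         # N(s, a)
        self.value_sum = 0.0    # W(s, a); mean value is Q(s, a)
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT selection over an already-expanded node: argmax of Q + U."""
    total = sum(ch.visits for ch in node.children.values())
    def score(ch):
        u = c_puct * ch.prior * math.sqrt(total) / (1 + ch.visits)
        return ch.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def expand(node, legal_moves, priors=None):
    """First-iteration behavior: with no trained policy network yet,
    every legal move receives the same prior probability."""
    if priors is None:
        priors = {m: 1.0 / len(legal_moves) for m in legal_moves}
    for m in legal_moves:
        node.children[m] = Node(priors[m])
```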
In the next week, I plan to move MCTS over to the ECE machines and complete at least one iteration of the cycle: generate training data via MCTS, tune the value network and train the policy network, and repeat.
ABET: For the overall strength of the Go engine, we plan to test it by having it play against Go models of known strength, found on the internet. This will allow us to quantitatively evaluate its performance. However, the Go engine is made of two parts, the value and policy networks. Training performance gives me some insight into how these networks are working, but even with “good” results, I still test manually to make sure the models are behaving as expected. Examples of this include walking through expert games to see how the evaluations change over time, and measuring against custom-designed positions (some of which were shown in the interim demo).
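As one way to make that head-to-head testing quantitative, a measured win rate against an opponent of known rating can be converted into a rating estimate by inverting the standard Elo expected-score formula (a sketch; the function name is mine):

```python
import math

def elo_diff_from_winrate(win_rate):
    """Invert the Elo expected-score formula for 0 < win_rate < 1:
    E = 1 / (1 + 10**(-diff / 400))  =>  diff = -400 * log10(1/E - 1)."""
    return -400.0 * math.log10(1.0 / win_rate - 1.0)

# Example: beating a 1500-rated engine 75% of the time implies a rating
# difference of about +191, i.e. roughly 1691 for our engine.
print(1500 + elo_diff_from_winrate(0.75))  # ~1690.8
```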