Team Status Report For 12.9.23

As I’m sure is common among many groups, this week our team was focused almost exclusively on getting our project ready for demo. On that front:

Engine: The engine is fully operational with high-accuracy value and policy networks. In the demo version, a simulation depth of 25 is used, meaning the engine is looking up to 25 moves into the future (though the average tends to be between 8 and 10, with a minimum of 4). The engine has been fully integrated into the web app, and both real-time and historical analysis work as intended.

Web-App: The web-app is fully operational on its own. Integration with the engine has been completed, as mentioned earlier; however, integration with the physical board is not quite complete. That being said, the analysis function works as intended, and communication with the physical board is almost complete, pending a full debug of said board.

Physical Board: The physical board is almost fully complete. The LEDs have been soldered, meaning the only remaining issue is debugging 2 of the muxes that are having slight issues. This should be done easily before demo, as it is a known issue that has already been fixed on 2 of the other muxes. Full integration with the web-app has been done, but the overall product has not been fully tested, as that is not possible without working muxes. That said, with the outputs currently coming from the muxes, the web-app and engine work exactly as expected, so there is a 99.9% chance no more debugging will be needed once the final two muxes are done.

Once the muxes are debugged we are ready for demo, and the next week will be spent on the video and on the final report.

ABET:

Unit Tests:

Engine:
Basic board positions to ensure all suggestions are legal moves. Found an error with how passes were signified on a 9×9 board; pass was changed to be signified by an index of 81 rather than 361.
Complex board positions to ensure the engine maximized or minimized depending on whose turn it is (minimize for black, maximize for white). Found an error where exploration wasn’t inverted, so for black’s moves the optimization wasn’t working correctly. Fixed.

Physical Board:
Sensors tested manually when disconnected from the board (i.e. only connected to the Arduino) to make sure they would fit our purposes. These ended up having thresholding issues, which prompted our change from 1 MΩ to 10 MΩ resistors in series with the photoresistors.
Tested a multitude of board positions to ensure the correct board state is sent to the Arduino. Found an issue with the muxes, which is currently being debugged.
(Each intersection was tested individually)
Tested each individual button to make sure each signalled the web-app in the expected fashion.

Web-app:
Tested input states to make sure they were rendered as intended.
Tested responses to RPi signals (from Arduino) to ensure intended behavior.
Pytest suite for endpoints and web-socket.
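As a rough illustration of what those tests look like (the module name, endpoint path, and board encoding below are placeholders, not our exact code):

```python
# test_backend.py -- illustrative pytest sketch; `server`, `/recommend`, and
# the board encoding are placeholders for our actual backend objects.
import pytest
from server import app, socketio  # hypothetical module layout


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as client:
        yield client


def test_recommend_returns_legal_moves(client):
    # 9x9 board encoded as a flat list: 0 = empty, 1 = black, 2 = white.
    empty_board = [0] * 81
    resp = client.post("/recommend", json={"board": empty_board, "turn": "black"})
    assert resp.status_code == 200
    moves = resp.get_json()["moves"]
    # Every suggestion must be an empty intersection or the pass index (81).
    assert all(m == 81 or empty_board[m] == 0 for m in moves)


def test_websocket_connection():
    ws = socketio.test_client(app)
    assert ws.is_connected()
    # Emitting a board update should not raise; exactly what the server echoes
    # back depends on the handler, so we only sanity-check the response list.
    ws.emit("update_board", {"board": [0] * 81})
    assert isinstance(ws.get_received(), list)
```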

Integration Tests:
Web-App -> Engine:
Tested interactions to make sure a prompt suggestion is delivered while also making sure all suggested moves are legal (implying the sent state is processed correctly). This identified a secondary issue with how passes were being conveyed which was again fixed.
Timing analysis was performed to be sure that the engine and communication components complied with our use case requirements.
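The timing checks themselves were simple; roughly along these lines (the endpoint and payload are placeholders; 2 seconds is our use-case threshold):

```python
# Illustrative timing check -- the /recommend endpoint and payload shape are
# placeholders, and 2.0 s reflects our use-case requirement.
import time
import requests

payload = {"board": [0] * 81, "turn": "black"}

start = time.perf_counter()
resp = requests.post("http://localhost:5000/recommend", json=payload, timeout=5)
elapsed = time.perf_counter() - start

assert resp.status_code == 200
print(f"engine + communication round trip: {elapsed:.2f} s")
assert elapsed < 2.0  # use-case requirement
```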

Board->Web-App:
Extensive unit tests consisting of setting up the board and making sure the exact state on the board is the exact state on the web app. This caught issues with the muxes, which are currently being debugged, so I can’t give a detailed writeup on the solution yet.

Nathan’s Status Report For 12.9.23

I am happy to say that not only is the engine fully ready for demo multiple days ahead of time, but also that the improvements from the MCTS training run in the past week were much larger than I originally anticipated.

Before switching to a 9×9 board, the original (initially trained) version of the policy network had just above 1% accuracy. While that sounds extremely low, it in fact meant the policy network identified the “best” move in the position about 4 times more often than randomly guessing (with 362 options), and there is a much higher chance that the “best” move is in its top 5 or 10 suggestions.

The second version, trained exclusively on 9×9 data generated from my MCTS runs, had an accuracy of around 9%, which represents performance about 8 times better than randomly guessing, as for 9×9 there are only 82 possible selections, not 362. With each increase in accuracy, the expansion factor of the tree search (i.e. how many candidate moves are considered from each node) can be reduced, as the probability that the best moves are some of the stronger recommendations of the policy network is higher.
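To make the “expansion factor” idea concrete, here is a rough sketch of how candidate moves could be pulled from the policy network’s output (the encodings and function are illustrative, not the engine’s exact code):

```python
import numpy as np

def candidate_moves(policy_probs, board, k):
    """Pick the k most promising legal moves from a policy output.

    policy_probs: length-82 softmax vector (81 intersections + pass at index 81).
    board: flat length-81 array, 0 = empty. Illustrative sketch only; the real
    engine's encodings differ in the details.
    """
    probs = np.asarray(policy_probs, dtype=float).copy()
    occupied = np.flatnonzero(np.asarray(board) != 0)
    probs[occupied] = 0.0                  # mask occupied intersections
    return np.argsort(probs)[::-1][:k]     # best-first move indices

# A more accurate policy network lets us shrink k (the expansion factor)
# while keeping the true best move among the candidates.
```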

Finally, with the third version, I was expecting something around 20% accuracy; however, much to my surprise, the measured accuracy (on separate validation data) was just over 84%. Though this is probably not due to overtraining (the network architecture has multiple mechanisms to prevent it, and the performance held on “unseen” validation data), I suspect some of this marked increase is due to the policy network essentially predicting the value network’s response, with an extra layer of abstraction. That is, if the policy network identifies which traits in a position the value network “likes”, its suggestions will line up more closely with the value network’s evaluations. However, this is indeed the point of MCTS: assuming a degree of accuracy in the value network (which I think is safe given the abundance of training data and past performance), the policy network’s job is to identify candidate moves that are likely to lead to the best possible positions. Thus fewer candidate moves need to be explored, and deeper, more accurate simulations can be run in the same amount of time.

This 84% model is the policy network that will be used by our engine during demo; however, I am starting a simulation run tonight, whose results (if better) will be used in our final report. The engine is fully integrated into the web app backend and works exactly as intended.

As an added note, the code for the engine itself (i.e. excluding the code for simulation, which can be found on the GitHub I linked earlier in the semester) can be found here.

Hang’s Status Report For 12.9.23

This week, most of my time was spent wrapping up my portion of the project and integrating the different components together. Once Nathan was done training his new engine, I hooked it up to the backend server, which was pretty seamless since I already had an older version of his engine hooked up. I had to change how I save game states to handle captures more easily: instead of checking for the new move(s) against the previously read-in state, I now simply read in the full game state from the physical board. This means I save the board state at each move, instead of saving the move number at each of the board positions.
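Concretely, the change in what gets stored looks roughly like this (a simplified sketch, not the exact server code):

```python
# Old approach (illustrative): record the move number at each intersection,
# which gets awkward once captures remove stones from the board.
#   history[i] = move number that placed a stone at intersection i

# New approach: snapshot the full 81-intersection board state after every
# move, exactly as it is read in from the physical board.
game_states = []  # list of length-81 lists, one snapshot per move


def record_state(board_state):
    """Append a copy of the board state read from the physical board."""
    game_states.append(list(board_state))
```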

For integrating the physical board, there were a few issues I had to resolve to get the integration to work. To read information from the serial port in my Flask server on the RPi, I would need a while-true loop that constantly checks the serial port for new data; however, Flask doesn’t typically allow for this type of structure, so I had to spawn a separate thread to do this check. Whenever this thread finds that the Arduino has sent new information, the information is emitted to the web application via the “update_board” websocket. If the Arduino/physical board requests a move suggestion, the request is emitted from the RPi to the web application via the “update_board” websocket, the web application then makes a request to the game engine endpoint to get the recommended moves, the recommended moves are emitted back to the RPi via the “recommended_moves” websocket, and the Flask server on the RPi writes the recommended move to the serial port.
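The RPi-side loop ends up looking roughly like the sketch below. The port path, baud rate, and message format are placeholders, and it assumes pyserial plus Flask-SocketIO for the websocket events:

```python
# Sketch of the RPi-side serial reader thread -- port path, baud rate, and
# message format are placeholders; assumes pyserial and Flask-SocketIO.
import threading
import serial
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")
ser = serial.Serial("/dev/ttyACM0", 9600, timeout=1)  # placeholder port


def poll_serial():
    """Continuously check the serial port; Flask itself can't block like this."""
    while True:
        line = ser.readline().decode().strip()
        if not line:
            continue
        # A new board state (or a suggestion request) from the Arduino goes
        # out to the web application over the "update_board" event.
        socketio.emit("update_board", {"raw": line})


@socketio.on("recommended_moves")
def forward_recommendation(data):
    # The web app sends the engine's recommendation back; write it to the Arduino.
    ser.write((str(data) + "\n").encode())


if __name__ == "__main__":
    threading.Thread(target=poll_serial, daemon=True).start()
    socketio.run(app, host="0.0.0.0", port=5000)
```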

I’ve also spent some time helping Israel solder the LEDs onto the board in order to speed up the integration process.

My progress is currently on track: we’ve tested communication from the Arduino to the RPi and from the RPi to the Arduino (verifying the info with the minicom serial monitor). The only item left is to replace some mux components on the board, since they were short-circuited.

Team Status Report For 12.2.23

A hardware risk that ended up materializing was not being able to complete the 19×19 board in time for the demo. As such, we decided to change our design by downscaling the board from a 19×19 grid to a 9×9, making the board around 4x smaller. This change also affected the software and engine components of the project. On the software side, the data structures holding the board states had to be changed to accommodate the smaller board size, both in the client-side code and in the code on the Raspberry Pi server and the backend server.

This created some risk for the engine, as all the planning had been for a 19×19 version, including the training of the value network and the initial training of the policy network. After deliberation, it was decided that we would retain the 19×19 network architectures but adapt their usage to fit a 9×9 board. This involved tweaking the engine so it converts 9×9 inputs into 19×19 via padding, allowing the networks to run as normal. It also required refactoring the MCTS code to run 9×9 simulations. These changes are necessary because switching the networks to optimize for 9×9 hurts their relative strength, so extra data points are needed to correct this.
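The padding itself is conceptually simple; a minimal sketch (assuming NumPy, and corner-anchoring the 9×9 region purely for illustration, since the exact placement is an implementation detail of the engine):

```python
import numpy as np

def pad_9x9_to_19x19(board_9x9):
    """Embed a 9x9 position into the 19x19 input the networks expect.

    board_9x9: 9x9 array (0 = empty, 1 = black, 2 = white). Illustrative only;
    the real engine's encoding and placement of the 9x9 region may differ.
    """
    board_9x9 = np.asarray(board_9x9)
    padded = np.zeros((19, 19), dtype=board_9x9.dtype)
    padded[:9, :9] = board_9x9  # corner-anchored for simplicity
    return padded
```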

As a quick note, the server and backend needed to be adjusted to fit the 9×9 board, but this was done very quickly and easily.

Our new schedule is shown below, and most of the changes are around adapting our system to the new 9×9 board:

Nathan’s Status Report For 12.2.23

Some unforeseen issues caused me to deviate a bit from my expectations for the week (i.e. what I said I would do in my previous status report). After extensive consultation, it was decided that we would change the board from its original size of 19×19 to a smaller size of 9×9 to reduce construction time. As such, there were a few extra tasks for the week, which I will go into more detail on below.

Task 1: Multi-machine MCTS. In the past week(s) I have been running MCTS near-continuously across many of the ECE number and lab machines. This allowed me to collect around 150,000 data points for tuning, bringing me to

Task 2: Policy Network Initialization. With the data generated through MCTS I was able to train the initial version of the policy network. This is more useful than just being an accuracy increase, as an increase in policy network strength means the expansion factor of MCTS can be reduced to maintain the same level of strength over the same simulation depth. That is, because the suggested moves are better on average, fewer possibilities need to be explored, and thus execution time decreases.

Task 3: Converting the engine to 9×9 usage. The change to a 9×9 board required a change in the engine, as all the networks were set up to take in a 19×19 vector instead of a 9×9 one. This required a small amount of refactoring and debugging to make sure everything was working as intended.

Task 4: Converting MCTS to run as 9×9. As previously mentioned, the engine has been converted from 19×19 to 9×9 to conform to the physical board changes. Unfortunately, this reduces the engine’s relative strength, as the value network and the first iteration of the policy network were trained on 19×19 data. Accordingly, I refactored the MCTS code to simulate 9×9 games, which will generate more specialized data to tune both networks.

Accordingly, aside from prepping for final presentations and demo, I will just be running MCTS in parallel across machines to generate extra tuning data for both networks. The engine works as is, so this improvement is all I have to work on until demo.

Hang’s Status Report For 12.2.23

This week I was working on setting up a Flask server on the Raspberry Pi, so that we can set up a websocket for wireless bidirectional communication between the physical board and the web application instead of wiring the Arduino to a computer. I had to spend some time setting up the Raspberry Pi in headless mode so that I could simply SSH into the machine instead of setting up a monitor and keyboard each time, and once that was done, I started moving the websocket code from the original server onto the Raspberry Pi. The image below shows the Flask server being hosted on the Raspberry Pi, and the communication between the web application and the Raspberry Pi.

Since we changed the board to accommodate 9×9 games instead of 19×19 games, I had to change some parameters in my code to fit the new board. While I was working on this change, I also added row and column labels to the board, so that it is easier for players to read the recommended moves. The new board/UI is shown below.

I’ve also added the engine to the backend server, so the backend server is almost completely finished: it takes in a game state and returns the top 5 moves with the probability of winning associated with each move. The only thing left for the backend server is to update the final weights once our new engine is completely trained.

The progress on the software is on schedule. The only important item left on the software side is setting up serial communication between the Arduino and the server on the Raspberry Pi, but this is currently bottlenecked by the completion of the physical board.

By next week, I’ll polish up some final visual items on the web application, and finish setting up serial communication between the Arduino and the server on the Raspberry Pi.

Team Status Report For 11.18.23

Hardware:
There have been quite a few design changes made with regards to the hardware component. The main differences from the previous design iteration are that wires will replace the vector boards, and that instead of being located on said vector boards, the light sensors will be located on the wooden planks themselves. With regards to development, we are still wiring sensors, which is taking longer than expected due to the design changes. This is also affecting the planned interfacing with the Raspberry Pi, and it is the main risk for our project at the moment. To mitigate it, Hang has started working on the hardware component along with Israel, in order to reach the interfacing stage faster.

Software (backend):
We have a slight change in design for the software backend. After consultation with Prof. Tamal, we have decided to implement the communication between the Arduino connected to the physical board and the web server via a Raspberry Pi, instead of using the user’s computer. This will be done by hosting a Flask server on the Raspberry Pi itself. However, before this can be implemented, Hang will continue his work helping Israel with the hardware component, as that has proved to be more complex than expected and thus requires more manpower in order to be finished in time.

Software (Engine):
The engine has come together nicely and is in full working order; it is currently running MCTS in order to tune both the value and policy networks. However, as mentioned in Nathan’s weekly report, there is a bit of a risk in the execution speed of the MCTS. This does not threaten the existence of the engine; the engine already works as is and can be used. However, the faster and more efficient the MCTS implementation is, the more scenarios can be examined in a given amount of time, and thus the more accurately the engine can play (assuming the evaluation skills of the networks are not harmed in the speedup). Thus Nathan will continue working on improving execution speed to allow for deeper simulations. At some point, he may also need to evaluate the tradeoffs of reducing network complexity to improve MCTS timing, as the increase in allowable depth could outweigh the loss in network accuracy.

Schedule:
Other than Hang, whose changes are detailed above (helping with the hardware), there are no schedule changes. Because Hang is ahead of schedule, this added workload will not cut into his slack, and the added help will allow Israel to stay on schedule.

ABET:
We’ve definitely gotten a lot better at working together as a team as the semester has proceeded. At first, we did all of our work together, in person. This had quite a few limitations, including timing and interest levels. We began expanding the ways in which we work together. Sometimes when working on collaborative efforts (ex. team reports, design documents, etc.) we work remotely but synchronously, connected over a voice call. This allows us to communicate effectively while still being in the comfort of wherever we choose to be (usually at home). We’ve also gotten better at working asynchronously. Our project sections (hardware, software, and engine) are fairly disjoint, so we spend a lot of time working individually; however, there are obviously times when we need to consult each other on how parts interact. We have gotten good at doing as much as possible while leaving a generalized interface, preventing blockages while still making interaction efficient and easy. Finally, due to the recent overlap in jobs (Hang helping on hardware), all of us have had to learn a bit more about how the hardware works, and Israel has done a great job teaching us so we can help him out.

Nathan’s Status Report For 11.18.23

I am once again happy to report that I was able to accomplish all of my stated goals from last week, and am again running on schedule.

I started by debugging MCTS locally, as I still had some errors in the code, including an aliased scoreboard among tree nodes causing rampantly inflated scores and a missed case when checking which stones on a board would be captured by a particular move. Once I fixed these issues, among others, I was able to simulate multiple full matches of the MCTS engine playing against itself locally to cover any edge cases that my test cases didn’t catch (stepping through each move manually). Once this was finished, I ported it over to be run on the ECE machines.
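For context, capture checking boils down to a liberty search over connected groups; a bare-bones version (not our actual implementation, which is where the missed edge case lived) looks something like this:

```python
def captured_stones(board, move, color):
    """Return the set of opposing stones captured by playing `color` at `move`.

    board: square 2D list with 0 = empty, 1 = black, 2 = white; move: (row, col).
    Bare-bones sketch of the liberty search for illustration only.
    """
    size = len(board)
    opponent = 2 if color == 1 else 1

    def group_and_liberties(start):
        group, liberties, stack = set(), set(), [start]
        while stack:
            r, c = stack.pop()
            if (r, c) in group:
                continue
            group.add((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < size and 0 <= nc < size:
                    if board[nr][nc] == 0:
                        liberties.add((nr, nc))       # empty neighbor = liberty
                    elif board[nr][nc] == board[r][c]:
                        stack.append((nr, nc))        # same-color stone joins group
        return group, liberties

    board[move[0]][move[1]] = color                   # tentatively place the stone
    captured = set()
    for nr, nc in ((move[0] - 1, move[1]), (move[0] + 1, move[1]),
                   (move[0], move[1] - 1), (move[0], move[1] + 1)):
        if 0 <= nr < size and 0 <= nc < size and board[nr][nc] == opponent:
            group, liberties = group_and_liberties((nr, nc))
            if not liberties:                         # group has no liberties left
                captured |= group
    board[move[0]][move[1]] = 0                       # undo the tentative placement
    return captured
```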

Once moved over to the ECE machines, I set up the initial run of MCTS, which is running as I write this. As I am able to complete more runs, the policy network strength will increase, and thus the expansion factor for each node in the tree can be lowered, reducing the computation required for each simulation (ex. I might only need to consider the top 20 suggested moves at any given position from a stronger policy network rather than say 50 from a weaker or recently-initialized network).

That being said, I am not fully satisfied with the current time each MCTS iteration is taking, and am thus working on optimizing my implementation while simulations are running. I was expecting about a 13x speedup from my local machine to the ECE machines, which is what I saw when training the value network, but for some reason this speedup is almost non-existent with MCTS, limiting the rate at which I can generate new data. As such, I am doing some research into what might be causing this (GPU utilization, etc.). Secondarily, I am also optimizing my direct MCTS implementation. An example of the types of changes I’m making is only expanding a node (i.e. generating children for each of the top n moves) once it has been selected again after its own generation; that is, the search not only reached it as a temporary leaf, but also selected it again for expansion. This cuts down on the number of calls to the value network to evaluate positions, which seems to be the largest factor slowing the program down.
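In simplified form, the deferred-expansion idea looks roughly like this (policy_net, value_net, and apply_move are stand-ins for the engine’s real functions; this is a sketch of the optimization, not the actual code):

```python
class Node:
    """Simplified MCTS node illustrating deferred ("lazy") expansion."""

    def __init__(self, state):
        self.state = state
        self.children = None   # children are not generated at creation time
        self.visits = 0

    def visit(self, policy_net, value_net, apply_move, expansion_factor):
        self.visits += 1
        if self.children is None:
            if self.visits == 1:
                # First time the search reaches this node: just evaluate it.
                # No children are generated, so no extra network calls are
                # wasted on nodes the search never returns to.
                return value_net(self.state)
            # Second selection: the node has earned expansion. Generate children
            # for the top-n moves suggested by the policy network.
            top_moves = policy_net(self.state)[:expansion_factor]
            self.children = [Node(apply_move(self.state, m)) for m in top_moves]
        # ...UCT-style selection over self.children continues as before...
        return value_net(self.state)
```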

Finally, I have settled on a definite policy network architecture: it is the same as the value network, but with a length-362 softmax vector as the final dense layer instead of a single sigmoid scalar.
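For concreteness, the difference between the two heads is just the final layer, something like the following (written Keras-style purely for illustration; the framework specifics of my code may differ):

```python
# Illustrative only -- the shared "trunk" layers are whatever the value
# network already uses; only the final dense layer differs.
import tensorflow as tf

def value_head(trunk_output):
    # Value network: single scalar in [0, 1] estimating the win probability.
    return tf.keras.layers.Dense(1, activation="sigmoid")(trunk_output)

def policy_head(trunk_output):
    # Policy network: probability distribution over 361 intersections + pass.
    return tf.keras.layers.Dense(362, activation="softmax")(trunk_output)
```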

Over the next week (Thanksgiving week) I mean to continue running MCTS simulations, training the policy network, and optimizing the system to increase the speed at which I generate training data.

Final note: As I have MCTS working fully, this essentially means the engine can be run to play against a real opponent (not itself) at any time, as everything is synced up together. The engine will improve with each iteration of MCTS, but this just updates the weights of the constituent networks.

Hang’s Status Report For 11.18.23

This week I spent some time setting up a websocket connection for the communication between the Arduino on the physical board and our web application. Since the physical board isn’t fully set up yet, the only thing I can test is the websocket communication. 

In the example above, the client side emits {test: “1”} to the server, the server prints out the message and emits it back to all clients connected to the socket, and the client side then prints out the message it got from the server. Once the board is finished, I can use Python’s serial library to read data from the Arduino and send data back to it, and with the server’s websocket the information can be transferred between the server and the web application.
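The server side of that echo test is only a few lines; roughly (assuming Flask-SocketIO, with the event name and port as placeholders):

```python
# Minimal echo server for the websocket test -- event name and port are
# illustrative; assumes Flask-SocketIO on the server side.
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app, cors_allowed_origins="*")

@socketio.on("test")
def handle_test(message):
    print(message)                  # e.g. {'test': '1'}
    socketio.emit("test", message)  # broadcast back to all connected clients

if __name__ == "__main__":
    socketio.run(app, port=5000)
```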

I also spent some time figuring out how to connect to the locally hosted web application and backend server from a different computer on the same network. This wasn’t difficult, since it was already taken care of by React’s npm command; for the Flask server, I simply had to add another flag to the run command. In both cases, connecting requires my computer’s IP address. The reason I spent time looking into this is that it was suggested we make the communication between the web application and the physical board wireless. As such, I’m planning on running a Flask server on a Raspberry Pi connected directly to the Arduino. This Flask server would have the websocket shown above for wireless bidirectional communication between the board and the web application.

While my progress is technically on track since I’m working on the communication steps now, I had a pretty heavy fever for half of the week, so I didn’t get to work on the project as much as I planned to. The Raspberry Pi component currently isn’t a priority; the building of the physical board is. I will need to help Israel with the construction of the board; however, by next week I can still test out running a Flask server on a Raspberry Pi.

Hang’s Status Report For 11.11.23

This week I started working on the backend of the project so that the frontend can communicate with the engine (otherwise we would have to translate the engine code into JavaScript or set up some kind of hack to run a Python script from the React web application). Since the entire project is locally hosted, I chose a simple backend framework, Flask. I ran into quite a few issues with CORS when attempting to make requests from the frontend to the backend, and spent some time adding different headers to my responses in an attempt to make it work. After some experimentation, I got the request to work, but only when I run the web application from Chrome instead of Edge, which is my default browser.
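For reference, a common alternative to hand-adding the headers is the flask-cors extension; the snippet below is one standard approach (not necessarily what ended up in my code):

```python
# One standard way to handle CORS for a locally hosted React frontend --
# shown for reference; my backend currently sets the headers manually.
from flask import Flask
from flask_cors import CORS

app = Flask(__name__)
# Allow the React dev server (default port 3000) to call the backend.
CORS(app, origins=["http://localhost:3000"])
```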

For now, since the engine isn’t fully completed, I’m returning 5 placeholder recommended moves from the backend to the frontend (results displayed above).

My progress is currently on track. By next week, I will have some code for the communication between the Arduino and my web application, but since I can’t really test this out without the board being finished, I will have to help out Israel with the creation of the board to speed up its build process.

As for testing, I will write some pytest unit tests for the backend service. For the frontend, I will time the response time for the communication between the frontend and the backend by simply using timestamps (our use case requirement is that the response should take less than 2 seconds, as the engine needs some time to compute moves), and will also include Jest tests to make sure that each page renders correctly.