Some unforeseen issues caused me to deviate a bit from my expectations for the week (i.e. what I said I would do in my previous status report). After extensive consultation it was decided that we would change the board from its original size of 19×19 to a small size of 9×9 to aid in construction time reductions. As such there were a few extra tasks for the week, that I will go into more detail on later.
Task 1: Multi-machined MCTS. In the past week(s) I have been running MCTS near-continuously across many of the ECE number and lab machines. This allowed me to collect around 150,000 data points for tuning, bringing me to
Task 2: Policy Network Initialization. With the data generated through MCTS I was able to train the initial version of the policy network. This is more useful than just being an accuracy increase, as an increase in policy network strength means the expansion factor of MCTS can be reduced to maintain the same level of strength over the same simulation depth. That is, because the suggested moves are better on average, fewer possibilities need to be explored, and thus execution time decreases.
Task 3: Converting the engine to 9×9 usage. The change to a 9×9 board required a change in the engine as all the networks were set up to take in a 19×19 vector instead of a 9×9. This required a small amount of refactoring and debugging to make sure everything was working as intended.
Task 4: Converting MCTS to run as 9×9. As previously mentioned the engine has been converted from 19×19 to 9×9 to conform to the physical board changes. Unfortunately, this reduces the engine’s relative strength, as the value network and first iteration of the policy network were trained on 19×19 data. Accordingly, I refactored the MCTS code to run MCTS simulating 9×9 games, which will generate more specialized data to tune both networks.
Accordingly, aside from prepping for final presentations and demo, I will just be running MCTS in parallel across machines to generate extra tuning data for both networks. The engine works as is, so this improvement is all I have to work on until demo.