What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).
- This week I fabricated and tested all the internal circuitry for the new 8 blocks.
- I soldered all battery holders to adapters, soldered headers onto all Picos and muxes, cut charging ports into the battery holders, and tested every Pico, Waveshare LCD, battery, battery holder, and adapter for functionality.
- I changed the web app to handle 4 rows by switching to the full NYT Connections dataset and using each puzzle's original starting word placements instead of a random shuffle. I also implemented the “one word away” feature (a minimal sketch of the check appears after this list).
- I helped fabricate the test bed for 16 blocks. I laser-cut cardboard platforms for the male pogo pins, cut all the wires needed, and helped tape everything onto the test board.
- Additionally, I soldered wires for the buttons and row LCD displays, and did some soldering to help move the grid circuitry from a breadboard to a soldered perf board.
- I went through the NYT Connections database and picked 2 puzzles of similar difficulty, each containing words that require knowledge of American cultural references and double meanings, to use for our user testing.
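As a reference for the “one word away” feature mentioned above, here is a minimal sketch of the submission-grading logic. This is an illustration under my own modeling assumptions (each category as a set of 4 words); the function and variable names are not the web app's actual identifiers.

```python
# Minimal sketch of submission grading, including the "one away" check.
# Each category is modeled as a set of 4 words; all names here are
# illustrative placeholders, not the web app's actual code.
def grade_submission(guess: set[str], categories: list[set[str]]) -> str:
    """Return 'correct', 'one_away', or 'wrong' for a 4-word guess."""
    # Find the category that shares the most words with the guess.
    best_overlap = max(len(guess & cat) for cat in categories)
    if best_overlap == 4:
        return "correct"
    if best_overlap == 3:
        return "one_away"  # exactly one word belongs to a different category
    return "wrong"
```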
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
- My progress is a little behind schedule because we had issues with the 16-block grid working consistently, so I did not get as much testing done as I would have liked. To catch up, I will write scripts to automate my latency testing so that once the grid is ready, I can simply run the scripts and finish the testing as quickly as possible.
What deliverables do you hope to complete in the next week?
- Next week I hope to complete my verification tests and gather user testing data.
- If I have time, I will also try to improve the text cleaning for the dictionary API and potentially look into using the Gemini API for words or phrases that don’t show up in the dictionary API.
Now that you have some portions of your project built and are entering the verification and validation phase, provide a comprehensive update on what tests you have run or are planning to run. In particular, how will you analyze the anticipated measured results to verify that your contribution to the project meets the engineering design requirements or the use case requirements? Verification is usually related to your own subsystem and is likely to be discussed in your individual reports. Validation is usually related to your overall project and is likely to be discussed in your team reports.
Tests that I have run:
- Game logic accuracy
- Input: Sets of submitted words
- Output: 100% correct answer checking
- I played 10 games. For 5 games, I tested 2 combinations of correct submissions and 4 combinations of wrong submissions, and confirmed that 4 wrong submissions resulted in a game loss. For the other 5 games, I tested 2 combinations of wrong submissions and 4 combinations of correct submissions, and confirmed that the 4 correct submissions resulted in a game win. In all games, I verified that correct submissions were registered as correct and wrong submissions as incorrect, that the mistake count was decremented and reset properly, and that submitting the same category multiple times did not result in a premature game win. For each of the 10 games, I made sure that at least 1 wrong submission was a “one away” submission, and checked that the “one away” message was displayed properly.
- From these tests, I was able to verify that the game logic is 100% accurate. This is imperative: each successive submission depends on which words previous submissions have “eliminated,” so incorrect game logic would prevent the user from finishing the game properly.
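If I later automate these play-throughs, the checks could look something like the pytest sketch below, which exercises the grading function from the earlier sketch. The module name and word labels are placeholders, not the project's actual test suite.

```python
# Hypothetical pytest-style checks mirroring the manual play-throughs above.
# "game_logic" and the word labels are placeholders, not real project names.
from game_logic import grade_submission  # hypothetical module

CATEGORIES = [
    {"a1", "a2", "a3", "a4"},
    {"b1", "b2", "b3", "b4"},
    {"c1", "c2", "c3", "c4"},
    {"d1", "d2", "d3", "d4"},
]

def test_correct_submission():
    assert grade_submission({"a1", "a2", "a3", "a4"}, CATEGORIES) == "correct"

def test_one_away_submission():
    # Three words from one category plus one outsider triggers "one away".
    assert grade_submission({"a1", "a2", "a3", "b1"}, CATEGORIES) == "one_away"

def test_wrong_submission():
    assert grade_submission({"a1", "a2", "b1", "b2"}, CATEGORIES) == "wrong"
```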
Tests I am planning to run:
- Hints Retrieval Latency
- Input: Load 20 puzzles
- Output: ≤3.2 s latency per puzzle
- I plan to write a script that runs on the RPi to test the latency of hint retrieval, as sketched below. The script will load 20 puzzles. For each puzzle, it will time how long it takes to retrieve/generate the puzzle, retrieve hints via calls to the dictionary API, and parse and store the responses in the game controller data structures. This will tell us how long it takes to gather hints for all of the words in a puzzle and confirm that it stays within our latency goal so we can maintain user attention.
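A rough sketch of what this script could look like, assuming a GameController class with load_puzzle() and fetch_hints() methods; these names are hypothetical stand-ins for the actual project code.

```python
# Sketch of the planned latency test. GameController, load_puzzle(), and
# fetch_hints() are hypothetical names, not the project's actual API.
import statistics
import time

from game_controller import GameController  # hypothetical module

NUM_PUZZLES = 20

def main():
    controller = GameController()
    timings = []
    for i in range(NUM_PUZZLES):
        start = time.perf_counter()
        puzzle = controller.load_puzzle(i)  # retrieve/generate the puzzle
        controller.fetch_hints(puzzle)      # dictionary API calls + parse/store
        timings.append(time.perf_counter() - start)
        print(f"puzzle {i}: {timings[-1]:.2f} s")
    print(f"mean: {statistics.mean(timings):.2f} s, max: {max(timings):.2f} s")

if __name__ == "__main__":
    main()
```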
- Hints Retrieval Quality
- Input: all words in the NYT Connections Archive dataset
- Output:
- ≥95% retrieval rate of definitions
- no more than 1 word’s hints unretrieved per puzzle
- Aside from latency, I also want to test the quality of the hint retrieval itself. I will write a script, sketched below, that loops through every word in the dataset and records the number of times no definitions or context/examples are found for a word. I want to make sure that definitions/examples are retrieved at least 95% of the time. Additionally, for each puzzle, I want to make sure that no more than 1 word is missing hints. If I do not meet this goal, I will consider options such as generative LLM APIs like Gemini to provide hints for the missing words. It is important that the virtual companion can retrieve hints reliably; otherwise, the hint feature will be completely useless to the user.
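A rough sketch of the planned audit script, assuming a load_archive() loader that returns each puzzle as a list of 16 words and a get_hints() wrapper that returns None on a dictionary API miss; both names are hypothetical.

```python
# Sketch of the planned hint-quality audit. load_archive() and get_hints()
# are hypothetical helpers, not the project's actual code.
from dataset import load_archive  # hypothetical NYT Connections archive loader
from hints import get_hints       # hypothetical dictionary-API wrapper

puzzles = load_archive()
total_words = 0
missing_words = 0
bad_puzzles = 0  # puzzles with more than 1 word lacking hints

for puzzle in puzzles:
    misses = sum(1 for word in puzzle if get_hints(word) is None)
    total_words += len(puzzle)
    missing_words += misses
    if misses > 1:
        bad_puzzles += 1

retrieval_rate = 1 - missing_words / total_words
print(f"retrieval rate: {retrieval_rate:.1%} (target >= 95%)")
print(f"puzzles with >1 missing word: {bad_puzzles} (target 0)")
```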