Risk Mitigation
Our main risk right now is choosing the most suitable NLP model for parsing the text and classifying the item name. This step is critical for generating entries correctly, which is a key feature of our product. We compared models for matching commands, identifying elements, and classifying words, and settled on spaCy and word2vec for these tasks. We have also identified backup libraries in case the selected ones don’t work out. We will start working with spaCy and word2vec next week to confirm that they perform as expected before the design presentation.
Design Changes
We replaced the random forest model with word2vec for classifying item names. A random forest requires feature engineering, which makes it better suited to classifying whole sentences or paragraphs than single item names. We will reflect this change in our block diagram for the design presentation and report.
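To illustrate why word embeddings need no hand-built features for this task, the sketch below classifies an item name by cosine similarity between its vector and each category label's vector. It is a minimal illustration, not our implementation: the toy 3-dimensional vectors, the `fruit` and `dairy` categories, and the helper names `cosine_similarity` and `classify_item` are all assumptions, and in practice the vectors would come from a trained word2vec model.

```python
import math

# Toy 3-dimensional vectors standing in for trained word2vec embeddings.
# ASSUMPTION: these values and words are invented purely for illustration.
TOY_VECTORS = {
    "apple": [0.9, 0.1, 0.0],
    "banana": [0.8, 0.2, 0.1],
    "milk": [0.1, 0.9, 0.1],
    "cheese": [0.2, 0.8, 0.0],
    "fruit": [1.0, 0.0, 0.0],
    "dairy": [0.0, 1.0, 0.0],
}

# ASSUMPTION: hypothetical category labels, not taken from our design.
CATEGORIES = ["fruit", "dairy"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_item(name, vectors=TOY_VECTORS, categories=CATEGORIES):
    """Return the category whose vector is most similar to the item's vector.

    Returns None when the item name is not in the vocabulary.
    """
    item_vec = vectors.get(name.lower())
    if item_vec is None:
        return None
    return max(categories, key=lambda c: cosine_similarity(item_vec, vectors[c]))
```

Calling `classify_item("apple")` picks the category whose vector lies closest to the vector for "apple". An out-of-vocabulary name returns `None`, which is the case a backup library or fallback rule would need to handle.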
Updated Schedule
No updates to the schedule. Next week we will address the comments on our proposal presentation from the TAs and our peers, discuss hardware order requests with faculty, design the MVC framework for our Django web application, and test every model and library we plan to use in the speech recognition/NLP process on our laptops before the design presentation.