Yang’s Project Report for 05/08/2021

This week I worked with my team on the final video for our demo on Monday and on preparing the poster. We also reviewed and prepped the final presentation.

Next week, we plan on editing and finalizing the video and poster, and practicing before the poster session. Our focus for this upcoming week is wrapping everything up.

Yang’s Project Report for 05/01/2021

This week, I collected user audio data for our speech recognition system and tested it in our system for both latency and accuracy. We will use these results in our presentation next week to show the performance of our system.
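
For reference, here is a minimal sketch of how that scoring could look. The `transcribe` callable stands in for our recognizer's inference call, and word error rate via the `jiwer` package is one plausible accuracy metric; neither is necessarily our exact setup.

```python
# Sketch of scoring collected clips for latency and accuracy.
# transcribe is a caller-supplied function (wav path -> text); word error rate
# via jiwer is an assumed metric, not necessarily the one we reported.
import time
import jiwer

def evaluate(clips, transcribe):
    """clips: list of (wav_path, reference_transcript) pairs."""
    latencies, refs, hyps = [], [], []
    for wav_path, reference in clips:
        start = time.perf_counter()
        hyps.append(transcribe(wav_path))
        latencies.append(time.perf_counter() - start)
        refs.append(reference)
    avg_latency = sum(latencies) / len(latencies)
    return avg_latency, jiwer.wer(refs, hyps)
```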

Additionally, I fixed some issues with image capture on the Jetson Nano, so we can now live-capture images and process them correctly in our system. I also worked on setting up our physical demo system (see the included photo) with a mini-fridge and LEDs for the presentation.
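
As a rough illustration of the capture path, here is a minimal sketch assuming OpenCV and a USB camera on the Jetson Nano; the device index and filename are placeholders, and a CSI camera would instead need a GStreamer pipeline string.

```python
# Minimal live-capture sketch, assuming OpenCV and a USB camera on the Jetson Nano.
import cv2

cap = cv2.VideoCapture(0)            # device 0; adjust if the camera enumerates differently
if not cap.isOpened():
    raise RuntimeError("Could not open camera")

ok, frame = cap.read()               # grab one frame for downstream processing
if ok:
    cv2.imwrite("fridge_snapshot.jpg", frame)   # placeholder filename
cap.release()
```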

I am also working on our presentation slides for Monday. This week we are finishing system integration and completing testing. We are on track and should have the presentation ready for Monday.

Yang’s Project Report for 04/24/2021

This week, I focused on collecting data for our presentation next week, primarily on how our system works and interacts with users. To do this, we are running surveys to gather audio data and testing how our system performs on that input. This will be very important for the testing and validation portion of our presentation. Overall, we are on track and will only need to make minor changes to our code and validation to have the presentation ready.

Team Status Report for 04/10/2021

As a team, we looked into integrating the speech recognition with the image recognition so we can do a basic request and response for the demo. We ran into some issues with the LEDs and had to get replacements. Integration is otherwise working out well, and we expect to be able to demo a fully functional pipeline, albeit without the depth of recipes and ingredients that we expect for the final demo. This is a good midpoint for us: with the general framework done, we can simply add in additional functionality and data.
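
To show what "basic request and response" means here, the sketch below walks a transcribed voice command through the pipeline. Every function is a hypothetical placeholder for the real speech, vision, recommendation, and LED code, just to show how the pieces hand data to each other.

```python
# Illustrative demo pipeline; all functions are placeholders for the real modules.

def detect_ingredients() -> list:
    return ["eggs", "cheese"]                      # stand-in for the image recognition output

def recommend_recipe(ingredients: list) -> str:
    return "Omelette (uses " + ", ".join(ingredients) + ")"

def set_led_status(state: str) -> None:
    print("LED ->", state)                         # stand-in for the LED/GPIO control

def handle_request(command: str) -> str:
    ingredients = detect_ingredients()             # kicked off by a transcribed voice command
    recipe = recommend_recipe(ingredients)
    set_led_status("ready")
    return recipe

print(handle_request("what can I make for dinner"))
```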

The biggest risk for us has been porting issues: the microphone on the Jetson is much weaker than a normal desktop microphone, which made it harder to trigger on commands. Tuning some parameters fixed that issue. We are currently looking at some remaining issues with images as well, but we do not expect them to be a bottleneck.
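
The report does not say exactly which parameters were tuned; one likely knob is Porcupine's per-keyword sensitivity, sketched below. The built-in keyword is just an example, and newer pvporcupine releases also require a Picovoice AccessKey argument.

```python
# Assumed tuning knob: Porcupine's per-keyword sensitivity (0.0-1.0), which trades
# more false alarms for easier triggering on a quiet microphone.
import pvporcupine

porcupine = pvporcupine.create(
    keywords=["computer"],      # built-in keyword, used here as an example
    sensitivities=[0.8],        # higher than the 0.5 default to trigger more easily
)
```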

There were generally no changes to our deadlines and we expect to finish on time.

Yang’s Status Report for 04/10/2021

This week I focused on porting the speech recognition onto the new Jetson Nano, as well as integrating the image recognition into the overall project for the demo. I also took a deeper look at the image recognition to make sure we can integrate it without issues.

Yang’s Status Report for 04/03/2021

This week I wrapped up the speech recognition: triggering on speech and determining the words from the audio. This means we can now take in a request from the user and act on it. Next week, I will be looking into text-to-speech, which should be straightforward, as well as working on how to send emails informing the user of the recipe.
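
For the planned email step, a minimal sketch using only the Python standard library is below; the server, addresses, and credentials are placeholders, not our real configuration.

```python
# Sketch of the planned recipe email using the standard library only.
import smtplib
from email.message import EmailMessage

def send_recipe_email(recipient: str, recipe_text: str) -> None:
    msg = EmailMessage()
    msg["Subject"] = "Your recipe suggestion"
    msg["From"] = "fridge-assistant@example.com"          # placeholder sender
    msg["To"] = recipient
    msg.set_content(recipe_text)

    with smtplib.SMTP("smtp.example.com", 587) as smtp:   # placeholder server
        smtp.starttls()
        smtp.login("username", "password")                # placeholder credentials
        smtp.send_message(msg)
```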

Team Status Report for 04/03/2021

As a team, we are currently aiming to integrate all of our components for the interim demo. We are slightly behind in this regard, but we plan to use the long weekend to catch up and hold meetings to put everything together in time for the demo.

The biggest risk for us is getting the image recognition to work correctly for the interim demo. To mitigate this risk, both Yang and Elena will focus on it before the demo. We believe this will be sufficient to fix the remaining issues in time.

There were no changes to the system design or schedule this week, and our progress has been consistent so far.

Yang’s Status Report for 03/27/2021

I ran into some issues earlier this week with speech processing: I realized we needed a way to trigger the recording instead of simply recording at set intervals. To handle this, I found pvporcupine, a Python wake-word detection module that had the best trigger-word accuracy among the free options we evaluated.
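
A minimal wake-word loop, roughly following pvporcupine's documented usage, is sketched below. The audio library (PyAudio) and the built-in "computer" keyword are assumptions, and newer pvporcupine releases also require a Picovoice AccessKey.

```python
# Minimal wake-word detection loop with pvporcupine.
import struct
import pvporcupine
import pyaudio

porcupine = pvporcupine.create(keywords=["computer"])
pa = pyaudio.PyAudio()
stream = pa.open(
    rate=porcupine.sample_rate,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=porcupine.frame_length,
)

try:
    while True:
        pcm = stream.read(porcupine.frame_length)
        pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)
        if porcupine.process(pcm) >= 0:     # >= 0 means a keyword was detected
            print("Wake word detected; start recording the command")
except KeyboardInterrupt:
    stream.close()
    pa.terminate()
    porcupine.delete()
```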

After that, I set up threading to handle each of the different portions of our system, and got the speech recognition system to record only after being triggered by the wake word. With this, I can get text out of the input audio in the command format we specified and send it to the recommendation system.
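
An illustrative sketch of the threaded handoff is below: one thread listens and transcribes, pushing command strings onto a queue, while another consumes them for the recommendation system. The two worker bodies are placeholders, since the real decomposition is more involved.

```python
# Illustrative producer/consumer handoff between the speech and recommendation threads.
import queue
import threading
import time

commands: "queue.Queue[str]" = queue.Queue()

def listen_and_transcribe() -> str:
    # Placeholder for the wake-word + speech-to-text path described above.
    time.sleep(2)
    return "recommend dinner"

def handle_command(command: str) -> None:
    # Placeholder for the recommendation system's entry point.
    print(f"recommendation system got: {command!r}")

def speech_worker():
    while True:
        commands.put(listen_and_transcribe())

def recommender_worker():
    while True:
        handle_command(commands.get())

threading.Thread(target=speech_worker, daemon=True).start()
threading.Thread(target=recommender_worker, daemon=True).start()
time.sleep(5)   # let the demo loop run briefly
```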

Next week, I’ll be looking into parsing the transcribed text into an intent that the recommendation system can use, and I’ll start building the framework for that.
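
One possible starting point is a naive keyword match over the transcribed command, sketched below. The intent names and keywords are illustrative, since the exact command format isn't spelled out here.

```python
# Naive keyword-based intent parsing over the transcribed command (illustrative only).
def parse_intent(text: str) -> str:
    text = text.lower()
    if any(word in text for word in ("recommend", "make", "cook")):
        return "recommend_recipe"
    if "add" in text:
        return "add_ingredient"
    return "unknown"

print(parse_intent("What can I make with eggs?"))   # -> "recommend_recipe"
```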

Yang’s Status Report for 03/14/2021

This week I did some more testing of NeMo, the speech recognition package we are planning to use, and also picked up and started looking at the hardware we ordered (we still need to order the network adapter for the Jetson Nano, which we missed on the first go-around). My plan for this upcoming week is to get the package fully working with our hardware and be able to send fully parsed requests to the software system. Another agenda item is to have the recorded audio tests ready, since we now have the microphone.

Yang’s Project Report for 03/06/2021

This week I worked with some starter code in NeMo, the speech recognition package we are hoping to use. The pretrained models seem to recognize the input waveform fairly well, although I had some technical difficulties getting it to capture audio and run speech recognition at the same time on my desktop.
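
The starter flow looks roughly like the sketch below, following NeMo's documented quickstart: load a pretrained English CTC model and transcribe a WAV file. The model name and file path are examples, not necessarily what we will settle on.

```python
# Sketch of the NeMo starter flow: pretrained model + offline transcription.
import nemo.collections.asr as nemo_asr

model = nemo_asr.models.EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")
transcripts = model.transcribe(["sample_command.wav"])   # 16 kHz mono WAV expected
print(transcripts[0])
```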

Additionally, I worked on my presentation and submitted our parts list.