Yang’s Project Report for 05/08/2021

This week I worked with my team to make the final video for our demo on Monday and to prepare the poster. We also reviewed and prepped the final presentation.

Next week, we plan on editing and finalizing the video and poster, and practicing before the poster session. We will focus on wrapping up this upcoming week.

Yang’s Project Report for 05/01/2021

This week, I collected user data for our speech recognition system and tested it in our system for both latency and accuracy. We will use this data in our presentation next week to show the performance of our system.
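As a rough sketch of how latency and accuracy numbers like these can be computed, the snippet below times a recognizer call and scores word error rate against reference transcripts. The `fake_recognize` function is a stand-in for our actual speech recognition pipeline, not the real thing:

```python
import time

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def fake_recognize(audio_clip):
    """Stand-in for the real recognizer; echoes the transcript."""
    return audio_clip

def evaluate(samples):
    """samples: list of (audio_clip, reference_transcript) pairs.
    Returns (average latency in seconds, average word error rate)."""
    latencies, errors = [], []
    for clip, reference in samples:
        start = time.perf_counter()
        hypothesis = fake_recognize(clip)
        latencies.append(time.perf_counter() - start)
        errors.append(word_error_rate(reference, hypothesis))
    return sum(latencies) / len(latencies), sum(errors) / len(errors)
```

With the stub recognizer the error rate is zero by construction; in real use, `fake_recognize` would be replaced by the actual ASR call on recorded user audio.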

Additionally, I fixed some issues with our image capture on the Jetson Nano, so we can now correctly capture live images and process them in our system. Along with that, I worked on setting up our physical demo system (see included photo) with a minifridge and LEDs for the presentation.
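For reference, live capture on the Jetson Nano typically goes through a GStreamer pipeline fed into OpenCV. The sketch below builds such a pipeline string; the element names and caps are the commonly documented ones for the Nano's CSI camera (`nvarguscamerasrc`), not necessarily what our final code uses:

```python
def nano_camera_pipeline(width=1280, height=720, fps=30):
    """Build a GStreamer pipeline string for the Jetson Nano CSI camera.

    nvarguscamerasrc produces frames in NVMM memory; nvvidconv moves them
    to system memory so OpenCV's appsink can consume BGR frames.
    """
    return (
        f"nvarguscamerasrc ! "
        f"video/x-raw(memory:NVMM), width={width}, height={height}, "
        f"framerate={fps}/1 ! nvvidconv ! "
        f"video/x-raw, format=BGRx ! videoconvert ! "
        f"video/x-raw, format=BGR ! appsink"
    )

if __name__ == "__main__":
    try:
        import cv2  # only needed on the device itself
        cap = cv2.VideoCapture(nano_camera_pipeline(), cv2.CAP_GSTREAMER)
        ok, frame = cap.read()
        print("captured frame:", ok)
        cap.release()
    except ImportError:
        print(nano_camera_pipeline())
```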

I am also adding to our presentation slides for Monday. This week we are finishing system integration and completing testing. We are on track and should have the presentation ready for Monday.

Yang’s Project Report for 04/24/2021

This week, I worked on collecting data for our presentation next week, focusing primarily on how our system works and interacts with users. To do this, we are running surveys to gather audio data and testing how our system handles that input. This will be very important for the testing and validation portion of our presentation. Overall, we are on track and will only need some minor changes to our code and validation to have the presentation ready.

Yang’s Status Report for 04/10/2021

This week I focused on getting the speech recognition running on the new Jetson Nano, as well as integrating the image recognition into the overall project for the demo. I also took a deeper look at the image recognition to make sure we can integrate it without issues.

Yang’s Status Report for 04/03/2021

This week I wrapped up the speech recognition: triggering from speech and determining words from audio. This means we can now take in a request from the user and act on it. In the next week, I will look into text-to-speech, which should be straightforward, as well as how to send emails informing the user of the recipe.
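The email side can be done with Python's standard library. Here is a minimal sketch of composing a recipe email; the sender address and SMTP host are placeholders, and the actual send step (via `smtplib`) is shown only in a comment:

```python
from email.message import EmailMessage

def build_recipe_email(to_addr, recipe_name, steps):
    """Compose a plain-text email containing a recipe for the user."""
    msg = EmailMessage()
    msg["Subject"] = f"Your recipe: {recipe_name}"
    msg["From"] = "fridge@example.com"  # placeholder sender address
    msg["To"] = to_addr
    body = recipe_name + "\n\n" + "\n".join(
        f"{i}. {step}" for i, step in enumerate(steps, 1)
    )
    msg.set_content(body)
    return msg

# Sending would then be roughly:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com") as server:  # placeholder host
#       server.send_message(msg)
```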

Yang’s Status Report for 03/27/2021

I ran into some issues earlier this week with speech processing, since I realized we needed a way to trigger the recording rather than simply recording at set intervals. To do this, I found pvporcupine, a Python module that had the best accuracy on trigger words of the free options I evaluated.

After that, I set up threading to handle each portion of our system and got the speech recognition to record only after being triggered by the wake word. With this, I can extract text from the input audio in the command format we specified, ready to be sent to the recommendation system.
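The overall flow is roughly the one sketched below. The detector here is a stub standing in for pvporcupine's `process()` call (which returns a keyword index of 0 or greater on a hit), and the audio frames are simulated strings rather than raw PCM:

```python
import queue
import threading

WAKE_WORD = "hey fridge"  # stand-in; pvporcupine works on raw PCM frames

def detect(frame):
    """Stub for porcupine.process(pcm): >= 0 when the wake word is heard."""
    return 0 if frame == WAKE_WORD else -1

def listener(frames, commands):
    """Consume audio frames; after a wake-word hit, buffer the following
    frames as a command until an end-of-utterance marker arrives."""
    triggered = False
    buffered = []
    for frame in frames:
        if not triggered:
            triggered = detect(frame) >= 0
        elif frame == "<silence>":  # end of the user's request
            commands.put(" ".join(buffered))
            triggered, buffered = False, []
        else:
            buffered.append(frame)

# Run the listener on its own thread, as in our threaded system.
commands = queue.Queue()
stream = ["noise", WAKE_WORD, "recommend", "a", "recipe", "<silence>"]
t = threading.Thread(target=listener, args=(stream, commands))
t.start()
t.join()
```

Frames before the wake word are ignored; only speech between the trigger and the end-of-utterance marker reaches the command queue.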

Next week, I’ll look into parsing that text into an intent the recommendation system can use, and start building the framework for that.
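One possible shape for that parsing step, assuming the recognizer hands us plain command text: match keyword patterns against the text and map them to intent names. The intents and patterns below are illustrative, not our final grammar:

```python
import re

# Map illustrative keyword patterns to intent names.
INTENT_PATTERNS = [
    (re.compile(r"\b(recommend|suggest)\b.*\brecipe\b"), "recommend_recipe"),
    (re.compile(r"\bwhat('?s| is)\b.*\b(in|inside)\b.*\bfridge\b"), "list_contents"),
    (re.compile(r"\bemail\b.*\brecipe\b"), "email_recipe"),
]

def parse_intent(text):
    """Return (intent, normalized_text); 'unknown' if nothing matches."""
    lowered = text.lower().strip()
    for pattern, intent in INTENT_PATTERNS:
        if pattern.search(lowered):
            return intent, lowered
    return "unknown", lowered
```

A regex table like this is only a starting point; it is easy to extend while the command vocabulary is small.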

Yang’s Status Report for 03/14/2021

This week I did some more testing on the NeMo speech recognition package we plan to use, and I picked up and started looking at the hardware we ordered (we still need to order the network adapter for the Jetson Nano, which we missed on the first go-around). My plan for this upcoming week is to fully figure out the package with our hardware and be able to send fully parsed requests to the software system. Additionally, I want to have the recorded audio tests ready now that we have the microphone.

Yang’s Project Report for 03/06/2021

This week I worked with some starter code in NeMo, the speech recognition package we hope to use. The pretrained models seem to work fairly well at recognizing the input audio, although I had some technical difficulties getting it to capture audio and run speech recognition at the same time on my desktop.

Additionally, I worked on my presentation and also submitted our parts list.

Yang’s Status Report for 02/27/2021

This week I looked at some NVIDIA libraries for image inference and automatic speech recognition. I found that jetson-inference (https://github.com/dusty-nv/jetson-inference) provides pretrained object recognition and image recognition models we could use for on-board image inference. Additionally, I found the NeMo library for Automatic Speech Recognition (ASR); its pretrained models would work well for our use case and are tailored to the NVIDIA platform.

Overall, we are on schedule and had a successful presentation. For next week, I plan on building the required components for speech recognition using the NeMo API. Additionally, I will work with my team to select the final hardware components.

Yang’s Project Report for 02/20/2021

This week, I put significant effort into figuring out our team’s project plan. Along with Nanxi and Elena, I spent time looking into what hardware we are interested in using (Jetson Nano) and what software we need for Speech Recognition/Generation, LED controls, and Image Recognition for our project proposal presentation. I believe we are on track and have a good idea of how to proceed with the rest of the project.