Hugo Status Report 4/12

Over the last two weeks I have made good progress toward finalizing our hardware. After all of the parts arrived last week, I was able to build the first full prototype of the filter, and it had a lot of issues. The sound quality was very poor, with mostly crackling and static at the speaker output. I looked into what could cause this and decided that the most important thing to fix was the loose connections. Because I am prototyping on a breadboard, I did not have ideal protoboard pin connections for the input jacks I ordered, so I had originally just tied the wires on; those connections were extremely poor, and I believe they were a large part of the struggle. I have now soldered the wires on and made firm connections, which should hopefully give a big boost in performance.

Additionally, I realized another obvious issue: I had built my setup with a single 9V battery supplying +Vdd and ground. The op amps require both +Vdd and -Vdd, so this was cutting off part of my audio output. I am now looking into two possible solutions: either using two batteries with the junction between them as ground, or creating a virtual ground by splitting the 9V. I am currently leaning toward the former and hope to test it this week. Additionally, I have started building the circuitry for connecting the microphone input into the circuit. There are some concerns about how well the ESP32 Bluetooth part we are using will work, but I have found examples online of it being used to build systems like Bluetooth speakers, so I am feeling better about the forecast for that.
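For reference, the virtual-ground option mentioned above is just a resistive split of the single supply (the values here are illustrative, not finalized): with two equal resistors R from +9 V to battery negative, the midpoint sits at

    V_mid = 9 V * R / (R + R) = 4.5 V

and, once buffered with an op amp follower and treated as ground, it would give the op amps roughly +4.5 V and -4.5 V rails.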

The microphone input will be the first major piece of implementation for my work, as the vocal removal is already connected to our webapp.

Hugo Status Report 3/22

This week, I focused on building and testing the first real circuit prototypes for the filter system. I breadboarded the initial design and ran real-time tests to evaluate how well the vocal removal performs in an actual analog setup. I was not able to get a fully functional system yet, but I did build a few of the smaller subsystems. I worked with the parts I had available and placed orders for everything I did not have yet. Next week, I plan to continue working on the circuit and incorporate DC-blocking capacitors to filter out DC offset. Looking ahead, my priority is finalizing a stable and optimized breadboard version of the circuit before deciding whether to transition to a PCB. This will also involve additional real-time tests to ensure that the system maintains consistent performance.
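As a quick sanity check on the DC-blocking plan mentioned above (component values here are illustrative, not the ones we have settled on): a series capacitor C feeding a load resistance R forms a high-pass filter with corner frequency

    f_c = 1 / (2 * pi * R * C)

so, for example, C = 1 uF into R = 10 kOhm gives f_c of roughly 16 Hz, which removes the DC offset while leaving the audible band essentially untouched.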

Team Status Report 3/15

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

We still have worries about getting our components delivered on time, but we have now placed orders for most of our fundamental parts, so we feel on track to overcome this blocker. A new risk comes from recent changes to our design: we scrapped our original scoring idea, so we are now a little behind schedule again and working to get back up to speed. As a contingency, we have our simplest scoring method ready to drop in if the new approach does not work.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

We have adapted our scoring system from the original method. Our original method would execute the scoring primarily in hardware, subtracting the final speaker output from the original music track. We changed it because, per Professor Sullivan's advice, that scoring would be inaccurate and not particularly useful; even the natural differences in people's voices would cause unpredictable differences in the output signal. The only cost is a small amount of additional latency from using a software speech-to-text system, but this is the only change and will not significantly affect our ability to provide a response in real time.

Hugo Status Report 3/15

Accomplishments this week:

This week, we sought to address our number one concern, which was not having our parts on time. I ordered most of my crucial hardware components, mainly the speaker and splitter wires I needed to get started building the filter system. In addition, after some feedback from Prof. Sullivan, I reassessed our options for scoring the user's audio. Originally, we had a feedback system that would subtract the final combined output from the original song. Because this would be overly complicated and provide very poor-quality feedback, I looked into a new system that does it all on the software side. I helped pivot our design to a speech-to-text system that compares the lyrics the user sings against the song's lyrics, which we will now use for scoring instead.
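As a rough illustration of the lyric-comparison idea, a minimal software sketch is below. It assumes the sung transcript comes from a separate speech-to-text stage, and the word-level matching and 0-100 scale are placeholder choices, not our final scoring formula.

    import difflib
    import re

    def normalize(text):
        # Lowercase and strip punctuation so minor transcription quirks don't hurt the score.
        return re.findall(r"[a-z']+", text.lower())

    def lyric_score(reference_lyrics, transcribed_lyrics):
        # Similarity between the sung transcript and the reference lyrics, scaled to 0-100.
        ref = normalize(reference_lyrics)
        sung = normalize(transcribed_lyrics)
        matcher = difflib.SequenceMatcher(None, ref, sung)
        return round(100 * matcher.ratio())

    # Example usage; the transcript would come from the speech-to-text stage.
    print(lyric_score("never gonna give you up", "never gonna give you up"))  # 100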

Schedule Update:

I am still behind schedule because we have not prototyped or built anything. The Gantt chart says I should have been wrapping up most of the work for vocal removal and scoring by now. However, we will redistribute tasks; since all of the design work is laid out and most of it has been tested, I should be able to catch up quickly once I build the real prototypes for these parts.

Next week:

I will start by trying to source op amps and other fundamental components for breadboarding the filter. Once I know whether that is possible, I will order any components I still need on Monday in order to keep making progress. By the end of next week, I want to have either prototyped our filter or built the first iteration of our scoring.

D1 Team Status Report 2/15

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

Our biggest risks at the moment come from our signal processing for scoring and from our ability to do vocal removal with low latency. Scoring is risky because we have recently moved away from pitch detection and are now developing new ideas for metrics; if we are not able to select one and start testing it as soon as possible, we risk missing out on a major component of our gamification. For vocal removal, we have confirmed that it is possible with low latency in software, but we ideally still need this to become a hardware system. Both risks are being managed by quick decision making: we are finalizing design choices right now and hope to get to a testing phase and prototype these pieces as soon as possible. As for contingencies, for scoring we have a range of options, including some extremely easy (but non-ideal) workarounds. For vocal removal, there is always the option of using an AI model to do the work, which has been proven to be possible.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

The primary change is switching from a bandpass/bandstop filter to the subtraction method that takes advantage of stereo output. There is not much surface-level cost, but this does limit our potential music library to songs that exist in true stereo. It will simply require a small check with the Spotify API to verify this before allowing the user to confirm a song.
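For illustration, a local-file version of that check might look like the sketch below. This is not the Spotify API call itself; the file format (16-bit PCM stereo WAV) and the threshold are assumptions for the example. The goal is to reject "dual-mono" files whose channels are identical, since those would defeat the subtraction method.

    import wave
    import numpy as np

    def is_true_stereo(path, threshold=1e-3):
        # Hypothetical check on a local 16-bit PCM WAV file (format assumed, not guaranteed).
        with wave.open(path, "rb") as wav:
            if wav.getnchannels() != 2:
                return False
            frames = wav.readframes(wav.getnframes())
        samples = np.frombuffer(frames, dtype=np.int16).reshape(-1, 2)
        left = samples[:, 0].astype(np.float64)
        right = samples[:, 1].astype(np.float64)
        # If left and right are (nearly) identical, subtraction would cancel everything.
        diff_energy = np.mean((left - right) ** 2)
        total_energy = np.mean(left ** 2) + np.mean(right ** 2) + 1e-12
        return diff_energy / total_energy > threshold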


Additional Questions:

Part A (Written by Kiera): JustPerform is designed to promote the psychological well-being of our users by creating a simple and affordable at-home karaoke experience. Self-expression through singing and dancing can be an effective way of alleviating stress and improving mental health. Our product makes this possible for those who may not be able to afford to go to a karaoke venue.
Part B (Written by Aleks): JustPerform promotes social interaction by providing a fun group experience for at-home events. Music and dance are universal cultural expressions, and by allowing users to convert their personal music libraries into karaoke tracks, JustPerform enables greater personalization and cultural relevance, accommodating different musical tastes and traditions. Additionally, the incorporation of dancing encourages physical activity, promoting health and well-being.
Part C (Written by Hugo): From an economic perspective, JustPerform presents an affordable alternative to traditional karaoke systems, combining all facets of the karaoke experience into one product. By letting users draw on their own music libraries, it avoids the limitation of traditional karaoke machines, which are restricted to the preloaded library they come with. Its multi-functionality also appeals to a broad market, including households, party venues, and more, increasing demand and potential revenue streams.

Hugo’s Status Report 2/15

Continuing from where I left off last week, I spent this week looking into the specific methods for implementing the system. Because I am our primary audio processing lead, my work revolved around fleshing out not only the systems for vocal removal, but also the full outline of how our data is passed around and what kind of processing we need to do. First, after our proposal presentation, we had a slight change of path with regard to how we want to do scoring for our game. Originally, we had intended to work with pitch detection, but we now wanted something that more accurately captured the karaoke experience for the average user. We came up with a series of new metrics and strategies, and we are continuing to analyze them and pick a specific plan.

Then I took time to investigate the vocal removal aspect. Because this is such a fundamental part of the project, it is imperative that it works as expected. I used MATLAB to run tests, passing audio files through filters to see the effects of our original bandpass and bandstop filtering idea. In the end, this was not effective: although the bandpass could extract a weak signal that almost isolated the vocal (often leaving in percussion), the bandstop filter was next to useless at removing the vocals from the backing track. So I moved on to testing methods that take advantage of audio in stereo format, cancelling out the vocals by subtracting the left and right channels. This did provide favorable results while still allowing us to work with a hardware system as we had originally hoped.

Finally, I read into the actual wiring and built up a design for splitting the two audio channels, passing them through our subtractor, adding in the microphone input, and outputting to a speaker. I also took some time to look at possible speaker options and assessed whether this was a part we wanted to allocate a substantial amount of the budget to, as it is crucial that the sound quality is high enough for the product to meet our user requirements.
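As a footnote, the left-minus-right subtraction test described above can be reproduced in a few lines of software (the actual product will do this in analog hardware); the file names and 16-bit stereo WAV format below are assumptions for illustration.

    import numpy as np
    from scipy.io import wavfile

    # Software sketch of the stereo-subtraction test; the real system does this
    # subtraction in analog circuitry.
    rate, data = wavfile.read("song_stereo.wav")  # assumed 16-bit stereo WAV
    left = data[:, 0].astype(np.float64)
    right = data[:, 1].astype(np.float64)

    # Vocals are usually mixed to the center (equal in both channels), so L - R
    # cancels them while side-panned instruments survive.
    instrumental = left - right

    # Normalize and save a mono result for listening tests.
    instrumental /= max(1.0, np.max(np.abs(instrumental)))
    wavfile.write("instrumental_test.wav", rate, (instrumental * 32767).astype(np.int16))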

Aleks’ Status Report 2/15

This week, I focused on working through our design requirements with our group. I participated in multiple group sessions to brainstorm the best solutions and to both give and receive feedback on our individual project research areas. I continued my research into the Spotify API to ensure it met our updated requirements. I also researched the best strategy for retrieving lyrics for our application, spending time evaluating multiple APIs, web resources, and tools for accuracy and availability. We also got feedback that our Gantt chart needed improvement, so I created a true Gantt chart and updated the tasks and timeline to be more fully fleshed out and aligned with our new design decisions. I also spent substantial time working on our presentation slides and going through gradual iteration and revision with my group. Next week, I hope to spend a bit of time actually building with our resources, and then I will focus on our design report, taking into account any feedback from our presentation.

Hugo Weekly Journal 2/8/25

At the beginning of the week, I was tasked with fleshing out our testing plans. Our original abstract had the fundamental ideas but was missing detailed explanations of how we intended to accomplish them, so I wrote up our plans more thoroughly. After our design presentation, I moved on to researching methods for extracting and analyzing the singer's pitch. I started with design work, making final decisions about how we would connect our microphone to both our computer and our speaker, and decided that it would route directly to the computer rather than through the speaker as originally planned. Then I looked into Python libraries for pitch detection, first considering librosa, which some of our classmates commonly use, but decided to start with aubio. Aubio's main selling point for us is that it can do pitch detection on streaming data, which is a stretch goal for our project.
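As a small illustration of what drew us to aubio, a file-based pitch detection loop looks roughly like the sketch below; the file name, window sizes, and "yin" method are placeholder choices, and the streaming version would feed the same pitch object from a microphone callback instead.

    import aubio

    hop_size = 512
    src = aubio.source("vocal_take.wav", 0, hop_size)  # 0 = use the file's own sample rate
    pitch_detector = aubio.pitch("yin", 2048, hop_size, src.samplerate)
    pitch_detector.set_unit("Hz")
    pitch_detector.set_tolerance(0.8)

    while True:
        samples, read = src()                  # next hop of mono samples
        frequency = pitch_detector(samples)[0]
        confidence = pitch_detector.get_confidence()
        if confidence > 0.8:
            print(f"{frequency:.1f} Hz (confidence {confidence:.2f})")
        if read < hop_size:                    # end of file
            break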

My progress is mostly on schedule. My first tasks are all related to microphone audio processing, so finding the necessary libraries and planning out the design was a very good start. However, I want to start testing the software as soon as possible, so next week, on top of my planned research, I will make sure to create a proof of concept for this process.