Lucy’s Weekly Reports

12/6

This week, I concentrated on final presentation preparation and system refinement. I resolved the remaining bugs in the game implementation, created the final presentation slides, and rehearsed the presentation. Additionally, I conducted quantitative testing of the software components, measuring gesture recognition latency and evaluating hand gesture classification accuracy across different movements.
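For reference, the timing harness I used was conceptually similar to the sketch below; the classifier and audio-trigger functions here are simulated stand-ins, not our actual pipeline code:

```python
import time
import random

# Hypothetical stand-ins for the real pipeline stages; in the actual code these
# would be the MediaPipe-based gesture classifier and the pygame audio trigger.
def classify_gesture(frame):
    time.sleep(random.uniform(0.01, 0.03))   # simulated inference cost
    return "G"

def trigger_audio(chord):
    time.sleep(0.005)                        # simulated playback start cost

latencies_ms = []
for frame in range(100):                     # 100 simulated frames
    start = time.perf_counter()
    chord = classify_gesture(frame)
    trigger_audio(chord)
    latencies_ms.append((time.perf_counter() - start) * 1000)

print(f"mean latency: {sum(latencies_ms) / len(latencies_ms):.1f} ms")
print(f"worst case:   {max(latencies_ms):.1f} ms  (target: < 100 ms)")
```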

My progress remains on schedule with no delays.

Before the demo, my priority is debugging the audio rendering pipeline to achieve our target latency of under 100ms. Following the demo, I will shift focus to the final deliverables: the technical report, demonstration video, and project website.

11/22

This week, I implemented the game mode for our air guitar project. Users can now choose to enter game mode and select from three songs: Twinkle Twinkle Little Star, Mary Had a Little Lamb, and Happy Birthday. During gameplay, the program displays real-time labels showing which chord to play and the corresponding hand gesture. At the end of each song, players receive a score based on their performance. We discussed several design choices with the School of Music, including whether a user should be allowed to move on to the next chord after striking a wrong one. We concluded that the user should be able to move on, but with a deduction to their score. We also discussed when the user should be deemed to be playing the wrong chord; further adjustments are needed to switch from flagging a wrong chord whenever the hand pattern changes to flagging it only when the wrong pattern is held at the moment of a strum.
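A minimal sketch of the scoring behavior we agreed on is shown below; the song data, gesture labels, and point values are illustrative placeholders rather than our exact implementation, and it assumes the adjusted rule of judging chords only at strum events:

```python
# Illustrative game-mode scoring sketch (song data and point values are made up).
SONG = [("C", "open_palm"), ("G", "three_fingers"), ("C", "open_palm")]

def play_song(events):
    """events: iterable of (gesture, strummed) pairs from the CV + IMU pipeline."""
    score, idx = 0, 0
    for gesture, strummed in events:
        if idx >= len(SONG):
            break
        if not strummed:
            continue                      # judge only at strums, not every pattern change
        expected_chord, expected_gesture = SONG[idx]
        if gesture == expected_gesture:
            score += 10                   # correct chord at the strum
        else:
            score -= 5                    # wrong chord: deduct, but still move on
        idx += 1                          # advance to the next chord either way
    return score

# Example: the second strum uses the wrong gesture, so 10 - 5 + 10 = 15.
print(play_song([("open_palm", True), ("open_palm", True), ("open_palm", True)]))
```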

My progress is on schedule, and there are no delays.

Next week, I will focus on working on the final presentation and continue fixing bugs for the game version of the project.

Status Report Specific:

Throughout this project, I needed to learn several new tools and techniques. The primary areas included computer vision with OpenCV and MediaPipe for hand tracking and gesture recognition, which was essential for detecting finger positions and translating them into musical chords. I also had to learn audio processing with Python libraries to generate guitar sounds in real time based on detected gestures. As the project progressed into the game mode phase, I deepened my knowledge of game state management to track user progress and validate inputs against expected gestures.

My approach to acquiring this knowledge was primarily hands-on and iterative. I started by reading OpenCV and MediaPipe documentation along with online tutorials to understand the fundamentals of hand tracking and gesture detection. I then learned through trial and error, testing different approaches to gesture mapping and audio playback while refining based on what worked best. I also examined existing projects combining computer vision with interactive applications to understand common design patterns and potential issues. When stuck on particularly challenging bugs, such as improving detection accuracy or reducing latency, I discussed approaches with my teammates and timed parts of my code so we could discover where the bottleneck was. This helped me see problems from different angles and discover solutions I wouldn’t have found alone.

11/15

This week, I successfully integrated guitar sound samples into the software system, providing an alternative audio mode for the air guitar. I also collaborated with Taj and Alexa to develop the interim demo presentation slideshow. Additionally, I updated the project Gantt chart to reflect completed and remaining tasks before the final demo. Finally, I redesigned the application UI to improve usability, enlarging the camera window for better hand tracking visibility and reorganizing the feature controls for more intuitive access.

My progress remains on schedule and there are no delays.

Next week, I will focus on implementing the game-mode features identified during our user testing sessions with School of Music students. I will also integrate IMU velocity data into the delay calculation system, replacing the current manual delay bar with velocity-based timing that responds to strum speed.
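One possible shape for that velocity-based timing is sketched below; the velocity range and delay bounds are made-up placeholders that would need to be calibrated against the real IMU data:

```python
# Hypothetical mapping from IMU strum velocity to playback delay.
MIN_VEL, MAX_VEL = 0.2, 3.0          # strum speed range from the IMU (arbitrary units)
MAX_DELAY_MS, MIN_DELAY_MS = 250, 20  # placeholder bounds, not calibrated values

def delay_from_velocity(velocity: float) -> float:
    """Faster strums -> shorter delay, clamped to the configured bounds."""
    v = min(max(velocity, MIN_VEL), MAX_VEL)
    t = (v - MIN_VEL) / (MAX_VEL - MIN_VEL)          # normalize to [0, 1]
    return MAX_DELAY_MS + t * (MIN_DELAY_MS - MAX_DELAY_MS)

for v in (0.2, 1.5, 3.0):
    print(f"velocity {v:.1f} -> delay {delay_from_velocity(v):.0f} ms")
```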

11/8

This week, I collaborated with Alexa to verify Bluetooth functionality on the ESP32 by connecting it to a mobile Bluetooth app. We then tested data transmission using Alexa’s Python script to ensure that the signals sent from the ESP32 were received correctly. Because I was attending a conference during the week, I continued contributing remotely by creating the testing feedback form for our upcoming demo session with School of Music participants.
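For context, a Python receiver along these lines can be sketched with the bleak library, assuming the ESP32 sends data through a BLE notify characteristic; the address and UUID below are placeholders, and Alexa's actual script may be structured differently:

```python
import asyncio
from bleak import BleakClient

ESP32_ADDRESS = "24:6F:28:00:00:00"                   # placeholder MAC address
CHAR_UUID = "0000ffe1-0000-1000-8000-00805f9b34fb"    # placeholder characteristic UUID

def on_notify(sender, data: bytearray):
    # Print each packet sent by the ESP32 so we can verify transmission.
    print(f"received {len(data)} bytes: {data.hex()}")

async def main():
    async with BleakClient(ESP32_ADDRESS) as client:
        await client.start_notify(CHAR_UUID, on_notify)
        await asyncio.sleep(30)                       # listen for 30 seconds
        await client.stop_notify(CHAR_UUID)

asyncio.run(main())
```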

My progress remains on schedule, and I am not currently facing any delays or blockers.

Next week, I plan to work closely with Alexa and Taj to further calibrate the strumming sensitivity and review feedback gathered from the School of Music testing session. From there, we will begin developing solutions to address the identified issues.

11/1

This week, I edited one vocalist’s recorded audio sample, integrated it into the codebase, and successfully replaced the existing synthesized notes. I also enabled simultaneous note playback to support chord generation and updated the trigger logic so that each note plays only once per hand gesture, eliminating repeated playback artifacts.
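Roughly, the two changes look like the sketch below, where several pygame Sound objects are started together to form a chord and a chord is retriggered only when the detected gesture changes; the gesture labels and file paths are illustrative, not our actual assets:

```python
import pygame

pygame.mixer.init()

# Illustrative file names; the real project uses the edited vocal samples.
CHORDS = {
    "open_palm":     ["samples/C4.wav", "samples/E4.wav", "samples/G4.wav"],   # C major
    "three_fingers": ["samples/G3.wav", "samples/B3.wav", "samples/D4.wav"],   # G major
}
SOUNDS = {g: [pygame.mixer.Sound(f) for f in files] for g, files in CHORDS.items()}

last_gesture = None

def on_gesture(gesture):
    """Play the chord once when the gesture changes, not on every camera frame."""
    global last_gesture
    if gesture == last_gesture or gesture not in SOUNDS:
        return
    for sound in SOUNDS[gesture]:
        sound.play()                 # overlapping plays mix into a chord
    last_gesture = gesture
```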

My progress remains on schedule, with no delays or blockers.

Next week, I plan to collaborate with Alexa and Taj to integrate the hardware and software components further. I will also prepare targeted questions for School of Music students ahead of our user testing sessions. If time permits, I will continue processing the remaining two vocal recordings into individual notes.

Link to video with synthesized notes

Link to video with voices

10/25

This week, I implemented the portion of the algorithm that maps different hand gestures to distinct sounds. I downloaded a .rar file containing a range of musical notes in .mp3 format, added the files to the codebase, and used the pygame.mixer library to play them since it supports that format. Each gesture now successfully triggers a corresponding sound, as demonstrated in the attached video. Additionally, I recorded voice samples from members of the School of Music for future integration, allowing the system to play their voices instead of instrumental notes.
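At its core, the mapping works like the simplified sketch below; the gesture names and note files are placeholders for the actual assets:

```python
import pygame

pygame.mixer.init()

# Placeholder paths; the real assets are the downloaded .mp3 note files.
NOTE_FILES = {
    "fist":        "notes/E2.mp3",
    "one_finger":  "notes/A2.mp3",
    "two_fingers": "notes/D3.mp3",
    "open_palm":   "notes/G3.mp3",
}
NOTES = {gesture: pygame.mixer.Sound(path) for gesture, path in NOTE_FILES.items()}

def play_for_gesture(gesture: str) -> None:
    """Look up the detected gesture and play its note, ignoring unknown labels."""
    sound = NOTES.get(gesture)
    if sound is not None:
        sound.play()

play_for_gesture("open_palm")
```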

My progress remains on schedule, with no current delays or blockers.

Next week, I plan to edit and refine the recorded vocal samples, import them into the codebase, and experiment with using them as replacements for the existing notes. I also intend to explore functionality for playing multiple notes simultaneously to produce chords.

10/18

This week was fall break, so progress was limited due to reduced team activity and schedule constraints. However, substantial work was completed in the prior week.

During the previous week, I made significant contributions to the design report by drafting multiple core sections, including the Introduction, Use Case Requirements, Design Requirements, parts of the Trade Studies, Software Implementation Plan, Testing and Verification Plan, and Project Management. These sections now connect system objectives with measurable performance targets and a structured development timeline.

In parallel, I began exploring the computer vision subsystem for hand recognition, specifically investigating MediaPipe’s hand-tracking framework for chord detection. I also experimented with NumPy and PyAudio to handle audio synthesis and chord manipulation, testing how note frequencies can be programmatically generated and combined to emulate realistic guitar tones. These experiments are intended to validate the feasibility of the audio rendering pipeline before integrating with the gesture-recognition module.
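The kind of experiment I ran looks roughly like the following sketch, which sums sine waves for a chord's note frequencies and applies a simple decay envelope; the real module will need a richer, less artificial guitar tone:

```python
import numpy as np
import pyaudio

SAMPLE_RATE = 44100
DURATION = 1.5                                    # seconds

def chord_wave(freqs, duration=DURATION, rate=SAMPLE_RATE):
    """Sum sine waves for each note frequency and apply a rough pluck-like decay."""
    t = np.linspace(0, duration, int(rate * duration), endpoint=False)
    wave = sum(np.sin(2 * np.pi * f * t) for f in freqs)
    wave *= np.exp(-3 * t)                        # simple exponential decay envelope
    wave /= np.max(np.abs(wave))                  # normalize to avoid clipping
    return wave.astype(np.float32)

# E minor chord (E3, G3, B3) as an example set of note frequencies.
samples = chord_wave([164.81, 196.00, 246.94])

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=SAMPLE_RATE, output=True)
stream.write(samples.tobytes())
stream.stop_stream()
stream.close()
p.terminate()
```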

Overall, my progress remains on schedule, and there have been no delays or blockers. The current focus aligns with our planned milestones for this stage of the project.

Next week, I plan to continue developing the PyAudio-based sound synthesis module to generate test samples and record prototype chord sequences. This will allow for preliminary evaluation of sound quality, latency, and responsiveness before hardware integration. We will also reach out to the School of Music participants so they can give feedback on, and potentially test, our progress so far.

10/4

This week, I completed my portion of the design presentation, finalizing the content and visuals for the hardware design and system overview sections. I also set up and tested the Veremin GitHub repository on my local machine to understand how the original codebase handled computer vision and sound synthesis. After getting it to run locally, I explored the structure of the code to identify which components we could reuse for our implementation. I also began drafting my portions of the design report, which I will expand in more detail next week.

My progress is on schedule. Finishing the design presentation and testing the Veremin repo provided a foundation for the next stage of hardware integration and firmware development. No delays have occurred so far.

Next week, I plan to begin integrating the computer vision hand-recognition with the Flask server. I will also work on the design report further. If time permits, I will start developing the data visualization and calibration tools to support initial testing.

9/27

This week, I focused on refining our project concept and researching the computer vision component in depth. Together with Taj and Alexa, we finalized the idea of using a hybrid design: MediaPipe Hands for detecting one hand’s finger/joint positions and the IMU wristband with haptic pads for strumming. My personal contribution was conducting research into how MediaPipe Hands works, including its ability to track 21 hand landmarks in real time. I evaluated how finger bends could be mapped to different chords, making the system more expressive and customizable. I also compared this approach to full-arm tracking (PoseNet), noting that MediaPipe Hands provides finer granularity and better aligns with our accessibility goals.
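As a concrete starting point, the minimal demo I have in mind looks roughly like the sketch below; the finger-bend check is a deliberately crude heuristic using MediaPipe's standard landmark indexing (8 = index fingertip, 6 = index middle joint):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Crude "index finger bent" check: is the fingertip (8) below the
            # middle joint (6) in image coordinates? (y grows downward)
            index_bent = lm[8].y > lm[6].y
            print("index finger bent" if index_bent else "index finger extended")
        cv2.imshow("hand tracking", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```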

In addition, I attended our meetings with Professor Mukherjee, mentor Belle, Professor Dueck, and John Cohn, and documented feedback. I researched integration tools, including TensorFlow.js, Web MIDI API, Web Audio API, Tone.js, and MQTT/WebSockets, based on the “veremin” codebase for connecting hand-tracking data with real-time audio synthesis. I also created the base codebase, forked from the “veremin” project, and took a deep dive into the code.

I also created a Figma prototype of what our final project might look like.

Our team is currently on schedule. By narrowing the scope to one hand under computer vision and combining it with our hardware strumming mechanism, we reduced project complexity while still maintaining an engaging use case. If any delays arise in integrating MediaPipe Hands with the audio pipeline, I will focus on building a minimal working prototype that detects just one or two gestures first, then expand.

By next week, I plan to implement a basic MediaPipe Hands demo that detects hand landmarks from webcam input, experiment with mapping one or two finger bends to simulated chord events, and draft a short requirements list outlining how the hand-tracking data will connect to Tone.js via the Web MIDI API.

9/20

This week, I focused on developing our proposal presentation and coordinating with Jocelyn, the representative from the School of Music. Together, we finalized the use case for our project and made adjustments to the design direction.

Instead of creating an air guitar with haptic pads and a strumming wristband, we decided to pivot toward building a video theremin (“veremin”) inspired by a pre-existing project. Our main innovation will be shifting from computer vision that tracks large hand movements to a system that tracks individual finger movements. As a possible hardware extension, users could press a button to change chords while moving their hands, producing a strumming effect.

The chosen use case centers on accessibility. For individuals with injuries or disabilities who may have limited hand mobility, interacting with haptic pads could be difficult. Our new design aims to lower these barriers, making it easier for them to create and enjoy music.

Our team is currently on schedule. By collaborating closely and communicating constantly with Jocelyn, we can develop a product that satisfies both an engineer's and a musician's point of view.

By next week, I plan to continue researching how our project can be realized, complete the design presentation, and develop further details of our idea.