Taj’s Weekly Reports

9/20

This week my team presented our proposal on Monday. Outside of preparing for that, I spent time learning how theremins function, since prior to a week ago I had never heard or seen the instrument. Specifically, I looked at how they use electromagnetic fields and antenna capacitance to generate sound. In parallel, I worked on understanding chords from a musical perspective, since I do not have prior experience with string instruments.
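
Mostly as a note-to-self, the pitch side of a theremin boils down to two oscillators beating against each other: a fixed reference and a variable LC oscillator whose tank capacitance includes the player's hand. The sketch below is a toy model with made-up component values (none of these numbers come from real hardware), just to see how a small change in hand capacitance shifts the audible beat frequency.

```python
import math

# Toy heterodyne model of a theremin pitch antenna.
# Component values are illustrative guesses, not measured hardware.
L = 1e-3            # oscillator inductance (H)
C_FIXED = 100e-12   # fixed tank capacitance (F)
F_REF = 1 / (2 * math.pi * math.sqrt(L * C_FIXED))  # fixed reference oscillator

def beat_frequency(c_hand_pf: float) -> float:
    """Audible pitch = |F_REF - f_variable| when the hand adds c_hand_pf picofarads."""
    c_total = C_FIXED + c_hand_pf * 1e-12
    f_variable = 1 / (2 * math.pi * math.sqrt(L * c_total))
    return abs(F_REF - f_variable)

for c in (0.5, 1.0, 2.0, 4.0):  # hand closer to the antenna -> more capacitance
    print(f"hand capacitance {c:4.1f} pF -> beat ~{beat_frequency(c):7.1f} Hz")
```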

Additionally, I collaborated with my teammates to continue refining our use case, which is evolving as we receive more input from Jocelyn at the School of Music. Jocelyn’s more computer-vision-oriented approach pushed me to start researching Python libraries for CV applications. In particular, I focused on learning the basics of OpenCV, with the goal of exploring whether it can be used to detect finger positions via a camera and map those movements to chord recognition.
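
As a first experiment (not a committed design choice yet), the rough shape of a finger-position pipeline in OpenCV looks like the sketch below: grab webcam frames, isolate the hand with a simple HSV skin mask, and pull fingertip candidates from the convex hull of the largest contour. The HSV bounds and camera index are assumptions I would still have to tune.

```python
import cv2
import numpy as np

# Rough OpenCV sketch: fingertip candidates from a webcam frame.
# HSV skin-tone bounds and camera index 0 are assumptions to be tuned.
LOWER_SKIN = np.array([0, 40, 60], dtype=np.uint8)
UPPER_SKIN = np.array([25, 180, 255], dtype=np.uint8)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_SKIN, UPPER_SKIN)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    # OpenCV 4.x return signature (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)   # assume the largest blob is the hand
        hull = cv2.convexHull(hand)                 # hull points ~ fingertip candidates
        for pt in hull[:, 0, :]:
            cv2.circle(frame, (int(pt[0]), int(pt[1])), 6, (0, 255, 0), 2)
        cv2.drawContours(frame, [hand], -1, (255, 0, 0), 2)

    cv2.imshow("finger candidates", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```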


9/27

This week, on Wednesday, our team met with Dr. Dueck, the rest of the students in the Engineering Creative Interactions: Music + AI course, and John Cohn. In particular, I had a conversation with John about the hardware aspects of our project. He strongly supported the idea of using one hand for strumming with an IMU and the other hand for chord selection with a computer vision approach. He also recommended several hardware options he had previously worked with that could fit our needs, including:

  • ESP32 + IMU
  • M5Stack (ESP32-C3 + IMU)
  • ESP32 with capacitive touch inputs (5 channels)
  • MPR121 capacitive touch sensor (see the read sketch after this list)
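
To get a feel for the MPR121 option before ordering anything, I sketched what reading its touch status looks like over I²C. This is a host-side Python illustration using smbus2 (e.g. on a Raspberry Pi style machine), not ESP32 firmware; the default address 0x5A and the two status registers come from the datasheet, while the bus number and the rest of the setup are assumptions.

```python
from smbus2 import SMBus

MPR121_ADDR = 0x5A        # default I2C address (ADDR pin tied to GND)
TOUCH_STATUS_L = 0x00     # electrodes 0-7
TOUCH_STATUS_H = 0x01     # electrodes 8-11

def read_touched(bus: SMBus) -> list[int]:
    """Return the electrode numbers currently reporting a touch.

    Assumes the MPR121 has already been configured (electrodes enabled via its
    control register) by earlier setup code not shown here.
    """
    low = bus.read_byte_data(MPR121_ADDR, TOUCH_STATUS_L)
    high = bus.read_byte_data(MPR121_ADDR, TOUCH_STATUS_H)
    bits = low | (high << 8)
    return [e for e in range(12) if bits & (1 << e)]

if __name__ == "__main__":
    with SMBus(1) as bus:             # bus 1 is an assumption (typical on a Pi)
        print("touched electrodes:", read_touched(bus))
```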

Following that discussion, I spent time researching these modules and additional sensors for movement tracking. My goal is to finalize options and place an order by early next week.

In parallel, I have also been contributing to our design presentation slides. To improve upon our proposal presentation, I am focusing on clarifying specific metrics and structuring the flow more effectively. In particular, I am ensuring that the presentation builds toward our solution and MVP more gradually, since in our previous presentation we introduced them too early.


10/4

This week, I focused on preparing for our design presentation on Monday, which I delivered to the class. The presentation required extra effort since we recently shifted our project to incorporate computer vision, a change that significantly altered our design approach and testing plans.

Following the presentation, I began exploring repositories and tutorials related to interfacing the MPU6050 IMU with an STM32 microcontroller. Since we plan to use the I²C protocol for communication, I concentrated specifically on I²C-focused resources. Two particularly helpful references were this GitHub repository and this tutorial. These resources will prepare me to efficiently collect and process data from the IMU once it arrives.
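
The actual target is the STM32 (where I would use the HAL I²C calls covered in those references), but the register-level transaction is the same on any host, so I captured it as a small Python/smbus2 sketch I can run on any Linux machine with an I²C bus. The register addresses and scale factors come from the MPU6050 datasheet, assuming the default ±2 g and ±250 °/s ranges; the bus number is an assumption.

```python
from smbus2 import SMBus

MPU6050_ADDR = 0x68       # default address with AD0 low
PWR_MGMT_1 = 0x6B         # power management register (device boots asleep)
ACCEL_XOUT_H = 0x3B       # first of 14 data registers (accel, temp, gyro)

def read_imu(bus: SMBus) -> dict:
    """Read one accel/gyro sample; assumes default ±2 g and ±250 °/s ranges."""
    raw = bus.read_i2c_block_data(MPU6050_ADDR, ACCEL_XOUT_H, 14)

    def to_int16(hi: int, lo: int) -> int:
        v = (hi << 8) | lo
        return v - 65536 if v & 0x8000 else v

    ax, ay, az = (to_int16(raw[i], raw[i + 1]) / 16384.0 for i in (0, 2, 4))   # g
    gx, gy, gz = (to_int16(raw[i], raw[i + 1]) / 131.0 for i in (8, 10, 12))   # deg/s
    return {"ax": ax, "ay": ay, "az": az, "gx": gx, "gy": gy, "gz": gz}

if __name__ == "__main__":
    with SMBus(1) as bus:                                    # bus number is an assumption
        bus.write_byte_data(MPU6050_ADDR, PWR_MGMT_1, 0x00)  # wake from sleep
        print(read_imu(bus))
```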


10/18

For the week of Oct 11-Oct 18, I did not complete much project work because I was traveling with my family for fall break. In the prior week, I concentrated on the design report with a focus on hardware details we had not previously specified. I researched the power system and battery connectors, identified libraries and driver requirements for our two candidate IMUs, mapped the I²C wiring and address plan to the ESP32 pinout (including pull-ups), and compared two touch-gate implementations: a capacitive touch sensor on I²C versus a force-sensing resistor into an ADC channel.
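
To make the wiring section concrete for myself, I wrote the planned bus map down as data. The device address is the chip's datasheet default; the ESP32 pin choices, pull-up value, and the FSR threshold are my current assumptions (the Arduino-default I²C pins and a common resistor value), not final decisions.

```python
# Draft I2C plan for the ESP32 node. Pins, pull-up value, and thresholds are
# assumptions; the IMU address is the datasheet default.
I2C_BUS_PLAN = {
    "sda_gpio": 21,          # ESP32 default SDA (assumption, may change with board layout)
    "scl_gpio": 22,          # ESP32 default SCL (assumption)
    "pullups_ohm": 4700,     # external pull-ups to 3.3 V on SDA/SCL
    "devices": {
        "mpu6050_imu": {"addr": 0x68, "note": "0x69 if AD0 is pulled high"},
        "touch_gate": {"addr": None, "note": "TBD once the capacitive part is chosen"},
    },
}

# FSR alternative: a simple threshold on an ADC pin instead of an I2C device.
FSR_GATE = {"adc_gpio": 34, "threshold_counts": 1500}   # both values are placeholders

def gate_active_from_adc(counts: int) -> bool:
    """Treat the FSR as a binary strum gate: pressed when the reading exceeds the threshold."""
    return counts > FSR_GATE["threshold_counts"]
```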

I also defined the on-device feature and packet schema. The ESP32 samples the active IMU at 120 Hz and polls the chosen gate at a higher rate. Each record carries a millisecond timestamp and a scalar strumming-speed feature S computed on-device without full strum segmentation. For each sample, the node computes the angular velocity magnitude |ω| = sqrt(gx² + gy² + gz²), takes the RMS over a 50 ms window, and normalizes the result to the range [0, 1] for transmission. When the gate is active, the node streams IMU vectors plus S; when inactive, it pauses to save bandwidth and power.

The planned ESP-NOW packet schema is {ts_ms, imu: {ax, ay, az, gx, gy, gz}, gate_active, S}, where ts_ms is the capture timestamp in milliseconds, imu.ax/ay/az are linear accelerations in m/s², imu.gx/gy/gz are angular velocities in rad/s, gate_active is a boolean indicating whether the Trill or FSR gate is engaged, and S is the normalized strumming-speed feature. Here the Trill is the capacitive touch sensor and the FSR is the force-sensing resistor.
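
The feature math above is simple enough to pin down now, so below is an offline Python sketch of exactly that computation (RMS of |ω| over a 50 ms window at 120 Hz, i.e. about 6 samples, clipped into [0, 1]) together with the packet laid out as a dict. The normalization constant OMEGA_MAX is a placeholder I still need to calibrate from real strumming data, and the real node will do this in firmware rather than Python.

```python
import math
from collections import deque

SAMPLE_RATE_HZ = 120
WINDOW_MS = 50
WINDOW_SAMPLES = max(1, round(SAMPLE_RATE_HZ * WINDOW_MS / 1000))  # ~6 samples
OMEGA_MAX = 20.0   # rad/s full scale for normalization -- placeholder, needs calibration

_window = deque(maxlen=WINDOW_SAMPLES)

def strumming_speed(gx: float, gy: float, gz: float) -> float:
    """Update the 50 ms window with |omega| and return S, the RMS normalized to [0, 1]."""
    omega = math.sqrt(gx * gx + gy * gy + gz * gz)
    _window.append(omega)
    rms = math.sqrt(sum(w * w for w in _window) / len(_window))
    return min(rms / OMEGA_MAX, 1.0)

def build_packet(ts_ms: int, ax, ay, az, gx, gy, gz, gate_active: bool) -> dict:
    """Lay out one record exactly as in the planned ESP-NOW schema."""
    return {
        "ts_ms": ts_ms,
        "imu": {"ax": ax, "ay": ay, "az": az, "gx": gx, "gy": gy, "gz": gz},
        "gate_active": gate_active,
        "S": strumming_speed(gx, gy, gz) if gate_active else 0.0,
    }

# Example: one fake sample while the gate is pressed.
print(build_packet(ts_ms=123456, ax=0.1, ay=-9.8, az=0.3,
                   gx=2.0, gy=-1.5, gz=0.7, gate_active=True))
```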