Belle’s Status Report for 12/7

This week was mainly spent working on our final poster and helping to integrate Elliot’s changes to the Bluetooth code into the current codebase. My final presentation on Monday went quite well, but I claimed multiple times that Elliot’s changes would fix most (if not all) of our dual-stick lag issues, so getting them merged was a priority. I also created a couple of graphics for the poster and cleaned it up overall, referencing posters that did well in previous years so that our upcoming presentations go as smoothly as possible.

Since we are still supposed to be integrating all of our components and making final changes and tweaks to our system, I believe we are on track. Next week, I hope to finish the final poster and video, as well as practice my presentation skills for both the TechSpark expo and the 18500 showcase.

Belle’s Status Report for 11/30

This week, I focused on preparing the slides for the final presentation – incorporating the results from our verification and validation tests – and contributed to the drumstick detection portion of our project.

The former involved organizing and presenting the data in a way that highlights how our project meets its use case and design requirements, as well as practicing the general flow of how I would present the relevant tests (since there are many of them and not much time allotted for each presentation, I have to be concise).

As for the drumstick detection, one key aspect of our design was the use of exponential weighting to account for latency when the video frame taken at the moment of an accelerometer impact did not reflect the correct position of the drumstick tip (i.e., it would show the drumstick tip as being in the previous drum’s boundary, rather than the drum that was actually hit). This was particularly a concern because of the potential delay between the moment of impact and the processing of the frame, as we were not sure what said latency would look like.

However, during further testing, we found that this issue was quite rare. The camera’s FPS was sufficiently high, and the CV processing latency was small enough that frames typically matched up with the correct impact timing. As a result, we found that exponential weighting was unnecessary for most scenarios. Additionally, the mutexes required to protect the buffer used for the calculation were introducing unnecessary and unwanted latency. In order to simplify the system and improve overall responsiveness, we scrapped the buffer and exponential weighting completely, which led to a noticeable reduction in latency and slightly smoother performance in general.

Previously, we also found a way to let the user tweak the HSV values themselves using several sliders and a visualizer, and we changed one of the drumstick tips from blue to red, which resolved the related detection issues. As a result, I feel that the drumstick detection portion of the project is mostly done.
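For reference, the slider-based tuner is essentially a small OpenCV trackbar window along the lines of the sketch below (the window name, starting values, and camera index are placeholders rather than our exact code):

    import cv2
    import numpy as np

    def nothing(_):
        pass

    # One trackbar per HSV bound; the user drags them until only the
    # drumstick tip survives the mask.
    cv2.namedWindow("hsv tuner")
    for name, maxv, init in [("H lo", 179, 0), ("S lo", 255, 0), ("V lo", 255, 0),
                             ("H hi", 179, 179), ("S hi", 255, 255), ("V hi", 255, 255)]:
        cv2.createTrackbar(name, "hsv tuner", init, maxv, nothing)

    cap = cv2.VideoCapture(0)                  # webcam index is a placeholder
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        lo = np.array([cv2.getTrackbarPos(n, "hsv tuner") for n in ("H lo", "S lo", "V lo")])
        hi = np.array([cv2.getTrackbarPos(n, "hsv tuner") for n in ("H hi", "S hi", "V hi")])
        mask = cv2.inRange(hsv, lo, hi)
        cv2.imshow("hsv tuner", cv2.bitwise_and(frame, frame, mask=mask))
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()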

According to our Gantt chart, I should still be working with Elliot and Ben to integrate the individual components of our project, so I believe I am on track. Next steps include finalizing preparations for the presentation and continuing to troubleshoot the Bluetooth latency discrepancy between the drumsticks.

Team Status Report for 11/30

This week, our team mainly focused on implementing several validation and verification tests for our project, as well as finalizing our slides for next week’s Final Presentation.

Validation Tests

  • Drumstick Weight: Both drumsticks weigh 147g, well under the 190.4g limit.
  • Minimum Layout: The drum setup achieved a layout area of 1638.3cm², close to (but still under) the required 1644cm².
  • Drum Ring Detection: A 15mm margin was implemented to reduce overlapping issues, successfully scaling drum pad radii.
  • Reliable BLE Connection: At a 3m distance, all impacts were detected with no packet loss.
  • Correct Sound Playback: The system achieved an 89% accuracy for correct drum sound playback. This slightly missed the 90% target and will thus require refinement.
  • Audio Response: Average latency between drum impact and sound playback was 94.61ms, meeting the 100ms limit (despite a notable outlier).

Verification Tests

  • Connective Components: The drumsticks’ wires and tape weighed only 5g per drumstick, far below the 14.45g limit.
  • Latency: Measured latencies include:
    • BLE transmission: 0.088ms (RTT/2) – under 30ms requirement
    • CV processing: 33.2ms per frame – under 60ms requirement
    • Accelerometer processing: 5ms – meeting the 5ms requirement

We were happy with the majority of these results, as they prove that we were indeed able to meet the constraints that we initially placed on ourselves for this project.

We are still facing a few challenges, however: we have found that when both drumsticks are connected via Bluetooth, one often experiences noticeably higher latency than the other. The root cause is unclear and under investigation, so resolving this issue is our next major priority.

Nonetheless, we have incorporated the results of these tests into our presentation slides and will continue working to resolve the Bluetooth latency issue.

Belle’s Status Report for 11/16

This week, I mostly worked with Ben and Elliot to continue integrating & fine-tuning various components of DrumLite to prepare for the Interim Demo happening this upcoming week.

In particular, my main contribution focused on fine-tuning the accelerometer readings. To refine our accelerometer threshold values, we utilized Matplotlib to continuously plot accelerometer data in real time during testing. In these plots, the x-value represented time, and the y-value represented the average of the x and z components of the accelerometer output. This visualization helped us identify a distinct pattern: each drumstick hit produced a noticeable upward spike, followed by a downward spike in the accelerometer readings (as per the sample output screenshot below, which was created by hitting a machined drumstick on a drum pad four times).
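For context, the live plot itself is just a small Matplotlib loop along these lines (read_sample() below is a stand-in for however a reading actually arrives from the ESP32, not our real interface):

    import time
    import matplotlib.pyplot as plt

    def read_sample():
        # Placeholder for one accelerometer reading -> (x, y, z).
        return 0.0, 0.0, 0.0

    plt.ion()
    fig, ax = plt.subplots()
    line, = ax.plot([], [])
    ax.set_xlabel("time (s)")
    ax.set_ylabel("avg of accel x and z")

    times, values = [], []
    start = time.time()
    while len(times) < 500:                     # plot a few seconds of data
        acc_x, _, acc_z = read_sample()
        times.append(time.time() - start)
        values.append((acc_x + acc_z) / 2)
        line.set_data(times, values)
        ax.relim()
        ax.autoscale_view()
        plt.pause(0.001)                        # let the figure redraw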

Initially, we attempted to detect these hits by capturing the “high” value, followed by the “low” value. However, upon further analysis, we determined that simply calculating the difference between the two values would be sufficient for reliable detection. To implement this, we introduced a short 1ms delay between samples, which allowed us to consistently measure the high-to-low difference. Additionally, we decided to incorporate the sign of the z-component of the accelerometer output rather than taking its absolute value. This helped us better account for behaviors such as upward flicks of the wrist, which were sometimes mistakenly identified as downward drumstick hits (and were therefore incorrectly triggering a drum sound). As a result, we were able to filter out similar movements that weren’t downward drumstick strikes onto a drum pad or other solid surface, further refining the precision and reliability of our hit detection logic.
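A minimal sketch of that check is below; the threshold value, the exact axis handling, and the read_sample() helper are placeholders (the real numbers came from the plots above):

    import time

    THRESHOLD = 1.5      # placeholder; tuned from the plotted spikes

    def is_hit(read_sample):
        # read_sample() -> (x, y, z) accelerometer reading.
        x1, _, z1 = read_sample()
        time.sleep(0.001)                       # short delay between the two samples
        x2, _, z2 = read_sample()
        swing = (x1 + z1) / 2 - (x2 + z2) / 2   # high-then-low difference
        # Keep z's sign instead of abs(z): an upward wrist flick pushes z
        # the other way and is rejected here.
        return swing > THRESHOLD and z2 < 0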

To address lighting inconsistencies from previous tests, we acquired another lamp, ensuring the testing desk is now fully illuminated. This adjustment should significantly improve the consistency of our drumstick tip detection by reducing the impact of shadows and uneven lighting. While we are still in the process of testing this two-lamp setup, I currently believe using a YOLO/SSD model for object detection is unnecessary. These models are great for complex environments with many objects, but our current setup — with (mostly) controlled lighting and a single, focused tracking target — is simple enough not to need them. Implementing a YOLO/SSD model would also introduce significant computational overhead, which we want to avoid given our sub-100ms-latency use case requirement. Therefore, I would prefer for this to remain a last-resort solution to the lighting issue.

As per our timeline, we should be fine-tuning and integrating the different project components, and since we are essentially done setting the accelerometer threshold values, we are on track. Currently, picking a specific HSV value for each drumstick is a bit cumbersome and unpredictable, especially in areas with a large amount of ambient lighting. Therefore, next week, I aim to further test drumstick tip detection under varying lighting conditions and try to simplify that process, as I believe it is the least-solid aspect of our implementation at the moment.

Belle’s Status Report for 11/9

This week, we mainly focused on integrating the different components of our project to prepare for the Interim Demo, which is coming up soon.

We first successfully integrated Elliot’s Bluetooth/accelerometer code into the main codebase. The evidence of this success was an audio response (a drum sound) being triggered by making a hit motion with the accelerometer and ESP32 in hand.

We then aimed to integrate my drumstick tip detection code, which was a bit more of a challenge. The main issue concerned picking the correct HSV/RGB color values with respect to lighting and the shape of the drumstick tip. We positioned the drumstick tip (which we had colored bright red) on the desk, in view of the webcam, and took a screenshot of the output. I then took this image and used an HSV color picker website to get HSV values for specific pixels in the screenshot. However, because of the tip’s rounded, oval-like shape, we have to consider multiple shadow, highlight, and mid-tone values. Picking a pixel that was too light or too dark would cause the drumstick tip to only be “seen” sometimes, or cause too many other things to be “seen”. For example, sometimes the red undertones in my skin would be tracked along with the drumstick tip, or the tip would only be visible in the more brightly lit areas of the table.

In order to remedy this issue, we are experimenting with lighting to find an ideal setup. Currently we are using a flexible lamp that clamps onto the desk that the drum pads are laid on, but it only properly illuminates half of the desk. Thus, we put in an order for another lamp so that both halves of the desk can be properly lit, which should make the lighting more consistent.

As per our Gantt chart, we are supposed to be configuring accelerometer thresholds and integrating all of our code at the moment, so we are on track. Next week, I plan to look into other object/color tracking methods such as CamShift, background subtraction, or even YOLO/SSD models in case the lighting situation becomes overly complicated. I would also like to work on fine-tuning the accelerometer threshold values, as we are currently just holding the accelerometer and making a hit-like motion rather than strapping it to a drumstick and hitting the table.

 

Belle’s Status Report for 11/2

This week, I mainly focused on cleaning up the code that I wrote last week.

Essentially, the code’s purpose is to make a location prediction for each frame from the camera/video feed (0-3 if in range of a corresponding drum, and -1 otherwise) and store it in order in a buffer with a fixed capacity of 20. I demoed this portion of the code with the sample moving-red-dot video I made a couple of weeks ago, and it appeared to work fine, with minimal impact on the overall frame-by-frame computer vision latency (it remained at ~1.4ms). Given that the prediction function has a worst-case O(1) time (and space) complexity, this was expected.

However, the issue lies with the function that calculates the moving average of the buffer. 

As mentioned in my previous post, the drumstick tip location result for each frame is initially put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer. Then, the aforementioned function calculates the exponentially weighted moving average of the most recent 20 camera/video frames. 

However, during this calculation the buffer is still being modified continuously since it is a global variable, so the most recent frames could very likely change mid-function and potentially skew the result. Therefore, it would be best to protect this buffer somehow, using either a mutex or a copy. Though a lock/mutex is one of the more intuitive options, it would likely not work for our purposes: as previously mentioned, we still need to modify the buffer to keep it updated for consecutive drum hits/accelerometer spikes, which we could not do while the moving-average function holds the lock. There is also the option of combining a boolean flag with a second buffer, so that we read from one and write to the other depending on whether the moving average is being calculated. However, I feel that this needlessly complicates the process, and it would be simpler to instead make a copy of the buffer inside the function and read from that copy.
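Concretely, the copy-based version would look roughly like the snippet below (the decay factor alpha and the function name are placeholders, not our exact values):

    alpha = 0.3                     # placeholder decay factor
    bufSize = 20
    buffer = [-1] * bufSize         # shared with the capture threads
    bufIndex = 0

    def weighted_location_snapshot():
        # Snapshot the shared state; the capture threads can keep writing
        # to `buffer` while we read from the copy, so no mutex is needed.
        buf_copy = list(buffer)
        start = bufIndex
        ewma = 0.0
        i = (start + 1) % bufSize   # least recent frame in the snapshot
        for _ in range(bufSize):
            ewma = alpha * buf_copy[i] + (1 - alpha) * ewma
            i = (i + 1) % bufSize
        return round(ewma)          # nearest drum index, or -1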

Since the computer vision code is somewhat finished, I believe we are on track. Next week, since we just got the camera, I hope to actually begin testing my code with the drumsticks and determine actual HSV color ranges for detecting the drumstick tips.

Team Status Report for 10/26

This week, our team mainly focused on solidifying and integrating our code.

  • Currently, the most significant risks we are facing are persistent issues concerning Bluetooth and audio latency:

Currently, we are trying to determine a reliable estimate of the one-way Bluetooth latency, which would help us massively in determining how much leeway we have with the other components (CV processing, audio generation, etc.). This is being done by first sending data to the laptop using an update from the ESP, then sending an update back using a handler in the host code. The one-way latency would then be half of the elapsed time from start to finish. However, this process is not as simple as it sounds in practice, as having a shared buffer accessed by both the server/host and the client introduces issues with latency and concurrency. This issue is being managed, however, as we still have time blocked out in our Gantt chart to work on data transmission. In a worst-case scenario, we would have to rely on direct/physical wiring rather than Bluetooth, but we believe this will not be necessary and that we just need a bit more time to adjust our approach.
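As an illustration, the laptop side of that round trip could be structured roughly as below. This sketch assumes a bleak-based host and flips the direction (the laptop initiates the ping); the device address and characteristic UUIDs are placeholders.

    import asyncio
    import time
    from bleak import BleakClient

    ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                   # placeholder
    TX_UUID = "0000ffe1-0000-1000-8000-00805f9b34fb"    # placeholder
    RX_UUID = "0000ffe2-0000-1000-8000-00805f9b34fb"    # placeholder

    async def measure_rtt():
        reply = asyncio.Event()

        def on_notify(_sender, _data):
            # ESP echoes our ping back via its notify characteristic.
            reply.set()

        async with BleakClient(ESP_ADDRESS) as client:
            await client.start_notify(RX_UUID, on_notify)
            t0 = time.perf_counter()
            await client.write_gatt_char(TX_UUID, b"ping")
            await reply.wait()
            rtt_ms = (time.perf_counter() - t0) * 1000
            print(f"RTT {rtt_ms:.3f} ms, one-way approx {rtt_ms / 2:.3f} ms")

    asyncio.run(measure_rtt())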

Audio latency is also proving to be a slight issue, as we are having trouble with overlapping sounds. In theory, each drumstick’s hit on a drum pad should generate a sound immediately, rather than waiting for another sound to finish. However, we are currently seeing the opposite: drum sounds wait for one another to finish, despite a thread being spawned for each. If not fixed, this could introduce considerable latency into the response of our product. However, this is a relatively new issue, so we strongly believe it can be fixed within a short amount of time once we reason through its cause.
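For reference, one structure that normally lets sounds overlap — each hit thread opening its own short-lived output stream — is sketched below; this is a hypothetical starting point for the debugging, not our current code, and the file name is a placeholder.

    import threading
    import wave
    import pyaudio

    pa = pyaudio.PyAudio()

    def play(path):
        # Each call gets its own wave reader and its own output stream,
        # so no thread ever waits on another stream finishing.
        wf = wave.open(path, "rb")
        stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                         channels=wf.getnchannels(),
                         rate=wf.getframerate(),
                         output=True)
        data = wf.readframes(1024)
        while data:
            stream.write(data)
            data = wf.readframes(1024)
        stream.stop_stream()
        stream.close()
        wf.close()

    def on_hit(path="snare.wav"):            # placeholder file name
        threading.Thread(target=play, args=(path,), daemon=True).start()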

  • No changes were made to the existing design of our product. At the moment, we are mainly focused on trying to create solid implementations of each component in order to integrate & properly test them as soon as possible.
  • We have also not made any changes to our schedule, and are mostly on track.

 

Belle’s Status Report for 10/26

This week, I mainly worked on implementing the actual drumstick detection code and integrating it with the current code that Ben wrote.

The code Ben wrote calculates the x-value, y-value, and radius of each drum ring at the beginning of the program and stores them in a list so that they don’t have to be recalculated later. I then pass this list into a function that calls another short function, which calculates the exponentially weighted moving average of the most recent 20 video frames.
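Roughly, that function works like the sketch below (the decay factor alpha is a placeholder value; the traversal order is explained in the following paragraphs):

    alpha = 0.3                      # placeholder decay factor
    bufSize = 20
    buffer = [-1] * bufSize          # last 20 per-frame drum predictions
    bufIndex = 0                     # next slot to overwrite

    def weighted_location():
        # Walk the ring from the least recent frame to the most recent,
        # so newer frames end up with exponentially more weight.
        ewma = 0.0
        i = (bufIndex + 1) % bufSize
        for _ in range(bufSize):
            ewma = alpha * buffer[i] + (1 - alpha) * ewma
            i = (i + 1) % bufSize
        return round(ewma)           # nearest drum index, or -1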

It may seem strange, but accessing the buffer in this way is optimal because of the way I am putting data into it:
Currently, with every video frame read by OpenCV’s video capture module, I first determine which drumstick’s thread is accessing the function using threading.current_thread().name, which tells me whether to apply a red or green mask to the image (the threads are named “red” and “green” when spawned). I then use findContours() to acquire the detected x and y location values of that drumstick. Afterwards, I pass these x and y values to another function that uses the aforementioned drum ring location list to determine which bounding circle the drumstick tip is in. This returns a number between 0 and 3 (inclusive) if the tip is detected within the bounds of the corresponding drum, and -1 otherwise. Finally, this number is put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer.
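Condensed, that per-frame step looks something like the function below (the HSV bounds here are placeholder values, and rings is the precomputed (x, y, radius) list from Ben’s code):

    import cv2
    import numpy as np

    RED_LOWER = np.array([0, 120, 120])     # placeholder HSV bounds
    RED_UPPER = np.array([10, 255, 255])

    def locate_tip(frame, rings, lower=RED_LOWER, upper=RED_UPPER):
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, lower, upper)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return -1
        c = max(contours, key=cv2.contourArea)          # largest blob = the tip
        (x, y), _ = cv2.minEnclosingCircle(c)
        for i, (cx, cy, r) in enumerate(rings):         # which drum pad is it over?
            if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:
                return i
        return -1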

As a result, it is highly possible for the ring detection yielded by a more recent video frame to be put at a lower index than older ones. Thus, we start at the current value of (bufIndex + 1) % bufSize (which should be the least recent frame) and loop around the buffer in order, applying the formula as we go.

I am also using this approach because I am trying to figure out whether there is any significant difference between calculating the drum location of each video frame as it is read and then putting that value into the buffer, versus putting each frame into the buffer as it is read and determining the drum location afterwards. I have both situations implemented, and plan to test how long each takes in the near future in order to reduce latency as much as possible.

Currently, based on our Gantt chart, we should be finishing the general drum ring and ring-entry detection code, as well as starting to determine the accelerometer “spike” threshold values. Therefore, I believe I am somewhat on track, since we could (hypothetically) plug the actual red/green drumstick tip color range values into the code I wrote, connect the webcam, and detect whether a stick is in bounds of a drum ring. I do have to put more time into testing for accelerometer spike values, however, as soon as we are able to transmit that data.

Next week, I plan to start testing with the accelerometer to determine what an acceptable “threshold value” is for a spike/hit. This would be done by hitting the air, a drum pad, or the table with a machined drumstick and observing how the output data changes over time.

Belle’s Status Report for 10/19

This past week, I mainly wanted to start drafting a bit more Computer Vision code that could potentially be used in our final implementation. At the time, we had not yet received our accelerometers and microcontrollers in the mail, so I wanted to figure out how to use PyAudio (one of the methods we are considering for playing the drum sounds) by creating simple spacebar-triggered sounds from a short input .wav file. I figured that this exercise would also help us get an idea of how much latency audio playback could introduce.

I ended up adding code to the existing red dot detection code that first opens the .wav sound file (using the wave module) and a corresponding audio stream (using PyAudio). Then, upon pressing the spacebar, a thread is spawned to create a new wave file object and play the opened sound using those same modules, along with threading.

Though simple, writing this code was a good exercise for getting a decent feel for how PyAudio works, as well as for what buffer size we should use when writing to the audio stream (i.e., when the input sound is being played). During our recent meeting with Professor Bain and Tjun Jet, we discussed possibly wanting a small buffer size of 128 or so instead of the usual ~4096 to reduce latency. However, I found that the ~256-512 range is more of a sweet spot for us, as a very small buffer gets refilled more often (which can introduce latency of its own, especially with larger audio files). I also found that when the spacebar was mashed quickly (simulating multiple quick, consecutive drum hits), the audio lagged a bit towards the end, despite a new thread being spawned for every spacebar press. I suspect this was due to the aforementioned small buffer, as increasing the buffer size seemed to remedy the issue.
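The comparison itself can be done with a small timing loop like the one below (the chunk sizes mirror the values discussed above; the file name is a placeholder):

    import time
    import wave
    import pyaudio

    def time_playback(path, chunk):
        # Time how long it takes to push the whole file through a stream
        # opened with the given frames_per_buffer value.
        pa = pyaudio.PyAudio()
        wf = wave.open(path, "rb")
        stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                         channels=wf.getnchannels(),
                         rate=wf.getframerate(),
                         output=True,
                         frames_per_buffer=chunk)
        t0 = time.perf_counter()
        data = wf.readframes(chunk)
        while data:
            stream.write(data)
            data = wf.readframes(chunk)
        elapsed = time.perf_counter() - t0
        stream.stop_stream()
        stream.close()
        wf.close()
        pa.terminate()
        return elapsed

    for chunk in (128, 256, 512, 4096):
        print(chunk, f"{time_playback('kick.wav', chunk):.3f}s")   # placeholder file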

Our Gantt chart indicates that we should still be working on CV code this week, so I believe I am on schedule. This week, I hope to work with the ESP32s and accelerometers to get a feel for how they output data in order to determine how to process it. Setting up the ESPs is not necessarily a straightforward task, however, so I plan to work with Elliot to get them working (which would be confirmed by at least having some sort of communication between them and the laptop over Bluetooth). From there, if we are able to transmit accelerometer data, I would like to graph it and determine a range for the ‘spike’ generated by making a ‘hit’ motion with the drumsticks.

Belle’s Status Report for 10/5

This week, most of my time was taken up by a major deadline in another class, so my main task was working on the Design Trade Studies and Testing/Verification sections of our draft Design Report. I spent a couple of hours looking into key components of our project (such as the microcontroller, the proposed Computer Vision implementation, and the camera type), comparing them against other potential candidates to ensure that our choices were optimal, and putting those differences into words.

I was also able to modify the CV dot detection code from last week to determine how long it takes to process one frame of the sample video input. This yielded a consistent processing time of ~1.3-1.5ms per frame, which allows us to determine how many frames can be processed once an accelerometer spike is read (while staying below our CV latency limit of 60ms).
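For reference, the measurement itself is just a perf_counter wrapper around the per-frame work, roughly like this (process_frame() below is a stand-in for the dot-detection step, and the video file name is a placeholder):

    import time
    import cv2

    def process_frame(frame):
        # Placeholder for the dot-detection work being timed.
        pass

    cap = cv2.VideoCapture("red_dot_sample.mp4")      # placeholder file name
    per_frame_ms = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        t0 = time.perf_counter()
        process_frame(frame)
        per_frame_ms.append((time.perf_counter() - t0) * 1000)
    cap.release()
    print(f"avg {sum(per_frame_ms) / len(per_frame_ms):.2f} ms over {len(per_frame_ms)} frames")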

Since our Gantt chart still has us working on CV code this week, I believe the rest of the team and I are still on schedule. This coming week, I plan to finalize my parts of the design report and start feeding real-time camera input into the current CV code – if a camera is delivered this week. If not, I would like to feed in accelerometer data to determine the minimum threshold of a “hit,” and start thinking about how to incorporate that into the code.