Belle’s Status Report for 12/7

This week was mainly spent working on our final poster, as well as helping to integrate Elliot's changes to the Bluetooth code into the current codebase (my final presentation on Monday went quite well, but I claimed multiple times during it that Elliot's changes would fix most, if not all, of our dual-stick lag issues, so following through on that integration was a priority). I also created a couple of graphics for the poster and cleaned it up overall, referencing posters that did well in previous years so that our upcoming presentations can go as smoothly as possible.

Since we are still supposed to be integrating all of our components and making final changes/tweaks to our system, I believe we are on track. Next week, I hope to complete the final poster and video, and to practice my presentation for both the TechSpark expo and the 18500 showcase.

Belle’s Status Report for 11/30

This week, I focused on preparing the slides for the final presentation – incorporating the results from our verification and validation tests – and contributed to the drumstick detection portion of our project.

The former involved organizing and presenting the data in a way that highlights how our project meets its use case and design requirements, as well as practicing the general flow of how I would present the relevant tests (since there are many of them and not much time is allotted for each presentation, I have to be concise).

As for the drumstick detection, one key aspect of our design was the use of exponential weighting to account for latency when the video frame taken at the moment of an accelerometer impact did not reflect the correct position of the drumstick tip (i.e., it would show the drumstick tip as being in the previous drum’s boundary, rather than the drum that was actually hit). This was particularly a concern because of the potential delay between the moment of impact and the processing of the frame, as we were not sure what said latency would look like.

However, during further testing, we found that this issue was quite rare. The camera’s FPS was sufficiently high, and the CV processing latency was small enough that frames typically matched up with the correct impact timing. As a result, we found that exponential weighting was unnecessary for most scenarios. Additionally, the mutexes required to protect the buffer used for the calculation were introducing unnecessary and unwanted latency. In order to simplify the system and improve overall responsiveness, we scrapped the buffer and exponential weighting completely, which led to a noticeable reduction in latency and slightly smoother performance in general.

Previously, we also found a way to let the user tweak the HSV values themselves using several sliders and a visualizer, and we changed one of the drumstick tips from blue to red, so the relevant detection issues have been solved. As a result, I feel that the drumstick detection portion of the project is mostly done.
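For reference, the slider/visualizer idea boils down to a handful of OpenCV trackbars feeding cv2.inRange(); the window name, slider names, and starting values in this sketch are illustrative rather than our exact tuning tool:

import cv2
import numpy as np

def nothing(_):
    pass

cv2.namedWindow("hsv_tuner")
# One slider per HSV bound (OpenCV hue runs 0-179; saturation and value run 0-255).
for name, maxval, init in [("H_lo", 179, 0), ("H_hi", 179, 10),
                           ("S_lo", 255, 120), ("S_hi", 255, 255),
                           ("V_lo", 255, 120), ("V_hi", 255, 255)]:
    cv2.createTrackbar(name, "hsv_tuner", init, maxval, nothing)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lo = np.array([cv2.getTrackbarPos(n, "hsv_tuner") for n in ("H_lo", "S_lo", "V_lo")])
    hi = np.array([cv2.getTrackbarPos(n, "hsv_tuner") for n in ("H_hi", "S_hi", "V_hi")])
    mask = cv2.inRange(hsv, lo, hi)
    # Show the frame and the mask side by side so the user can see what the current range picks up.
    cv2.imshow("hsv_tuner", np.hstack([frame, cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)]))
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()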

According to our Gantt chart, I should still be working with Elliot and Ben to integrate our individual components, so I believe I am on track. Next steps include finalizing preparations for the presentation and continuing to troubleshoot the Bluetooth latency discrepancy between the drumsticks.

Belle’s Status Report for 11/16

This week, I mostly worked with Ben and Elliot to continue integrating & fine-tuning various components of DrumLite to prepare for the Interim Demo happening this upcoming week.

My main contribution was fine-tuning the accelerometer readings. To refine our accelerometer threshold values, we used Matplotlib to continuously plot accelerometer data in real time during testing. In these plots, the x-value represented time and the y-value represented the average of the x and z components of the accelerometer output. This visualization helped us identify a distinct pattern: each drumstick hit produced a noticeable upward spike, followed by a downward spike, in the accelerometer readings (as seen in the sample output screenshot below, which was captured after hitting a machined drumstick on a drum pad four times).
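The live plot itself is just a small Matplotlib loop along these lines; read_sample() is a hypothetical stand-in for the Bluetooth read from the ESP32, using random values here so the sketch runs on its own:

import random
import time
import matplotlib.pyplot as plt

def read_sample():
    # Stand-in for the real Bluetooth read from the ESP32: returns fake (x, y, z) values.
    return (random.uniform(-2, 2), random.uniform(-2, 2), random.uniform(-2, 2))

times, values = [], []
plt.ion()                              # interactive mode so the plot updates live
fig, ax = plt.subplots()
line, = ax.plot([], [])
ax.set_xlabel("time (s)")
ax.set_ylabel("avg of x and z accel components")

t0 = time.time()
for _ in range(500):
    x, y, z = read_sample()
    times.append(time.time() - t0)
    values.append((x + z) / 2)         # the quantity we watch for spikes
    line.set_data(times, values)
    ax.relim()
    ax.autoscale_view()
    plt.pause(0.01)                    # brief pause lets Matplotlib redraw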

Initially, we attempted to detect these hits by capturing the “high” value, followed by the “low” value. However, upon further analysis, we determined that simply calculating the difference between the two values would be sufficient for reliable detection. To implement this, we introduced a short delay of 1ms between samples, which allowed us to consistently measure the low-high difference. Additionally, we decided to incorporate the sign of the z-component of the accelerometer output rather than taking its absolute value. This helped us better account for behaviors such as upward flicks of the wrist, which were sometimes mistakenly identified as downward drumstick hits (and were therefore incorrectly triggering a drum sound to be played). As a result, we were able to filter out similar movements that weren’t downward drumstick strikes onto a drum pad or another solid surface, further refining the precision and reliability of our hit detection logic.
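In rough outline, the check looks like the sketch below; the threshold value, the sign convention for a downward strike, and the read_sample() helper are illustrative assumptions rather than our exact tuned code:

import time

HIT_DIFF_THRESHOLD = 1.5    # illustrative value; the real threshold was tuned empirically

def is_hit(read_sample):
    # read_sample() should return the latest (x, y, z) accelerometer reading.
    x1, _, z1 = read_sample()
    time.sleep(0.001)                    # ~1 ms between the two samples
    x2, _, z2 = read_sample()
    first = (x1 + z1) / 2
    second = (x2 + z2) / 2
    # Keeping the sign of z (rather than its absolute value) lets us reject
    # upward wrist flicks; the direction of this comparison is an assumption here.
    swinging_down = z2 < z1
    return swinging_down and abs(first - second) > HIT_DIFF_THRESHOLD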

To address lighting inconsistencies from previous tests, we acquired another lamp, ensuring the testing desk is now fully illuminated. This adjustment should significantly improve the consistency of our drumstick tip detection by reducing the impact of shadows and uneven lighting. While we are still in the process of testing this two-lamp setup, I currently believe using a YOLO/SSD model for object detection is unnecessary. Those models are great for complex environments with many objects, but our setup is simple, with (mostly) controlled lighting and only two objects to track. Implementing a YOLO/SSD model would also introduce significant computational overhead, which we aim to avoid given our sub-100ms-latency use case requirement. Therefore, I would prefer for this to remain a last-resort solution to the lighting issue.

As per our timeline, we should be fine-tuning and integrating the different project components, and since we are essentially done setting the accelerometer threshold values, we are on track. Currently, picking a specific HSV value for each drumstick tip is a bit cumbersome and unpredictable, especially in areas with a large amount of ambient lighting. Therefore, next week, I aim to further test drumstick tip detection under varying lighting conditions and to simplify that process, as I believe it is the least solid aspect of our implementation at the moment.

Belle’s Status Report for 11/9

This week, we mainly focused on integrating the different components of our project to prepare for the Interim Demo, which is coming up soon.

We first successfully integrated Elliot’s Bluetooth/accelerometer code into the main code. We confirmed this by triggering an audio response (a drum sound) with a hit motion made while holding the accelerometer and ESP32.

We then aimed to integrate my drumstick tip detection code, which was a bit more of a challenge. The main issue was picking the correct HSV/RGB color values with respect to lighting and the shape of the drumstick tip. We positioned the drumstick tip (which we colored bright red) on the desk, in view of the webcam, and took a screenshot of the output. I then took this image and used an HSV color picker website to get HSV values for specific pixels in the screenshot. However, because of the tip’s rounded, oval-like shape, we have to consider multiple shadow, highlight, and mid-tone values. Picking a pixel that was too light or too dark would cause the drumstick tip to only be “seen” sometimes, or cause too many things to be “seen”. For example, sometimes the red undertones in my skin would be tracked along with the drumstick tip, or the tip would only be visible in the more brightly lit areas of the table.
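One way to cope with those shadow and highlight variations is to widen the picked pixel’s HSV value into a range before masking; in this sketch the filename, pixel location, and tolerances are illustrative guesses rather than our final values:

import cv2
import numpy as np

img = cv2.imread("tip_screenshot.png")           # hypothetical screenshot filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = (int(c) for c in hsv[240, 320])        # sample a pixel near the tip's center

# Widen the sampled value into a range so shadows and highlights still match.
lower = np.array([max(h - 10, 0), max(s - 60, 0), max(v - 60, 0)])
upper = np.array([min(h + 10, 179), min(s + 60, 255), min(v + 60, 255)])
mask = cv2.inRange(hsv, lower, upper)
print("pixels matched:", cv2.countNonZero(mask))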

In order to remedy this issue, we are experimenting with lighting to find an ideal setup. Currently, we are using a flexible lamp that clamps onto the desk the drum pads sit on, but it only properly illuminates half of the desk. Thus, we put in an order for another lamp so that both halves of the desk can be properly lit, which should make the lighting more consistent.

As per our Gantt chart, we are supposed to be configuring accelerometer thresholds and integrating all of our code at the moment, so we are on track. Next week, I plan to look into other object/color tracking approaches such as CamShift, background subtraction, or even YOLO/SSD models in case the lighting situation becomes overly complicated. I would also like to work on fine-tuning the accelerometer threshold values, as we are currently just holding the accelerometer and making a hit-like motion rather than strapping it to a drumstick and hitting the table.

 

Belle’s Status Report for 11/2

This week, I mainly focused on cleaning up the code that I wrote last week.

Essentially, this code makes a location prediction for each frame from the camera/video feed (0-3 if the drumstick tip is within range of a corresponding drum, and -1 otherwise) and stores it in order in a buffer with a fixed capacity of 20. I demoed this portion of the code with the sample moving red dot video I made a couple of weeks ago, and it appeared to work fine, with minimal impact on the overall frame-by-frame computer vision latency (it remained at ~1.4ms). Given that the prediction function has worst-case O(1) time (and space) complexity, this was expected.

However, the issue lies with the function that calculates the moving average of the buffer. 

As mentioned in my previous post, the drumstick tip location result for each frame is initially put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer. Then, the aforementioned function calculates the exponentially weighted moving average of the most recent 20 camera/video frames. 

However, during this calculation the buffer is still being modified continuously by the capture threads, so the most recent entries could easily change mid-function and skew the result. Therefore, it would be best to protect this buffer somehow, using either a mutex or a copy. Though a lock/mutex is the more intuitive option, it would likely not work for our purposes: as previously mentioned, the buffer still needs to be updated for subsequent drum hits/accelerometer spikes, which could not happen while the moving average function holds the lock. There is also the option of combining boolean flags with a second, external buffer so that we read from one and write to the other depending on whether the moving average is being calculated. However, I feel that this needlessly complicates the process, and it would be simpler to make a copy of the buffer inside the function and read from that.
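The contrast between the two options is easiest to see in code; this is only a sketch (a plain average stands in for the exponentially weighted one, and the names are illustrative):

import threading

bufSize = 20
buf = [-1] * bufSize          # circular buffer of per-frame drum predictions (0-3 or -1)
buf_lock = threading.Lock()

def average_of(values):
    # Plain average as a stand-in for the exponentially weighted calculation.
    hits = [v for v in values if v != -1]
    return sum(hits) / len(hits) if hits else -1

# Option considered: hold a lock for the whole calculation. The capture threads
# would then block on every new frame until the average finishes.
def moving_average_locked():
    with buf_lock:
        return average_of(buf)

# Preferred option: snapshot the (small) buffer and compute on the copy, so the
# capture threads can keep writing while we read a consistent view.
def moving_average_copied():
    return average_of(list(buf))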

Since the computer vision code is mostly finished, I believe we are on track. Next week, since we just got the camera, I hope to actually begin testing my code with the drumsticks and determine real HSV color ranges for detecting the drumstick tips.

Belle’s Status Report for 10/26

This week, I mainly worked on implementing the actual drumstick detection code and integrating it with the current code that Ben wrote.

The code Ben wrote calculates the x-value, y-value, and radius of each drum ring at the beginning of the program and stores them in a list so that they don’t have to be recalculated later. I then pass this list into a function that calls another short function, which calculates the exponentially weighted moving average of the most recent 20 video frames.
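In rough outline, that calculation looks something like the sketch below; the decay factor and the variable names are simplified stand-ins rather than the exact code:

bufSize = 20
buf = [-1] * bufSize    # per-frame drum predictions: 0-3, or -1 when no drum is detected
bufIndex = 0            # index of the most recently written entry

def exp_weighted_average(alpha=0.8):
    start = (bufIndex + 1) % bufSize     # the oldest entry sits right after the write index
    total = weight_sum = 0.0
    weight = 1.0
    for i in range(bufSize):
        value = buf[(start + i) % bufSize]
        if value != -1:                  # skip frames where no drum was seen
            total += weight * value
            weight_sum += weight
        weight /= alpha                  # alpha < 1, so newer frames weigh exponentially more
    return total / weight_sum if weight_sum else -1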

It may seem strange, but accessing the buffer in this way is optimal because of the way I am putting data into it:
Currently, with every video frame read by OpenCV’s video capture module, I first determine which drumstick’s thread is calling the function using threading.current_thread().name (the threads are named “red” and “green” when spawned) to decide whether to apply a red or green mask to the image. I then use findContours() to acquire the detected x and y location of that drumstick’s tip. Afterwards, I pass these x and y values to another function that uses the aforementioned drum ring location list to determine which bounding circle the drumstick tip is in. This returns a number between 0 and 3 (inclusive) if the tip is within the bounds of the corresponding drum, and -1 otherwise. Finally, this number is put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer.
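The ring check and buffer write are short; reusing the buf/bufSize/bufIndex globals from the sketch above, they look roughly like this (the function names and the (x, y, radius) tuple layout are illustrative):

import math

def locate_drum(x, y, drum_rings):
    # drum_rings is the precomputed list of (center_x, center_y, radius) tuples.
    for i, (cx, cy, r) in enumerate(drum_rings):
        if math.hypot(x - cx, y - cy) <= r:
            return i                     # tip is inside drum i's bounding circle
    return -1                            # tip is not over any drum

def record_frame(x, y, drum_rings):
    global bufIndex
    buf[bufIndex] = locate_drum(x, y, drum_rings)
    bufIndex = (bufIndex + 1) % bufSize  # keep the buffer circular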

As a result, it is entirely possible for the ring detection from a more recent video frame to be placed at a lower index than older ones. Thus, we start at the current value of (bufIndex + 1) % bufSize (which should be the oldest frame in the buffer) and loop around the buffer in order to apply the weighting formula.

I am also using this approach because I am trying to figure out whether there is any significant difference between calculating the drum location of each video frame as it is read and then putting that value into our buffer, versus putting each frame into the buffer as it is read and determining the drum location afterwards. I have both versions implemented and plan to time them in the near future in order to reduce latency as much as possible.

Currently, based on our Gantt chart, we should be finishing the general drum ring/entry-into-drum-ring detection code, as well as starting to determine the accelerometer “spike” threshold values. Therefore, I believe I am roughly on track, since we could (hypothetically) plug the actual red/green drumstick tip color range values into the code I wrote, connect the webcam, and detect whether a stick is within the bounds of a drum ring. I do have to put more time into testing accelerometer spike values, however, as soon as we are able to transmit that data.

Next week, I plan to start testing with the accelerometer to determine what an acceptable “threshold value” is for a spike/hit. This would be done by hitting the air, a drum pad, and/or the table with a machined drumstick and observing how the output data changes over time.

Belle’s Status Report for 10/19

This past week, I mainly wanted to start drafting a bit more Computer Vision code that could potentially be used in our final implementation. At the time, we had not yet received our accelerometers and microcontrollers in the mail, so I wanted to figure out how to use PyAudio (one of the methods we are considering for playing the drum sound) by creating simple spacebar-triggered sounds from a short input .wav file. I figured this exercise would also give us an idea of how much latency audio playback could introduce.

I ended up adding code to the existing red dot detection code that first opens the .wav sound file (using the wave module) and a corresponding audio stream (using pyaudio). Then, upon pressing the spacebar, a thread is spawned that creates a new wave file object and plays the opened sound using the previously mentioned modules, along with threading.

Though simple, writing this code was a good exercise for getting a feel for how PyAudio works, as well as for what buffer size we should use when writing to the audio stream (i.e., when the input sound is being played). During our recent meeting with Professor Bain and Tjun Jet, we discussed possibly wanting a small buffer size of 128 bytes or so instead of the usual ~4096 to reduce latency. However, I found that the 256-512 range is more of a sweet spot for us, as a very small buffer has to be refilled more often (which can itself introduce latency, especially with larger audio files). I also found that when the spacebar was mashed quickly (simulating multiple quick, consecutive drum hits), the audio lagged a bit towards the end, despite a new thread being spawned for every spacebar press. I suspect this was due to the small buffer, as increasing the buffer size seemed to remedy the issue.
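The core of the experiment looks roughly like the sketch below; the chunk size reflects the 256-512 sweet spot mentioned above, the filename is hypothetical, and the Enter-key loop stands in for the actual spacebar detection:

import threading
import wave
import pyaudio

CHUNK = 512                       # within the ~256-512 sweet spot discussed above
WAV_PATH = "snare.wav"            # hypothetical sample file

pa = pyaudio.PyAudio()

def play_once():
    # Each trigger opens its own wave reader so overlapping hits don't share a read position.
    wf = wave.open(WAV_PATH, "rb")
    stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                     channels=wf.getnchannels(),
                     rate=wf.getframerate(),
                     output=True,
                     frames_per_buffer=CHUNK)
    data = wf.readframes(CHUNK)
    while data:
        stream.write(data)
        data = wf.readframes(CHUNK)
    stream.stop_stream()
    stream.close()
    wf.close()

while True:
    if input("press Enter to 'hit' (q + Enter to quit): ").strip() == "q":
        break
    threading.Thread(target=play_once, daemon=True).start()

pa.terminate()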

Our Gantt chart indicates that we should still be working on CV code this week, so I believe I am on schedule. This week, I hope to work with the ESP32s and accelerometers to get a feel for how they output data and to determine how to process it. Setting up the ESPs is not necessarily a straightforward task, however, so I plan to work with Elliot to get them working (which would be confirmed by at least having some sort of communication between them and the laptop over Bluetooth). From there, if we are able to transmit accelerometer data, I would like to graph it and determine a range for the ‘spike’ generated by making a ‘hit’ motion with the drumsticks.

Belle’s Status Report for 10/5

This week, most of my time was taken up by a major deadline in another class, so my main task was working on the Design Trade Studies and Testing/Verification sections of our draft Design Report. I spent a couple of hours looking into key components of our project (such as the microcontroller, the proposed Computer Vision implementation, and the camera type), comparing them against other potential candidates to ensure that our choices were optimal, and putting these differences into words.

I was also able to modify the CV dot detection code from last week to determine how long it takes to process one frame of the sample video input. This yielded a consistent processing time of ~1.3-1.5ms per frame, which allows us to determine how many frames can be processed once an accelerometer spike is read (while staying below our CV latency limit of 60ms).
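For reference, the timing measurement can be as simple as wrapping the per-frame processing in a perf_counter pair; the filename and the stand-in process_frame() here are illustrative:

import time
import cv2

def process_frame(frame):
    # Stand-in for the actual dot-detection step; the timing harness is the point here.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, (0, 120, 120), (10, 255, 255))

cap = cv2.VideoCapture("moving_dot.mp4")        # hypothetical name for the sample video
per_frame_ms = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    t0 = time.perf_counter()
    process_frame(frame)
    per_frame_ms.append((time.perf_counter() - t0) * 1000)
cap.release()
print(f"average processing time: {sum(per_frame_ms) / len(per_frame_ms):.2f} ms/frame")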

Since our Gantt chart still has us working on CV code this week, I believe the rest of the team and I are still on schedule. This coming week, I plan to finalize my parts of the design report and start feeding real-time camera input into the current CV code, if a camera is delivered this week. If not, I would like to feed in accelerometer data to determine the minimum threshold of a “hit” and start thinking about how to incorporate that into the code.

Belle’s Status Report for 9/28

This past week, my time was mainly spent on creating a test to mimic a simplified version of our project. In MATLAB, I made a short video of a small red dot moving in a somewhat-square path over 4 colored rings (a still frame of the video is shown below, as I am not sure how to upload a gif here).

still frame of the moving dot animation

 

This is supposed to vaguely emulate the behavior of the tip of a drumstick (which we plan to paint red or some other bright color) moving over the drum rings. It is not exact, but the main goal was just to make the dot move around so that I could figure out how to detect it using CV later on. I also made the proportions approximately equal to those of the real drum rings we will be using.

Then, in VSCode, I wrote a short program using HoughCircles and other NumPy and OpenCV functions to read in and process the video, then output a version in which the red dot is detected in every frame. The detection is indicated by drawing a small neon-blue dot over the targeted red one. One can also pause the video by pressing the spacebar to step through and analyze a given frame, or press ‘q’ to close the output window.
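Stripped down, the program follows the pattern below; the video filename and the Hough parameters are illustrative placeholders rather than the exact values I used:

import cv2
import numpy as np

cap = cv2.VideoCapture("moving_dot.mp4")     # hypothetical name for the MATLAB test video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    red_mask = cv2.inRange(hsv, (0, 120, 120), (10, 255, 255))
    circles = cv2.HoughCircles(red_mask, cv2.HOUGH_GRADIENT, dp=1.2, minDist=50,
                               param1=100, param2=10, minRadius=3, maxRadius=30)
    if circles is not None:
        x, y, r = (int(v) for v in circles[0, 0])
        cv2.circle(frame, (x, y), 4, (255, 0, 0), -1)   # neon-blue marker over the detected dot
    cv2.imshow("detection", frame)
    key = cv2.waitKey(30) & 0xFF
    if key == ord(" "):           # spacebar pauses until another key is pressed
        cv2.waitKey(0)
    elif key == ord("q"):         # q closes the output window
        break
cap.release()
cv2.destroyAllWindows()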

Since the main task for this past week was to work on the computer vision code to detect rings, I would say that I am on track.

In the next week, I would like to measure how long it takes for the red dot to actually be detected in each frame, which will give us a better idea about what latency ranges we can expect when processing the live video feed from the camera in the real-world implementation. I also want to get started on the sliding window that will house a preset number of the most recent frames from the live video feed. Eventually, locating the drumstick tip in each of these frames will help determine which drum sound to make when an accelerometer spike is detected (by making a hit-like motion with the drumsticks).

 

Belle’s Status Report for 9/21

This past week, I discussed a few components of our design with Professor Tamal Mukherjee, mainly how we plan to mount the camera that will have a top-down view of the drum rings and thus acquire the data needed for CV processing. I also began to look at the pinout of the ESP32 microcontroller to determine which registers would be most relevant when interfacing with the MPU 6050 accelerometer, and found a few relevant OpenCV libraries and pieces of documentation that could be useful for the aforementioned processing. We did not have too much planned out for last week on our Gantt chart besides starting to research and potentially implement Computer Vision code, so I believe we are on schedule.

To remain on schedule this upcoming week, I plan to put more time into narrowing down which OpenCV libraries are most relevant. I will also begin writing code to experiment with specific color and shape detection functions, and upload it to the group repository. This can potentially be accomplished by generating images of my own with varying levels of noise (to simulate potentially blurry frames from the webcam) and ring sizes, and trying to detect those rings as well as filter out particular colors. I hope that this process will help us determine color ranges for detecting the rings and drumstick tips from the camera’s video frames, as we want to avoid having different lighting conditions affect the functionality of our project. For example, since the drumstick tips are roughly spherical, the light and shadow cast on their edges and highest points will have different color values than the paint color itself.
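One possible way to generate those test images is sketched below; the ring sizes, noise levels, and filenames are arbitrary choices for illustration:

import cv2
import numpy as np

def make_test_image(radius, noise_sigma, size=480):
    # Draw one ring on a plain background, then add Gaussian noise to mimic a blurry/noisy frame.
    img = np.full((size, size, 3), 255, dtype=np.uint8)
    cv2.circle(img, (size // 2, size // 2), radius, (0, 0, 255), thickness=6)
    noise = np.random.normal(0, noise_sigma, img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

for r in (40, 80, 120):
    for sigma in (0, 10, 25):
        cv2.imwrite(f"ring_r{r}_n{sigma}.png", make_test_image(r, sigma))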