Team Status Report for 10/26

This week, our team mainly focused on solidifying and integrating our code.

  • Currently, the most significant risks we are facing concern persistent Bluetooth and audio latency:

Currently, we are trying to determine a reliable estimate of the one-way Bluetooth latency, which would help us greatly in determining how much leeway we have for the other components (CV processing, audio generation, etc.). The approach is to first send data to the laptop via an update from the ESP, then send an update back using a handler in the host code; the one-way latency is then estimated as half of the measured round-trip time. In practice this is not as simple as it sounds, because having a shared buffer accessed by both the server/host and the client introduces latency and concurrency issues. The issue is being managed, however, as we still have time blocked out in our Gantt chart for data transmission. In a worst-case scenario we would have to fall back on direct/physical wiring rather than Bluetooth, but we believe this will not be necessary and that we just need a bit more time to adjust our approach.
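As a reference for this measurement, the sketch below shows one way to estimate the round trip from the host side with bleak: the laptop writes a small payload to a characteristic, the ESP echoes it back as a notification, and the one-way latency is taken as half the median RTT. The MAC address and echo-characteristic UUID are placeholders, and this inverts the direction described above (host-initiated rather than ESP-initiated), so it is a sketch of the idea rather than our exact test.

import asyncio
import struct
import time
from bleak import BleakClient

ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                         # placeholder MAC address
ECHO_CHAR_UUID = "12345678-1234-5678-1234-56789abcdef0"   # placeholder echo characteristic

async def estimate_one_way_latency(n_trials=50):
    rtts = []
    async with BleakClient(ESP_ADDRESS) as client:
        echoed = asyncio.Event()

        def on_echo(_, data):
            echoed.set()   # the ESP echoed our payload back as a notification

        await client.start_notify(ECHO_CHAR_UUID, on_echo)
        for i in range(n_trials):
            echoed.clear()
            t0 = time.perf_counter()
            await client.write_gatt_char(ECHO_CHAR_UUID, struct.pack("<I", i), response=False)
            await echoed.wait()
            rtts.append(time.perf_counter() - t0)
        await client.stop_notify(ECHO_CHAR_UUID)
    rtts.sort()
    return rtts[len(rtts) // 2] / 2   # half the median round trip, assuming symmetric links

# print(asyncio.run(estimate_one_way_latency()))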

Audio latency is also proving to be a slight issue, as we are having problems with overlapping sounds. In theory, each drumstick's hit on a drum pad should generate a sound independently rather than waiting for another sound to finish. Currently we are seeing the opposite: drum sounds wait for one another to finish, despite a thread being spawned for each hit. If not fixed, this could introduce considerable latency into the response of our product. However, this is a relatively new issue, so we are confident it can be resolved within a short amount of time once we sit down together and reason about its cause.
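One way to sanity-check whether the serialization comes from sharing a single playback path is to give each hit its own short-lived output stream. The sketch below uses the sounddevice library that the playback module already relies on; the mono/float32 sample format and the 512-frame block size are assumptions, and this is meant as a diagnostic aid rather than the final fix.

import threading
import numpy as np
import sounddevice as sd

def play_hit(data, fs):
    # Give this hit its own output stream so concurrent hits don't serialize
    samples = np.asarray(data, dtype="float32").reshape(-1, 1)   # mono assumed
    with sd.OutputStream(samplerate=fs, channels=1, blocksize=512) as stream:
        stream.write(samples)

def on_hit(data, fs):
    threading.Thread(target=play_hit, args=(data, fs), daemon=True).start()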

  • No changes were made to the existing design of our product. At the moment, we are mainly focused on trying to create solid implementations of each component in order to integrate & properly test them as soon as possible.
  • We have also not made any changes to our schedule, and are mostly on track.

 

Belle’s Status Report for 10/26

This week, I mainly worked on implementing the actual drumstick detection code and integrating it with the current code that Ben wrote.

The code Ben wrote calculates the x-value, y-value, and radius of each drum ring at the beginning of the program and stores them in a list so that they don’t have to be recalculated in the future. I then pass this list into a function that calls another short function, which calculates the exponentially weighted moving average over the most recent 20 video frames.
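A minimal sketch of that computation is shown below; the smoothing factor ALPHA and the handling of the first sample are placeholder assumptions, but the traversal order (oldest entry first, starting at (bufIndex + 1) % bufSize) matches the buffer layout described in the next paragraphs.

ALPHA = 0.3   # placeholder smoothing factor

def ewma_over_buffer(buffer, bufIndex, bufSize=20):
    # Walk the circular buffer from the oldest entry to the newest,
    # folding each sample into the running average
    ewma = None
    for offset in range(bufSize):
        sample = buffer[(bufIndex + 1 + offset) % bufSize]   # oldest entry first
        ewma = sample if ewma is None else ALPHA * sample + (1 - ALPHA) * ewma
    return ewma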

It may seem strange, but accessing the buffer in this way is optimal because of the way I am putting data into it:
Currently, for every video frame read by OpenCV’s video capture module, I first determine which drumstick’s thread is calling the function using threading.current_thread().name, which tells me whether to apply a red or green mask to the image (the threads are named “red” and “green”, respectively, when spawned). I then use findContours() to acquire the detected x and y location of that drumstick. Afterwards, I pass these x and y values to another function that uses the aforementioned drum ring location list to determine which bounding circle the drumstick tip is in. This returns a number between 0 and 3 (inclusive) if the tip is detected within the bounds of the corresponding drum 0-3, and -1 otherwise. Finally, this number is put into the buffer at index bufIndex, a global variable updated with the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer.
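Putting those steps together, a condensed sketch of the per-frame path looks like the following; the HSV mask ranges are placeholders (the real red/green tip values still need to be measured), and classify_ring stands in for the bounding-circle check described above.

import threading
import cv2
import numpy as np

# Placeholder HSV ranges; the actual red/green tip values still need to be plugged in
MASK_RANGES = {"red":   (np.array([0, 120, 120]), np.array([10, 255, 255])),
               "green": (np.array([45, 80, 80]),  np.array([75, 255, 255]))}

bufSize = 20
buffer = [-1] * bufSize
bufIndex = 0

def classify_ring(x, y, rings):
    # rings: list of (cx, cy, r) computed once at startup by the ring-detection code
    for i, (cx, cy, r) in enumerate(rings):
        if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:
            return i
    return -1

def process_frame(frame, rings):
    # Detect this thread's drumstick tip and record which ring (0-3) it is in
    global bufIndex
    color = threading.current_thread().name          # "red" or "green"
    lower, upper = MASK_RANGES[color]
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    ring = -1
    if contours:
        largest = max(contours, key=cv2.contourArea)
        (x, y), _ = cv2.minEnclosingCircle(largest)
        ring = classify_ring(x, y, rings)
    buffer[bufIndex] = ring
    bufIndex = (bufIndex + 1) % bufSize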

As a result, it is entirely possible for the ring detection from a more recent video frame to be stored at a lower index than older ones. Thus, we start at the current value of (bufIndex + 1) % bufSize (which should be the oldest frame in the buffer) and loop around the buffer in order to apply the formula.
I am also using this approach because I am trying to determine whether there is any significant difference between (a) calculating the drum location of each video frame as it is read and then putting that value into the buffer, and (b) putting each frame into the buffer as it is read and determining the drum location afterwards. I have both versions implemented, and plan to time them in the near future in order to reduce latency as much as possible.
Currently, based on our Gantt chart, we should be finishing the general drum ring/entry-into-drum-ring detection code, as well as starting to determine the accelerometer “spike” threshold values. Therefore, I believe I am roughly on track, since we could (hypothetically) plug the actual red/green drumstick tip color ranges into the code I wrote, connect the webcam, and detect whether a stick is within the bounds of a drum ring. I do have to put more time into testing for accelerometer spike values, however, as soon as we are able to transmit that data.
Therefore, next week I plan to start testing with the accelerometer to determine an acceptable threshold value for a spike/hit. This would be done by hitting the air, a drum pad, and/or the table with a machined drumstick and observing how the output data changes over time.
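A small logging sketch along these lines is shown below; read_magnitude is a placeholder for whatever function ends up returning the processed accelerometer value over BLE, and the CSV output is only there so the samples can be plotted to pick a threshold.

import csv
import time

def record_magnitudes(read_magnitude, seconds=10, out_path="spike_log.csv"):
    # read_magnitude is a stand-in for the eventual BLE-backed accelerometer read
    samples = []
    t_start = time.perf_counter()
    while time.perf_counter() - t_start < seconds:
        samples.append((time.perf_counter() - t_start, read_magnitude()))
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(samples)
    print(f"peak magnitude over {seconds}s: {max(m for _, m in samples):.2f}")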

Elliot’s Status Report for 10/26

My past week was spent working with the MPU6050 chips and cleaning up the Bluetooth module in preparation for integration with the system controller. My goals were to collect data from the 3-axis accelerometers and set up client notifications to minimize latency. I first soldered the accelerometers and connected the serial clock, data, power, and GND wires, then used the Adafruit MPU6050 libraries to update the firmware. I used the getEvent function to return sensors_event_t data types and also used built-in macros to define configuration parameters such as the accelerometer output range, all of which I found from this resource. I packed the three axes of data into one characteristic and unpacked the contents accordingly on the client side. I attached the accelerometer to the drumstick with its Y-axis parallel to the stick, and so I averaged the absolute measurements of the X and Z axes to obtain a usable output magnitude.

One of the issues I ran into last week was connection persistence: the ESP would disconnect from the client and fail to reestablish the connection. I fixed this by adjusting the callback functions to restart advertising automatically after any disconnection. Another potential concern was that I was accessing the GATT characteristic by polling manually from the client side, which could add time to our final latency and block relevant processing. If we plan to play our final product at a moderate speed, asynchronous notifications will be required while we evaluate frames for previous hits. Developing the notify behavior brought up a problem, however, namely in the permissions of the BLE service. When I ran start_notify on my laptop, I observed a runtime error saying the attribute could not be written; I eventually realized it was because I had chosen standardized service and characteristic UUIDs with predetermined permission flags. By creating custom UUIDs, I was able to enable notify behavior manually as well as write directly to the characteristic from my laptop.
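On the host side, the notification path looks roughly like the sketch below; the MAC address and the 128-bit UUID are placeholders for the custom values, and the three packed little-endian floats follow the characteristic layout described above.

import asyncio
import struct
from bleak import BleakClient

ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                          # placeholder MAC address
ACCEL_CHAR_UUID = "12345678-1234-5678-1234-56789abcdef0"   # placeholder custom UUID

def handle_accel(_, data):
    # The server packs X, Y, Z as three little-endian floats in one characteristic
    x, y, z = struct.unpack("<fff", data)
    magnitude = (abs(x) + abs(z)) / 2   # the Y axis lies along the stick, so it is ignored
    print(f"accel magnitude: {magnitude:.2f}")

async def subscribe(seconds=30):
    async with BleakClient(ESP_ADDRESS) as client:
        await client.start_notify(ACCEL_CHAR_UUID, handle_accel)
        await asyncio.sleep(seconds)   # receive notifications while connected
        await client.stop_notify(ACCEL_CHAR_UUID)

# asyncio.run(subscribe())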

The write permission I described above is also relevant for the RTT testing I’m currently working on. My strategy is to notify the laptop using an update from the ESP, use a handler in the host code to send an update back, and derive the offset between the start and finish timestamps. Getting an accurate estimate, however, is taking longer than expected, because having the client access the same buffer as the server introduces concurrency and extraneous latency factors.

I believe I’ve caught up in terms of progress, but I’m aware that the bulk of our team’s difficulty is still ahead in bringing down output delay once we have a functional system. My plan for this upcoming week is to:

  1. Establish a reliable measurement for the one-way latency of our Bluetooth
  2. Begin integrating BLE code with the other modules
  3. Work on the system controller to make more progress towards a testable solution

Ben Solo’s Status Report for 10/26

This week I spent the majority of my time working on the function “locate_drum_rings”, which is triggered via the webapp and initiates the process of finding the (x, y) location of each drum ring as well as its radius. This involved developing test images/videos (.mp4), implementing the function itself, choosing to use scikit-image over cv2’s hough_circles, tuning the parameters to ensure the correct circles are selected, and testing the function in tandem with the webapp. In addition to implementing and testing this function, I made a few more minor improvements to the other endpoint on our local server, “receive_drum_config”, which had an error in its logic for saving received sound files to the ‘sounds’ directory. Finally, I slightly changed how the central controller I described in my last status report works, to accommodate 2 drumsticks in independent threads. I’ll explain each of these topics in more detail below:

Implementing the “locate_drum_rings” function.
This function is used at the start of every session, or when the user wants to change the layout of their drum set, in order to detect, scale, and store the (x, y) locations and radii of each of the 4 drum rings. It is triggered by the “locate_drum_rings” endpoint on the local server when it receives a signal from the webapp, as follows:

from cv_module import locate_drum_rings

@app.route('/locate-drum-rings', methods=['POST'])
def handle_locate_drum_rings():
    # Renamed so the view function doesn't shadow the imported CV function it calls
    print("Trigger received. Starting location detection process.")
    locate_drum_rings()
    return jsonify({'message': 'Trigger received.'}), 200

When locate_drum_rings() is called here, it starts the process of finding the centers and radii of each of the 4 rings in the first frame of the video feed. For testing purposes I generated a sample video with 4 rings as follows:

1.) In MATLAB, I drew 4 rings with the radii of the actual rings we plan on using (8.89, 7.62, 10.16, and 11.43 cm) at 4 different, non-overlapping locations.

2.) I then took this image and created a 6 second mp4 video clip of the image to simulate what the camera feed would look like in practice.

Then, during testing, where I pass testing=True to the function, the code references this video instead of the default webcam. One fairly significant change is that I decided not to use cv2’s Hough circles algorithm and to use scikit-image’s Hough circles algorithm instead, predominantly because it is much easier to narrow the number of detected rings down to 4; with cv2’s it became very difficult to do so accurately and with varying radii (which we will encounter due to varying camera heights). The function itself opens the video and selects the first frame, as this is all that is needed to determine the locations of the drum rings. It then masks the frame and identifies all circles it sees as present (it typically detects circles that aren’t actually there too, hence the need for tuning). I then use the “hough_circle_peaks” function to specify that it should only retrieve the 4 circles with the strongest responses. Additionally, I specify a minimum distance between 2 detected circles in the filtering process, which serves 2 purposes:

1.) To ensure that duplicate circles aren’t detected.

2.) To prevent the algorithm from detecting 2 circles per ring: 1 for the inner and 1 for the outer radius of the rubber rings.

Once these 4 circles are found, I scale them based on the ratios outlined in the design report to add the equivalent of a 30mm margin around each ring. For testing purposes I then draw the detected and scaled rings back onto the frame and display the image. The results are shown below:

The process involved tuning the values for the minimum/maximum radii, the minimum distance between detected circles, and the sigma value for edge-detection sensitivity. The results of the function are stored in a shared variable “detectedRings”, which is of the form:

{1: {'x': 980.0, 'y': 687.0, 'r': 166.8},
 2: {'x': 658.0, 'y': 848.0, 'r': 194.18},
 3: {'x': 819.0, 'y': 365.0, 'r': 214.14},
 4: {'x': 1220.0, 'y': 287.0, 'r': 231.84}}

where the key indicates which drum the values correspond to.
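A condensed sketch of the pipeline described above is shown below; the radius search range, the canny sigma, the minimum peak distances, and the 1.15 margin scale factor are placeholders standing in for the tuned parameters.

import cv2
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_circle, hough_circle_peaks

def locate_drum_rings_sketch(frame, scale=1.15):
    # Radius range, sigma, peak distances, and scale factor are illustrative values
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = canny(gray, sigma=2)
    candidate_radii = np.arange(120, 260, 4)     # pixel radii to search over
    accumulator = hough_circle(edges, candidate_radii)
    _, cx, cy, radii = hough_circle_peaks(
        accumulator, candidate_radii,
        total_num_peaks=4,       # keep only the 4 strongest circles
        min_xdistance=100,       # suppress duplicates and inner/outer ring pairs
        min_ydistance=100)
    return {i + 1: {"x": float(x), "y": float(y), "r": float(r) * scale}
            for i, (x, y, r) in enumerate(zip(cx, cy, radii))}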

Fixing the file storage in the “receive_drum_config” endpoint:
When we receive a drum set configuration, we always store the sound files in the ‘sounds’ directory under the names “drum_{i}.wav”, where i is an index 1-4 (corresponding to the drums). The issue was that when we received a new drum configuration, we were simply adding 4 more files with the same names to the directory, which is incorrect because a.) there should only ever be 4 sounds in the local directory at any given time, and b.) the duplicate names would cause confusion when trying to reference a given sound. To resolve this, whenever we receive a new configuration I first clear all files from the sounds directory before adding the new sound files. This was a relatively simple but crucial fix for the functionality of the endpoint.
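The fix boils down to something like the following; the directory name and the shape of the incoming files are illustrative assumptions rather than our exact endpoint code.

import glob
import os

SOUNDS_DIR = "sounds"

def save_drum_config(sound_files):
    # sound_files: dict mapping drum index (1-4) to raw .wav bytes (illustrative shape)
    for old in glob.glob(os.path.join(SOUNDS_DIR, "*.wav")):
        os.remove(old)                                   # wipe the previous configuration
    for i, payload in sound_files.items():
        with open(os.path.join(SOUNDS_DIR, f"drum_{i}.wav"), "wb") as f:
            f.write(payload)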

Updates to the central controller:
Given that the controller needs to monitor accelerometer data for 2 drumsticks independently, we need to run 2 concurrent instances of the controller module. I changed the controller.py file to do exactly this: spawn 2 threads running the controller code, each with a different color parameter of either red or green. These colors correspond to the colors of the drumstick tips in our project and will be used to apply a mask during object tracking/detection in the playing session. Additionally, for testing purposes I added a variation without the threading implementation so we can run tests on a single drumstick.
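Structurally this is just two named threads running the same loop, roughly as sketched below; controller_loop stands in for the actual controller code, and naming each thread after its color is what lets the CV code look it up later via threading.current_thread().name.

import threading

def controller_loop(color):
    # Placeholder for the actual controller code; `color` is "red" or "green"
    pass

threads = [threading.Thread(target=controller_loop, args=(color,), name=color)
           for color in ("red", "green")]
for t in threads:
    t.start()
for t in threads:
    t.join()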

Overall, this week was successful and I stayed on schedule. In the coming week I plan on helping Elliot integrate his BLE code into the controller so that we can start testing with an integrated system. I also plan to work on further optimizing the latency of the audio playback module, since while it’s not terrible, it could definitely be a bit better. I think utilizing some sort of mixing library may be the solution here, since part of the delay we’re facing now is due to the duration of a sound limiting how fast we can play subsequent sounds.

Elliot’s Status Report for 10/19

For this week’s tasks, I put my efforts towards developing the client and server code to transmit accelerometer data over BLE. The firmware for this project’s microcontrollers will be oriented around an Arduino-based framework, providing us access to abstracted libraries for I2C and serial debugging as well as a straightforward IDE to compile, flash, and monitor our code. Because I prefer to work in VS Code over the native Arduino platform, I used a third-party extension, PlatformIO, for embedded development on Arduino boards.

I first set up a C++ file to represent the server initialization onboard the ESP32. The code is structured with the standard Arduino setup and loop, with added callback functions declared using the BLEServer library to handle connection status. In initialization, I set the serial baud rate to the UART standard of 115200 in order to allow USB communication to an output monitor. Using this output, I was able to find the MAC address of the microcontroller by printing it with a method from the BLEDevice library. I found that typecasting between the Arduino String type, the C++ std::string, and the C char array was a bit convoluted, which is something I will keep in mind in case we decide to append timestamps on the ESPs rather than the host controller. I then created the generic service for accelerometer data and added a characteristic to store the intended floating-point value; the UUIDs used in these two operations were defined globally and taken from sections 3.4 and 3.8 of the SIG group’s official identifiers found here: https://www.bluetooth.com/wp-content/uploads/Files/Specification/HTML/Assigned_Numbers/out/en/Assigned_Numbers.pdf?v=1729378675069. The board then starts advertising and loops on the condition of a valid connection.

Output of MAC address and successful init

I also created the client-side code, which connects to the device address using bleak. For this stage of development, my goal was simply to get some form of communication between the two devices, so I opted for a simple polling loop with the asyncio library. I did this by reading straight from the GATT characteristic and unpacking the bytes into a comprehensible float. For future latency improvements, I plan to have the server notify the host controller instead of relying on the current blocking behavior. To test the current program, the loop in the flashed code sets the characteristic to an arbitrary value and increments it at a one-second interval, which the client then reads directly.

Output of fake data over BLE
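The polling loop amounts to roughly the following sketch; the MAC address is a placeholder, and the characteristic UUID shown is a stand-in for the standardized UUID actually used.

import asyncio
import struct
from bleak import BleakClient

ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                     # placeholder MAC address
CHAR_UUID = "00002a58-0000-1000-8000-00805f9b34fb"    # placeholder for the UUID actually used

async def poll(period=1.0):
    async with BleakClient(ESP_ADDRESS) as client:
        while client.is_connected:
            raw = await client.read_gatt_char(CHAR_UUID)
            value = struct.unpack("<f", raw)[0]    # the server stores a single float
            print(f"received: {value:.2f}")
            await asyncio.sleep(period)            # matches the 1 s test increment

# asyncio.run(poll())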

This code is a good step forward, but I am a bit behind currently, considering I have not yet soldered the accelerometers to the ESP boards. Moving into the next week, my goal is to lower the latency as much as possible and start incorporating the MPU6050s to get a better idea of what the overall output lag will be. Specifically, this week I will:

  1. Clean up the Bluetooth program to make it modular for integration with Ben and Belle’s work while also ensuring versatility across different devices.
  2. Make an RTT test to get a baseline delay metric for the BLE module.
  3. Connect accelerometers to the ESP32s and start collecting data.
  4. Work on the system controller and the multithreaded code. There will likely be concurrency conflicts even with the system broken into separate threads, meaning that getting an estimate of the delay is our most important objective.

Ben Solo’s Status Report for 10/19

Over the last week I’ve split my time between two main topics: finalizing the webapp/local server interface, and implementing an audio playback module. I spent a considerable amount of time on both of these tasks and was able to get myself back up to speed on the schedule after falling slightly behind the week before.

The webapp itself was already well developed and close to being done. Essentially just one additional feature needed to be written, namely the button that triggers the user’s local system to identify the locations and radii of the drum rings at the start of a playing session. Doing this means sending a trigger message from the webapp to the local server to initiate the ring-detection process. To do this, I send a POST request to the local server running on port 8000 with a message saying “locate rings”. The local server needed a corresponding “locate-drum-rings” endpoint to receive this message, which also needed to be CORS-enabled. This means I needed a preflight (OPTIONS) handler and a post-request step that set the response headers to allow incoming POST requests from external origins. This is done as follows (only the preflight handler is shown):

@app.route('/locate-drum-rings', methods=['OPTIONS'])
def handle_cors_preflight_locate():
    response = app.make_default_options_response()
    headers = response.headers
    headers['Access-Control-Allow-Origin'] = '*'
    headers['Access-Control-Allow-Methods'] = 'POST, OPTIONS'
    headers['Access-Control-Allow-Headers'] = 'Content-Type'
    return response

Though the CV module for detecting the locations/radii of the rings isn’t fully implemented yet, once it is, it will be as easy as importing the module and calling it in the endpoint. This is one of the tasks I plan on getting to in this coming week. Both of the endpoints on the local server, “locate-drum-rings” and “receive-drum-config” (which receives 4 sound files and stores them locally in a sounds directory on the user’s computer), work as intended and have been tested.

The more involved part of my work this week was implementing a rudimentary audio playback module with a few of the latency optimizations I had read about. However, before I explain the details of the audio playback functions, I want to explain another crucial aspect of the project I implemented: the system controller. During a play session, there needs to be one central program that manages all other processes, i.e., receiving and processing accelerometer data, monitoring for spikes in acceleration, and spawning individual threads for object detection and audio playback after any given detected impact. Though we are still in the process of implementing both the accelerometer processing and object detection modules, I wrote controller.py in a way that simulates how the system will actually operate. The idea is that when we eventually get these subcomponents done, it will be very easy to integrate them given a thought-out and well-structured framework. For instance, there will be a function dedicated to reading the streamed accelerometer data called “read_accelerometer_data”. In my simulated version, the function repeatedly returns an integer between 1 and 10. This value is then passed to the “detect_impact” function, which determines whether a reading surpasses a threshold value. In my simulated controller this value is set to 5, so half of the readings trigger impacts. If an impact is detected, we want to spawn a new thread to handle the object detection and audio playback for that specific impact. This is exactly what the controller does: it creates and starts a new thread that first calls the “perform_object_detection” function (still to be implemented), and then calls the “playDrumSound” function with the drum index returned by the “perform_object_detection” call. Currently, since “perform_object_detection” isn’t implemented, it returns a random integer between 1 and 4, representing one of the four drum rings.
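A sketch of that simulated loop, following the description above (the sleep interval is an arbitrary pacing value for the simulation):

import random
import threading
import time

from sound_player import playDrumSound   # the playback module described below

THRESHOLD = 5    # simulated impact threshold

def read_accelerometer_data():
    return random.randint(1, 10)      # stand-in for the streamed accelerometer value

def detect_impact(reading):
    return reading > THRESHOLD

def perform_object_detection():
    return random.randint(1, 4)       # stand-in for the CV module

def handle_impact():
    playDrumSound(perform_object_detection())

def controller_loop():
    while True:
        if detect_impact(read_accelerometer_data()):
            threading.Thread(target=handle_impact, daemon=True).start()
        time.sleep(0.01)              # pacing for the simulation only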

Now, having outlined the controller I designed, I will explain the audio playback module I developed and some of the optimizations I implemented in doing so. We are using the sounddevice library inside the sound_player.py file, which is our audio playback module. When the controller first starts up, it calls two functions from the audio playback module: 1.) “normalize_sounds” and 2.) “preloadSounds”. The first call ensures that each of the 4 sounds in the sounds directory has a consistent sampling rate, sample width, and number of channels (1 in our case). This helps with latency issues related to needing to adjust sampling rates. The second function reads each of the 4 sounds and extracts the sampling frequency and data, storing both in a global dictionary. This cuts latency down significantly by avoiding having to read the sound file at play time; instead we can quickly reference and play a given sound. Both of these functions execute before the controller even starts monitoring for accelerometer spikes.
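The preload step amounts to something like the sketch below; reading the files with the soundfile package (and float32 samples) is an assumption on my part, not necessarily what preloadSounds does internally.

import os
import soundfile as sf    # assumption: soundfile is used to decode the .wav data

SOUNDS_DIR = "sounds"
preloaded_sounds = {}

def preloadSounds():
    # Read each drum_{i}.wav once at startup so playback only has to index a dict
    for i in range(1, 5):
        data, fs = sf.read(os.path.join(SOUNDS_DIR, f"drum_{i}.wav"), dtype="float32")
        preloaded_sounds[i] = (data, fs)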

Once an impact has been detected, the “playDrumSound” function is called. This function takes an index (1-4) as a parameter and plays the sound corresponding to that index. Sounds are stored locally with a formatted name of the form “drum_{x}.wav”, where x is necessarily an integer between 1 and 4. To play the sound, we pull the data and sampling frequency from the global dictionary. We dynamically change the buffer size based on the length of the data, ranging from a minimum of 256 samples to a maximum of 4096. These values will most likely change as we further test our system and are able to reliably narrow the range to something in the neighborhood of 256-1024 samples. We then use the “sounddevice.play()” function to actually play the sound, specifying the sampling frequency, buffer size, data, and, most importantly, the device to play from. A standard audio playback library like pygame goes through the generic audio pipeline, which introduces latency in a plethora of ways. However, by interfacing directly with WASAPI (the Windows Audio Session API) we can bypass a lot of the playback stack and reduce latency significantly. To do this I implemented a function that identifies whatever WASAPI speakers are listed on the user’s device. This device is then specified as an argument to the “sounddevice.play()” function at execution time.
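In outline, the device lookup and playback call look roughly like this; the helper name and the fixed 512-frame block size are illustrative, since the real module sizes the buffer dynamically as described above.

import sounddevice as sd

def find_wasapi_output():
    # Pick the first output device hosted by the Windows WASAPI backend
    for api in sd.query_hostapis():
        if "WASAPI" in api["name"]:
            for idx in api["devices"]:
                if sd.query_devices(idx)["max_output_channels"] > 0:
                    return idx
    return None   # fall back to the default output device

def play_drum_sound(index, preloaded, blocksize=512):
    # preloaded: dict {index: (data, fs)} filled in by the preload step
    data, fs = preloaded[index]
    sd.play(data, fs, device=find_wasapi_output(), blocksize=blocksize)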

The result of the joint controller and sound player is a simulator that continuously initiates and plays one of the 4 sounds at random. The system is set up so that we can easily fill in the simulated parts with the actual modules to be used in the final product.

As I stated earlier, I caught back up with my work this week and feel on schedule. In the coming week I plan to develop a module to initially detect the locations and radii of the 4 drum rings when the “locate-drum-rings” endpoint receives the trigger from the webapp. Additionally, I’d like to bring the audio playback latency down further and develop more rigorous tests to determine what the latency actually is, since doing so is quite difficult. We need to find a way to measure the time at which the sound is actually heard, which I am currently unsure of how to do.

Team Status Report for 10/19

In the week prior to fall break and throughout fall break, our team continued to make good progress on our project. Having received the majority of the hardware components we ordered, we were able to start the preliminary work of flashing the ESP32s with the Arduino code necessary to relay accelerometer data to our laptops. Additionally, we made progress in our understanding and implementation of the audio playback module, and implemented the final feature needed for the webapp: a trigger to start the ring detection protocol locally.

Currently, the issues that pose the greatest risk to our team’s success are as follows:
1.) Difficulty in implementing the BLE data transmission from the drumsticks to the laptop. We know that writing robust code, flashing it onto the ESP32s, and processing the data in real time could pose numerous issues for us. First, implementing the code and flashing the ESP32 is a non-trivial task; Elliot has some experience in the area, but having seen other groups attempt similar things and struggle, we know this to be difficult. Second, since the transmission delay may vary from packet to packet, issues could easily arise when a packet takes far longer to transmit than the others. Our current mitigation strategy is to determine an average latency by testing many times over various transmission distances. Once we have this average, it should encompass the vast majority of expected delay times; if a transmission falls outside of this range, we plan on simply discarding the late packets and continuing as usual.

2.) Drumstick tip detection issues. While it seems that using the cv2 contours function alongside a color mask will suffice to identify the locations of the drumstick tips, there is a fair amount of variability in detection accuracy depending on the available lighting. While we currently think applying a strong color mask will be enough to compensate for lighting variability, in case it isn’t, we plan on adding a lighting fixture mounted alongside the camera on the camera stand to provide consistent lighting in every scenario.

3.) Audio playback latency. As mentioned in the previous report, audio playback can surprisingly introduce significant amounts of latency to the system (easily 50ms) when using standard libraries such as pygame. We are now using the sounddevice library instead, which seems to have brought latency down a bit. However, the issue is not as simple as reducing the sample buffer size: we have noticed through experimentation that certain sounds require larger buffer sizes than others, even when their durations don’t vary. This is a result of both the sampling frequency used and the overall length of the data in the sound file. Using sounddevice and interacting directly with Windows WASAPI (the Windows audio API), we believe we can cut latency down significantly, but if we can’t, we plan on using an external MIDI controller, which facilitates almost instantaneous sound I/O. These controllers are designed for exactly these types of applications and help bypass the audio playback pipeline inherent in computers.

The design of our project has not changed aside from the fact that we are now using (and testing with) the sounddevice library as opposed to PyAudio. However, if sounddevice proves insufficient, we will revert and try PyAudio with ASIO. We are still on track with our schedule.

Below are the answers to the thought questions for this week.
A was written by Ben Solo, B was written by Elliot Clark, and C was written by Belle Connaught

A.) One of DrumLite’s main appeals is that it is a cost-effective alternative to standard electronic drum sets. As mentioned in the introduction of our design report, a low-end drum set can easily cost between $300 and $700, while a better one can go up to $2000. Globally, the cost of drum sets, whether acoustic or electronic, hinders people from taking up the drums. DrumLite’s low cost (~$150) enables many more people to play the drums without having to worry that the cost isn’t justifiable for an entertainment product.
Furthermore, DrumLite makes sharing drum sets far easier. Previously, sharing a drum set between countries was virtually impossible, as you’d have to ship it back and forth or buy identical drum sets in order to have the same experience. But with DrumLite, since you can upload any .wav files to the webapp and use these as your sounds, sharing a drum set is trivial: you can just send an email with four .wav attachments, and the recipient can reconstruct the exact same drum set you had in minutes. DrumLite not only brings the cost of entry down, but encourages collaboration and the sharing of music on both a local and global scale.

B.) The design of this project integrates several cultural factors that enhance its accessibility, relevance, and impact across user groups. Music is a universal form of expression found in many cultures, making this project inherently inclusive by providing a platform for users to experience drumming without the need for expensive equipment. Given its highly configurable nature, the system can be programmed to replicate sounds from a variety of drums spanning various cultures, thereby enabling cross-cultural appreciation and learning. This project also holds educational potential, particularly in schools or music programs, where it could be used to teach students about different drumming traditions, encouraging cultural awareness and social interaction through drumming practices seen in other cultures. These considerations collectively make the DrumLite set not only a technical convenience but also a culturally aware and inclusive platform.

C.) DrumLite addresses a need for sustainable, space-efficient, and low-impact musical instruments by leveraging technology to minimize material use and environmental footprint. Traditional drum sets require numerous physical components such as drum shells, cymbals, and hardware, which involve the extraction of natural resources, energy-intensive manufacturing processes, and significant shipping costs due to their size and weight. By contrast, our project replaces bulky equipment with lightweight, compact components—two drumsticks with embedded sensors, a laptop, and four small rubber pads—significantly reducing the raw materials required for production. This not only saves on manufacturing resources but also reduces transportation energy and packaging waste, making DrumLite more environmentally-friendly.
In terms of power consumption, the system is designed to operate efficiently with the use of low-power ESP32 microcontrollers and small sensors like the MPU-6050 accelerometers. These components require minimal energy compared to traditional electric drum sets or amplification equipment, reducing the device’s carbon footprint over its lifetime.
DrumLite contributes to a sustainable musical experience by reducing waste and energy consumption, all while maintaining the functionality and satisfaction of playing a traditional drum set in a portable, tech-enhanced format.

Belle’s Status Report for 10/19

This past week, I mainly wanted to start drafting a bit more Computer Vision code that could potentially be used in our final implementation. At the time, we had not yet received our accelerometers and microcontrollers in the mail, so I wanted to figure out how to use PyAudio (one of the methods we are considering for playing the drum sound) by creating simple spacebar-triggered sounds from a short input .wav music file. I figured that this process would also help us get an idea of how much latency could be introduced by this process.

I ended up adding code to the existing red dot detection code that first opens the .wav sound file (using the wave module) and a corresponding audio stream (using pyaudio). Then, upon pressing the spacebar, a thread is spawned to create a new wave file object & play the opened sound using the previously mentioned modules/libraries, as well as threading.
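The core of that exercise looks roughly like the sketch below; the .wav path, the 512-frame chunk size, and using cv2.waitKey for the spacebar check are illustrative assumptions.

import threading
import wave
import pyaudio

CHUNK = 512   # buffer size in frames; ~256-512 turned out to be the sweet spot

def play_wav(path="sample.wav"):
    wf = wave.open(path, "rb")
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                     channels=wf.getnchannels(),
                     rate=wf.getframerate(),
                     output=True,
                     frames_per_buffer=CHUNK)
    data = wf.readframes(CHUNK)
    while data:
        stream.write(data)            # blocking write, but only inside this thread
        data = wf.readframes(CHUNK)
    stream.stop_stream()
    stream.close()
    pa.terminate()
    wf.close()

# Inside the OpenCV display loop, a spacebar press spawns a playback thread:
# if cv2.waitKey(1) & 0xFF == ord(' '):
#     threading.Thread(target=play_wav, daemon=True).start()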

Though simple, writing this code was a good exercise to get a decent feel for how PyAudio works, as well as for what buffer size we should use when the audio stream is being written (i.e., when the input sound is being played). During our recent meeting with Professor Bain and Tjun Jet, we discussed possibly wanting a small buffer size of around 128 bytes instead of the usual ~4096 to reduce latency. However, I found that the ~256-512 range is more of a sweet spot for us, as a very small buffer must be refilled more often, which can itself introduce latency, especially with larger audio files. I also found that when the spacebar was mashed quickly (simulating multiple quick, consecutive drum hits), the audio lagged a bit towards the end, despite a new thread being spawned for every spacebar press. I suspect this was due to the aforementioned small buffer, as increasing the buffer size seemed to remedy the issue.

Our Gantt chart indicates that we should still be working on CV code this week, so I believe I am on schedule. This week, I hope to work with the ESP32 and accelerometers to get a feel for how they output data, in order to determine how to process it. Setting up the ESPs is not necessarily a straightforward task, however, so I plan to work with Elliot to get them working (confirmed by at least having some sort of communication between them and the laptop over Bluetooth). From there, if we are able to transmit accelerometer data, I would like to graph it and determine a range for the ‘spike’ generated by making a ‘hit’ motion with the drumsticks.

Belle’s Status Report for 10/5

This week, most of my time was taken up by a major deadline in another class, so my main task was working on the Design Trade Studies and Testing/Verification sections of our draft Design Report. I spent a couple of hours looking into key components of our project (such as the microcontroller, proposed Computer Vision implementation, and camera type), comparing them against other potential candidates to ensure that our choices were optimal, and putting these differences into words.

I was also able to modify the CV dot detection code from last week to determine how long it takes to process one frame of the sample video input. This yielded a consistent processing time of ~1.3-1.5ms per frame, which allows us to determine how many frames can be processed once an accelerometer spike is read (while staying below our CV latency limit of 60ms).
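A measurement like this can be reproduced with a simple wrapper along the following lines; process_frame is a stand-in for the dot-detection step.

import time

def time_frame(process_frame, frame):
    # Return the processing time for one frame in milliseconds
    t0 = time.perf_counter()
    process_frame(frame)
    return (time.perf_counter() - t0) * 1000.0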

Since our Gantt chart still has us working on CV code this week, I believe the rest of the team and I are still on schedule. This coming week, I plan to finalize my parts of the design report and start feeding real-time camera input into the current CV code, if a camera is delivered this week. If not, I would like to feed in accelerometer data to determine the minimum threshold of a “hit” and start thinking about how to incorporate that into the code.

Elliot’s Status Report for 10/5

Following the design presentation this past week, I worked on the implementation details for our final design report writeup, where I outline our libraries, equations, and general strategies for how we’ll communicate between modules. I spoke with Ben and Belle about how we’d carry out the 30mm buffer zone in our use-case requirements, how many frames we would have to process from our sliding window on the event of any given hit, and how the resolution and field of view of our chosen camera would impact the detection of rings on the table. Since we were not able to successfully place an order for a web camera off Amazon, I had the opportunity to search for a suitable camera with our new priorities in mind: the ability to process 20 relevant frames from our frame window while avoiding the optical distortion that comes with a high field of view. I found a 1080p, 60 FPS camera with an 84-degree field of view in the Elgato MK.2, which we’ll be considering alongside other options within the next few days. The most crucial requirements were the framerate, where I decided that a web camera running at 60 frames per second should allow us to gather 20 frames of relevant imaging (up to 0.33 seconds pre-timestamp), and the field of view, where the team concluded that anything higher than 90 degrees could distort our pixel-based sizing calculations. Apart from exploring our hardware needs, I finalized the connectivity of our BLE code, since the online simulation I used only operated up to the advertising stage and wasn’t able to emulate over-the-air pairing. The host device code should be simple: we’ll run two threads pairing with the separate MAC addresses of the microcontrollers and subscribing to the UUIDs of the accelerometer characteristics, although I’m waiting for parts to arrive for testing. Overall, the team and I are on schedule, and we should be well prepared for bug-fixing and unit testing post-break. This week, I personally plan to:

  1. Work hands-on with the ESP32 and MPU-6050. I plan to solder the jumper cables between the two for I2C serial communication and flash firmware code to start advertising over Bluetooth.
  2. Finalize our design report. I’m looking to complete my Bluetooth testing by Thursday, at which point I’ll incorporate it into the repository and describe the technical details in the implementation section of our writeup.