Ben Solo’s Status Report for 11/9

This week, the rest of the group and I spent nearly all of our time integrating the components we’ve been working on into one unified system. Aside from the integration work, I made a few changes so the controller handles two drumsticks as opposed to one, altered the way we detect the drum pads at the start of the session, and cut the actual rubber drum pads to their specific diameters for testing. Prior to this week, we had the following separate components:
1.) A BLE system capable of transmitting accelerometer data to the paired laptop
2.) A dedicated CV module for detecting the drum rings at the start of the playing session. This function was triggered by clicking a button on the webapp, which used an API to initiate the detection process.
3.) A CV module responsible for continually tracking the tip of a drumstick and storing the predicted drum pad the tip was on for the 20 most recent frames.
4.) An audio playback module responsible for quickly playing audio samples on detected impacts.

We split our integration process into two steps; the first was to connect the BLE/accelerometer code to the audio playback module, omitting the object tracking. To do this, Elliot had to change some of the BLE module so it could successfully be used in our system controller, and I needed to change the way we were previously reading in accelerometer data in the system controller. I was under the impression that the accelerometer/ESP32 system would continuously transmit accelerometer data regardless of whether any acceleration was occurring (i.e. transmit 0 acceleration if not accelerating). In reality, however, the system only sends data when acceleration is detected. Thus, I changed the system controller to read a globally set acceleration variable from the Bluetooth module on every iteration of the while loop, and then compare it to the predetermined acceleration threshold to decide whether an impact has occurred or not. After Elliot and I completed the necessary changes for integration, we tested the partially integrated system by swinging the accelerometer around to trigger an impact event, assigning a random index in [1,4] (since we hadn’t integrated the object tracking module yet), and playing the corresponding sound. The system functioned very well with surprisingly low latency.
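A rough sketch of that polling loop is below; the shared-state names are hypothetical, and the placeholder pad index stands in for the tracking module we hadn’t integrated yet:

import random
import time

latest_acceleration = None          # written by the Bluetooth module whenever a reading arrives
ACCEL_THRESHOLD = 10.0              # m/s^2, the threshold value Elliot settled on in his testing
running = True

def controller_loop(play_sound):
    # Poll the globally set acceleration value every iteration and trigger playback on spikes.
    global latest_acceleration
    while running:
        accel = latest_acceleration
        if accel is not None and accel > ACCEL_THRESHOLD:
            pad_index = random.randint(1, 4)         # random index in [1,4] until tracking is integrated
            play_sound(pad_index)
            latest_acceleration = None               # consume the spike so one swing plays one sound
        time.sleep(0.001)                            # brief yield so the BLE thread can update the value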

The second step in the integration process was to combine the partially integrated accelerometer/BLE/playback system with the object tracking code. This again required me to change how the system controller worked. Because Belle’s code needs to run continuously and independently to populate our 20-frame buffer of predicted drum pads, we needed a new thread for each drumstick that starts as soon as the session begins. The object tracking code treated drum pad metadata as an array of length 4 of tuples in the form (x, y, r), whereas I was storing drum pad metadata (x, y, r) in a dictionary where each value was associated with a key. Thus, I changed the way we store this information to coincide with Belle’s code. At this point, we combined all the logic needed for one drumstick’s operation and proceeded to testing. Though it obviously didn’t work on the first try, after a few further modifications we were successful in producing a system that tracks the drumstick’s location, transmits accelerometer data to the laptop, and plays the corresponding sound of a drum pad when an impact occurs. This was a huge step in our project’s progression, as we now have a basic, working version of what we proposed to build, all while maintaining low latency (measuring exactly what the latency is was difficult since it’s sound based, but just from using the system, it’s clear that the current latency is far below 100ms).
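For reference, the format change itself was small; using the ring dictionary from my 10/26 report, the conversion looks roughly like this:

# Drum pad metadata as the controller stored it: {pad index: {'x': ..., 'y': ..., 'r': ...}}
detectedRings = {
    1: {'x': 980.0, 'y': 687.0, 'r': 166.8},
    2: {'x': 658.0, 'y': 848.0, 'r': 194.18},
    3: {'x': 819.0, 'y': 365.0, 'r': 214.14},
    4: {'x': 1220.0, 'y': 287.0, 'r': 231.84},
}

# The format Belle's tracking code expects: a length-4 array of (x, y, r) tuples, ordered by pad index.
ringList = [(pad['x'], pad['y'], pad['r']) for _, pad in sorted(detectedRings.items())]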

Outside of this integration process, I also started to think about and work on how we would handle two drumsticks as opposed to the one we already had working. The key realization was that we need two CV threads to continuously and independently track the location of each drumstick. We would also need two BLE threads, one for each drumstick’s acceleration transmission. Lastly, we would need two threads running the system controller code, which handles reading in acceleration data, identifying which drum pad the stick was in during an impact, and triggering the audio playback. Though we haven’t yet tested the system with two drumsticks, the system controller is now set up so that once we do want to test it, we can easily spawn the corresponding threads for the second drumstick, as sketched below. This involved rewriting the functions to case on the color of each drumstick’s tip. This is primarily needed because the object tracking module needs to know which drumstick to track, but it is also used in the BLE code to store acceleration data for each stick independently.
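A rough sketch of how the controller now spawns the per-stick threads follows; the worker functions here are placeholders for the real CV, BLE, and controller code:

import threading
import time

# Placeholder workers; the real functions live in the CV, BLE, and controller modules.
def track_drumstick_tip(color):
    time.sleep(0.1)    # continuously track the tip matching this color with an HSV mask

def receive_accelerometer(color):
    time.sleep(0.1)    # receive this stick's acceleration values over BLE

def controller_loop(color):
    time.sleep(0.1)    # compare acceleration to the threshold and trigger audio playback

def start_drumstick(color):
    # Spawn the three per-stick threads; the thread name carries the tip color so the CV code
    # knows which mask to apply (it reads threading.current_thread().name).
    workers = (track_drumstick_tip, receive_accelerometer, controller_loop)
    threads = [threading.Thread(target=fn, args=(color,), name=color, daemon=True) for fn in workers]
    for t in threads:
        t.start()
    return threads

if __name__ == "__main__":
    red = start_drumstick("red")
    green = start_drumstick("green")    # the second stick is the same code with a different tip color
    for t in red + green:
        t.join()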

Lastly, I spent some time carefully cutting out the drum pads from the rubber sheets at diameters of 15.23, 17.78, 20.32, and 22.86 cm so we could proceed with testing. Below is an image of the whole setup, including the camera stand, webcam, drum pads, and drumsticks.

We are definitely on schedule and hope to continue progressing at this rate for the next few weeks. Next week, I’d like to do two things: 1.) refine the overall system, making sure we have accurate acceleration thresholds and that the correct sounds are assigned to the correct drum pads from the webapp, and 2.) test the system with two drumsticks at once. The only worry we have is that since we’ll have two ESP32s transmitting concurrently, they could interfere with one another and cause packet loss.

 

Belle’s Status Report for 11/2

This week, I mainly focused on cleaning up the code that I wrote last week.

Essentially, its purpose is to make a location prediction for each frame from the camera/video feed (0-3 if in range of a corresponding drum, and -1 otherwise) and store it in order in a buffer with a fixed capacity of 20. I demoed this portion of the code with the sample moving-red-dot video I made a couple of weeks ago, and it appeared to work fine, with minimal impact on the overall frame-by-frame computer vision calculation latency (it remained at ~1.4ms). Given that the prediction function has worst-case O(1) time (and space) complexity, this was expected.

However, the issue lies with the function that calculates the moving average of the buffer. 

As mentioned in my previous post, the drumstick tip location result for each frame is initially put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer. Then, the aforementioned function calculates the exponentially weighted moving average of the most recent 20 camera/video frames. 

However, during this calculation the buffer is still being modified continuously, since it is a global variable, so the most recent frames could very likely change mid-function and skew the result. Therefore, it would be best to protect the buffer somehow, using either a mutex or a copy. Though using a lock/mutex is one of the more intuitive options, it would likely not work for our purposes: as previously mentioned, we still need to modify the buffer to keep it updated for subsequent drum hits/accelerometer spikes, which we could not do while the moving average function holds the lock on the buffer. There is also the option of combining boolean flags with an external buffer such that we read from one and write to the other, depending on whether the moving average is being calculated or not. However, I feel as though this needlessly complicates the process, and it would be simpler to instead make a copy of the buffer inside the function and read from it accordingly, as sketched below.
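A minimal sketch of the copy-based approach, assuming the globals described above (the weighting factor here is illustrative):

bufSize = 20
buffer = [-1] * bufSize              # written continuously by the computer vision threads
bufIndex = 0                         # index of the most recently written prediction
ALPHA = 1.3                          # illustrative growth factor: newer frames weigh more

def weighted_recent_pad():
    # Snapshot the circular buffer, then compute the weighted result without blocking the writers.
    snapshot = list(buffer)                  # cheap O(bufSize) copy; the live buffer keeps updating
    start = (bufIndex + 1) % bufSize         # oldest entry in the snapshot
    scores, weight = {}, 1.0
    for i in range(bufSize):
        pad = snapshot[(start + i) % bufSize]
        if pad != -1:                        # -1 means the tip was outside every drum that frame
            scores[pad] = scores.get(pad, 0.0) + weight
        weight *= ALPHA
    return max(scores, key=scores.get) if scores else -1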

Since the computer vision code is somewhat finished, I believe we are on track. Next week, since we just got the camera, I hope to actually begin testing my code with the drumsticks and determine the actual HSV color ranges needed to detect the drumstick tips.

Team Status Report for 11/2

This week we made significant strides towards the completion of our project. Namely, we got the audio playback system to have very low latency and were able to get the BLE transmission both to work and to have much lower latency. We think a significant reason why we were measuring so much latency earlier in HH 13xx was that many other project groups were using the same 2.4GHz band, causing throughput to be much lower. Now, when testing at home, the BLE transmission seems nearly instantaneous. Similarly, the audio playback module now operates with very low latency; this required a shift from sounddevice to PyAudio and audio streams. Between these two improvements, our main bottleneck for latency will likely be storing frames in our frame buffer and continually doing object detection throughout the playing session.

This brings me to the design change we are now implementing. Previously, we had planned to run object detection to locate the tips of the drumsticks only when an impact occurs; we’d read the impact and then trigger the object detection function to determine, from the 20 most recent frames, which drum ring the impact occurred in. However, we now plan to continuously keep track of the location of the tips as the user plays, storing the (x, y) locations in a sliding-window buffer. Then, when an impact occurs, we will already have the (x, y) locations of the tips for every recent frame, and can thus omit the object detection step prior to playback and instead simply apply our exponential weighting algorithm to the stored locations.

This, however, brings us to our greatest risk: high latency for continuous object detection. We have not yet tested a system that continuously tracks and stores the location of the drumstick tips, so we can’t be certain what the latency will look like for this new design. Additionally, since we haven’t tested an integrated system yet, we don’t know whether the entire system will have good latency even though the individual components seem to, given the multiple synchronizations and data processing modules that need to interact.

Thus, a big focus in the coming weeks will be to incrementally test the latencies of partially integrated systems. First, we want to connect the BLE module to the audio playback module so we can assess how much latency there is without the object detection involved. Then, once we optimize that, we’ll connect and test the whole system, including the continual tracking of the drumstick tips. Hopefully, by doing this modularly, we can see more clearly which components introduce the most latency and focus on bringing those down prior to testing the fully integrated system.

As of now, our schedule has not changed and we seem to be moving at a good pace. In the coming week we hope to make significant progress on the object tracking module as well as test a partially integrated system with the BLE code and the audio playback code. This would be pretty exciting, since it would actually involve using drumsticks and hitting a surface to trigger a sound, which is fairly close to what the final product will do.

Ben Solo’s Status Report for 11/2

This week I spent my time working on optimizing the audio playback module. At the start of the week my module had about 90ms of latency for every sound that needed to be played. In a worst-case situation we could work with this, but since we want an overall system latency below 100ms, it was clearly suboptimal. I went through probably 10 iterations before I landed on the current implementation, which uses PyAudio as the sound interface and has what feels like instantaneous playback. I’ll explain the details of what I changed/implemented below and discuss a few of the previous iterations I went through before landing on this final one.
The first step was to create a system that allowed me to test playing individually triggered sounds via keyboard input without disrupting the logic of the main controller I explained in my last status report. To do this, I implemented a testing mode. When run with testing=True, the controller takes the keyboard inputs w, a, s, d to trigger each of the 4 sounds, as opposed to the simulated operating scheme where the loop continually generates random simulated accelerometer impacts and subsequently returns a number in the range [1,4]. This allows me to test not only the latency for individual impacts, but also how the system behaves when multiple impacts occur in rapid succession.
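A minimal sketch of the testing-mode mapping (the key-to-sound assignment and the blocking input() loop here are simplified stand-ins for the actual controller logic):

KEY_TO_SOUND = {'w': 1, 'a': 2, 's': 3, 'd': 4}      # hypothetical assignment of keys to the 4 sounds

def run_test_mode(play_sound):
    # Read single-character commands and trigger the mapped sound; 'q' quits testing mode.
    while True:
        key = input("hit (w/a/s/d, q to quit): ").strip().lower()
        if key == 'q':
            break
        index = KEY_TO_SOUND.get(key)
        if index is not None:
            play_sound(index)                        # same playback path the impact handler uses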
Having implemented this new testing setup, I now needed to revise the actual playback function responsible for playing a specific sound when triggered. The implementation from last week worked as follows:
1.) at the start of the session, pre-load the sounds so that the data can easily be referenced and played
2.) when an impact occurs, spawn a new thread that handles the playback of that one sound using the sounddevice library.
The code for the actual playback function looked as follows:

import threading
import sounddevice as sd

playLock = threading.Lock()   # serialize calls into sounddevice

def playDrumSound(index):
    if index in drumSounds:   # drumSounds: {index: (sampleData, sampleRate)}, preloaded at startup
        data, fs = drumSounds[index]
        dataSize = len(data)
        print(f'playing sound {index}')
        # Pick a block size based on the length of the clip
        if dataSize < 6090:
            blockSize = 4096
        elif dataSize < 10000:
            blockSize = 1024
        else:
            blockSize = 256
        with playLock:
            # wasapiIndex: output device index of the laptop's native WASAPI device
            sd.play(data, samplerate=fs, device=wasapiIndex, blocksize=blockSize)

This system was very latent, despite the use of the WASAPI device native to my laptop. Subsequent iterations of the function included utilizing a queue, where each time an impact was detected it was added to the queue and played whenever the system could first get to it. This was, however, a poor idea, since it introduces unpredictability into when the sound actually plays, which we can’t have given that playing the drums is very rhythm heavy. Another idea I implemented but eventually discarded after testing was to use streamed audio. In this implementation, I spawned a thread for each detected impact which would then write the contents of the sound file to an output stream and play it. However, for reasons still unknown to me (I think it was due to how I was cutting the sound data and loading it into the stream), this implementation was not only just as latent, but also massively distorted the sounds when played.
A major part of the issue was that between the delay inherent in playing a sound (simply the amount of time it takes for the sound to play) and the latency associated with starting playback, it was nearly impossible to create an actual rhythm like you would when playing a real drum set. My final implementation, which uses PyAudio, avoids all of these issues by cutting down the playback latency so much that it feels almost instantaneous. The trick was a combination of many of the other implementations I had tried. This is how it works:
1.) At the start of the session we preload each of the sounds so the data and parameters (number of channels, sampling rate, sample width, etc.) are all easily accessible at run time. Additionally, we initialize an audio stream for each of the 4 sounds, so they can each play independently of the other sounds.
2.) During the session, once an impact is detected (a keypress in my case) and the index of the sound to play has been determined, I simply retrieve the sound from our preloaded sounds as well as the associated sound’s open audio stream. I then write the frames of the audio to the stream.
This results in near instantaneous playback. The code for this (both preloading and playback) is shown below:

import wave
import pyaudio

pyaudio_instance = pyaudio.PyAudio()
soundFiles = {1: 'sounds/drum_1.wav', 2: 'sounds/drum_2.wav',
              3: 'sounds/drum_3.wav', 4: 'sounds/drum_4.wav'}
drumSounds = {}    # index -> (raw frames, wave params)
soundStreams = {}  # index -> dedicated PyAudio output stream

def preload_sounds():
    for index, path in soundFiles.items():
        with wave.open(path, 'rb') as wf:
            frames = wf.readframes(wf.getnframes())
            params = wf.getparams()
            drumSounds[index] = (frames, params)
            # One always-open output stream per sound so playback never waits on another sound
            soundStreams[index] = pyaudio_instance.open(
                format=pyaudio_instance.get_format_from_width(params.sampwidth),
                channels=params.nchannels,
                rate=params.framerate,
                output=True,
                frames_per_buffer=256
            )

def playDrumSound(index):
    if index in drumSounds:
        frames, _ = drumSounds[index]
        stream = soundStreams[index]
        # Write the pre-decoded frames straight to the open stream for near-instant playback
        stream.write(frames, exception_on_underflow=False)

Though this took a lot of time to arrive at, I think it was absolutely worth it. We no longer need to worry that the audio playback will prevent us from meeting our 100ms latency requirement, and can instead focus on the object detection modules and the Bluetooth transmission latency. For reference, I attached a sample of how the playback may occur here.

My progress is on schedule this week. In the following week, the main goal will be to integrate Elliot’s Bluetooth code, which also reached a good point this week, into the main controller so we can start triggering sounds via real drumstick impacts as opposed to keyboard events. If that gets done, I’d like to test the code I wrote last week for detecting the (x, y, r) of the 4 rubber rings in real life, now that we have our webcam. This will probably require me to make some adjustments to the parameters of the hough_circles function we are using to identify them.

Elliot’s Status Report for 11/2

I spent this week cleaning up the system’s Bluetooth module, determining the one-way latency of our wireless data transmission, and establishing a consistent threshold for the incoming accelerometer values on the host device.

To obtain latency metrics, I chose to implement a Round Trip Time (RTT) test. The strategy was to take an initial timestamp on the ESP with the system clock, update the server characteristic and notify the client, wait for a response by observing a change in the server entry, and take the time difference. This came with a few minor issues to be resolved: first, I observed that the characteristic updates were inconsistent and the test produced significantly different output values across runs. This was due to the client updating the same buffer as the ESP32 during its response, introducing concurrency issues when the devices attempted to update the characteristic simultaneously. I fixed this by separating transmission and reception into two distinct characteristics, allowing for continuous processing on both sides. Once this was resolved, I noticed that the resulting delay was still too high, around 100ms. After searching online, I came across this article, which states that the default connection interval for the ESP32 ranges from 7.5ms up to as much as 4s: https://docs.espressif.com/projects/esp-idf/en/release-v5.2/esp32c6/api-guides/ble/get-started/ble-connection.html. Having this much variance was unacceptable for our purposes, so I made use of the esp_gap_ble_api library to manually set the maximum connection interval to 20ms. This change greatly reduced the final delay of the test, but having the shorter connection interval means I’ll have to be aware of interference as we integrate a second microcontroller on the 2.4GHz band.

The final value of my testing procedure put our one-way latency at around 40ms, but my belief is that the actual value is even less; this is because of the inherent overhead introduced across the testing code: the operations of looping in the Arduino firmware, polling for the client response, and unpacking data all contribute a nonzero latency to the result. Hence, I tested the implementation qualitatively by manually setting a fixed accelerometer threshold and printing over USB on valid spikes. This test produced favorable results, suggesting that the latency could certainly be under 40ms. I was also able to determine an appropriate threshold value for data processing while doing this, which I concluded to be 10 m/s². This value achieved a reasonable hit detection rate, but we may choose to store multiple thresholds corresponding to different surfaces if the user wishes to play with a uniform actuation force across all surface types. Ultimately, these tests were helpful in our planning towards a low-latency solution, and I believe I’m still on track with the team’s schedule.
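On the host side, the response half of that test is simple: echo every notification back on the second characteristic. A rough sketch, assuming a bleak version recent enough to accept an async notification callback (the address and both characteristic UUIDs below are placeholders):

import asyncio
from bleak import BleakClient

ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                          # placeholder MAC address
TX_CHAR_UUID = "0000aaa1-0000-1000-8000-00805f9b34fb"      # ESP -> laptop notifications (placeholder)
RX_CHAR_UUID = "0000aaa2-0000-1000-8000-00805f9b34fb"      # laptop -> ESP response (placeholder)

async def echo_for_rtt():
    # Echo every notification back on a second characteristic so the ESP can measure round-trip time.
    async with BleakClient(ESP_ADDRESS) as client:
        async def on_notify(_, data: bytearray):
            # Write the payload straight back; the ESP takes both timestamps with its system clock.
            await client.write_gatt_char(RX_CHAR_UUID, data, response=False)

        await client.start_notify(TX_CHAR_UUID, on_notify)
        await asyncio.sleep(30)                            # keep echoing for the duration of the test
        await client.stop_notify(TX_CHAR_UUID)

asyncio.run(echo_for_rtt())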

In this upcoming week, I plan to move my Bluetooth code into the system controller and assist Ben with audio buffer delay. Specifically, I will:

  1. Create a functional controller to detect accelerometer hits and play specified audio files before introducing CV.
  2. Explore ways to minimize audio output latency as much as possible, such as diving into the PyAudio stack, finding a different library, or considering the MIDI controller route suggested to us by Professor Bain.

Team Status Report for 10/26

This week, our team mainly focused on solidifying and integrating our code.

  • Currently, the most significant risks we are facing persist, and they concern Bluetooth and audio latency:

Currently, we are trying to determine a reliable estimate for the one-way Bluetooth latency, which would help us massively in determining how much leeway we have with the other components (CV processing, audio generation, etc.). This is being done by first sending data to the laptop using an update from the ESP, then sending an update back using a handler in the host code. The one-way latency would then be half of the elapsed time measured from start to finish. However, this process is not as simple as it sounds in practice, as having a shared buffer accessed by both the server (ESP32) and the client (laptop) introduces issues with latency and concurrency. This issue is being managed, however, as we still have time blocked out in our Gantt chart to work on data transmission. In a worst-case scenario, we would have to rely on direct/physical wiring rather than Bluetooth, but we believe this will not be necessary and that we just need a bit more time to adjust our approach.

Audio latency is also proving to be a slight issue, as we are having problems with overlapping sounds. In theory, each drumstick’s hit on a drum pad should generate a sound individually rather than waiting for another sound to finish. However, we are currently experiencing the opposite, where drum sounds wait for one another to finish despite a thread being spawned for each. If not fixed, this situation could introduce considerable latency into the response of our product. However, this is a relatively new issue, so we strongly believe it can be fixed within a relatively short amount of time once we reason about its cause together.

  • No changes were made to the existing design of our product. At the moment, we are mainly focused on trying to create solid implementations of each component in order to integrate & properly test them as soon as possible.
  • We have also not made any changes to our schedule, and are mostly on track.

 

Belle’s Status Report for 10/26

This week, I mainly worked on implementing the actual drumstick detection code and integrating it with the current code that Ben wrote.

The code Ben wrote calculates the x-value, y-value, and radius of each drum ring at the beginning of the program and stores them in a list so that they don’t have to be recalculated in the future. I then pass this list into a function that in turn calls another short function, which calculates the exponentially weighted moving average of the most recent 20 video frames.
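Roughly, that short function walks the circular buffer from the least recent frame to the most recent and weights newer frames more heavily. The sketch below assumes the globals buffer, bufIndex, and bufSize that I describe next, and the growth factor is illustrative:

bufSize = 20
buffer = [-1] * bufSize          # per-frame drum predictions (0-3, or -1 for "no drum")
bufIndex = 0                     # index of the most recent prediction

def weighted_drum_prediction():
    # Loop around the circular buffer in order, accumulating a growing weight per predicted drum.
    start = (bufIndex + 1) % bufSize         # the current least recent frame
    totals, weight = {}, 1.0
    for i in range(bufSize):
        pad = buffer[(start + i) % bufSize]
        if pad != -1:
            totals[pad] = totals.get(pad, 0.0) + weight
        weight *= 1.5                        # illustrative factor; newer frames count more
    return max(totals, key=totals.get) if totals else -1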

It may seem strange, but accessing the buffer in this way is optimal because of the way I am putting data into it:
Currently, with every video frame read by OpenCV’s video capture module, I first determine which drumstick’s thread is accessing the function using threading.current_thread().name, to decide whether to apply a red or green mask to the image, as the threads are named “red” and “green” (respectively) when spawned. I then use findContours() to acquire the detected x and y location values of said drumstick. Afterwards, I pass these x and y values to another function that uses the aforementioned drum ring location list to determine which bounding circle the drumstick tip is in. This returns a number between 0 and 3 (inclusive) if the tip is detected within the bounds of a corresponding drum 0-3, and -1 otherwise. Finally, this number is put into the buffer at index bufIndex, which is a global variable updated using the formula bufIndex = (bufIndex + 1) % bufSize, maintaining the circular aspect of the buffer. A rough sketch of this per-frame step is shown below.
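The sketch below assumes illustrative HSV ranges (the real ranges still need to be tuned once we test with the webcam) and a hypothetical function name:

import threading
import cv2
import numpy as np

# Illustrative HSV ranges; the real values still need to be tuned with the webcam.
COLOR_RANGES = {
    "red":   (np.array([0, 120, 120]), np.array([10, 255, 255])),
    "green": (np.array([40, 80, 80]),  np.array([80, 255, 255])),
}

def predict_drum(frame, ringList):
    # Mask the frame for this thread's stick color, locate the tip with findContours, and map it
    # to a drum index 0-3 (or -1 if it is outside every ring). The caller then stores the result
    # at buffer[bufIndex] and advances bufIndex as described above.
    color = threading.current_thread().name          # threads are named "red" / "green" when spawned
    lower, upper = COLOR_RANGES[color]
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return -1
    (x, y), _ = cv2.minEnclosingCircle(max(contours, key=cv2.contourArea))
    for i, (cx, cy, r) in enumerate(ringList):        # ringList: per-ring (x, y, r) from Ben's code
        if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:
            return i
    return -1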

As a result, it is highly possible for the ring detection yielded for a more recent video frame to be put at a lower index than older ones. Thus, we start at the current value of (bufIndex + 1) % bufSize (which should be the current least recent frame), and loop around the buffer in-order to apply the formula.
I am also using this approach because I am trying to figure out if there is any significant difference between calculating the drum location of each video frame as it is read and then putting that value into our buffer, and putting each frame into the buffer as it is read, then determining the drum location afterwards. I have both situations implemented, and plan to test how long each takes in the near future in order to reduce latency as much as possible.
Currently, based on our Gantt chart, we should be finishing the general drum ring/entry-into-drum-ring detection code, as well as starting to determine the accelerometer “spike” threshold values. Therefore, I believe I am somewhat on track, since we could (hypothetically) plug the actual red/green drumstick tip color range values into the code I wrote, connect the webcam, and be able to detect whether a stick is within the bounds of a drum ring. I do have to put more time into testing for accelerometer spike values, however, as soon as we are able to transmit the data.
Therefore, next week, I plan to start testing with the accelerometer to determine what an acceptable “threshold value” is for a spike/hit. This would be done by essentially hitting either the air, a drum pad, and/or the table with a machined drumstick, and observing how the output data changes over time.

Elliot’s Status Report for 10/26

My past week was spent working with the MPU6050 chips and cleaning up the Bluetooth module in preparation for integration with the system controller. My goals were to collect data from the 3-axis accelerometers and set up the client notifications to minimize latency. I first soldered the accelerometers and connected the serial clock, data, power, and GND wires, then used the Adafruit MPU6050 libraries to update the firmware. I used the getEvent function to return sensors_event_t data types and also utilized built-in macros to define configuration parameters such as the accelerometer output range, all of which I found from this resource. I packed the three axes of data into one characteristic, and unpacked the content from the server on the client side accordingly. I attached the accelerometer to the drumstick with the Y-axis parallel to the stick, and so I averaged the absolute measurements of the X and Z sensors to achieve a desirable output magnitude.
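On the client side, the unpacking and averaging step can be sketched as follows; this assumes the three axes are packed as little-endian 32-bit floats, which may not match the exact packing in the firmware:

import struct

def acceleration_magnitude(payload: bytes) -> float:
    # Unpack the three packed axes and average |X| and |Z| (the Y-axis lies along the stick).
    ax, ay, az = struct.unpack("<fff", payload)
    return (abs(ax) + abs(az)) / 2.0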

One of the issues I ran into last week was the persistence of the connection, in which the ESP disconnected from the client and was not able to reestablish. I fixed this by adjusting the callback functions to restart advertising automatically following any disconnection. Another potential concern was that I accessed the GATT characteristic by polling manually from the client side, which could add time to our final latency and block relevant processing. If we plan to play our final product at a moderate speed, asynchronous notifications will be required while we evaluate frames for previous hits. Developing the notify behavior brought up a problem, however, namely in the permissions of the BLE service. When I ran start_notify on my laptop, I observed a runtime error saying the attribute could not be written–I eventually realized it was because I had chosen standardized service and characteristic UUIDs with predetermined permission flags. By creating custom UUIDs, I was able to enable notify behavior manually as well as write directly to the characteristic from my laptop.

The write permission I described above is also relevant for the RTT testing I’m currently working on. My strategy is to notify the laptop using an update from the ESP, use a handler in the host code to send an update back, and derive the timestamp offset from start to finish. This, however, is taking longer than expected to achieve an accurate estimate, because having the client access the same buffer as the server introduces concurrency and extraneous latency factors.

I believe I’ve caught up in terms of progress, but I’m aware that the bulk of our team’s difficulty is still ahead in bringing down output delay once we have a functional system. My plan for this upcoming week is to:

  1. Establish a reliable measurement for the one-way latency of our Bluetooth
  2. Begin integrating BLE code with the other modules
  3. Work on the system controller to make more progress towards a testable solution

Ben Solo’s Status Report for 10/26

This week I spent the majority of my time working on the function “locate_drum_rings”, which is triggered via the webapp and initiates the process of finding the (x, y) locations of the drum rings as well as their radii. This involved developing test images/videos (.mp4), implementing the actual function itself, choosing to use scikit-image over cv2’s hough_circles, tuning the parameters to ensure the correct circles are selected, and testing the function in tandem with the webapp. In addition to implementing and testing this function, I made a few more minor improvements to the other endpoint on our local server, “receive_drum_config”, which had an error in its logic regarding saving received sound files to the ‘sounds’ directory. Finally, I changed how the central controller I described in my last status report works to accommodate 2 drumsticks in independent threads. I’ll explain each of these topics in more detail below:

Implementing the “locate_drum_rings” function.
This function is used at the start of every session, or when the user wants to change the layout of their drum set, in order to detect, scale, and store the (x, y) locations and radii of each of the 4 drum rings. It is triggered by the “locate_drum_rings” endpoint on the local server when it receives a signal from the webapp, as follows:

from flask import Flask, jsonify
from cv_module import locate_drum_rings

app = Flask(__name__)

@app.route('/locate-drum-rings', methods=['POST'])
def locate_drum_rings_endpoint():  # named differently so it doesn't shadow the imported CV function
    # Call the function to detect the drum rings here
    print("Trigger received. Starting location detection process.")
    locate_drum_rings()
    return jsonify({'message': 'Trigger received.'}), 200

When locate_drum_rings() is called here, it starts the process of finding the centers and radii of each of the 4 rings in the first frame of the video feed. For testing purposes I generated a sample video with 4 rings as follows:

1.) In MATLAB, I drew 4 rings with the radii of the actual rings we plan on using (8.89, 7.62, 10.16, and 11.43 cm) at 4 different, non-overlapping locations.

2.) I then took this image and created a 6 second mp4 video clip of the image to simulate what the camera feed would look like in practice.

Then during testing, where I pass testing=True to the function, the code references the video as opposed to the default webcam. One pretty significant change, however, was that I decided not to use cv2’s Hough circles algorithm and instead use scikit-image’s Hough circles algorithm, predominantly because it is much easier to narrow the detected rings down to 4; with cv2’s it became very difficult to do so accurately and with varying radii (which we will encounter due to varying camera heights). The function itself opens the video and selects the first frame, as this is all that is needed to determine the locations of the drum rings. It then masks the frame and identifies all circles it sees as present (it typically detects circles that aren’t actually there too, hence the need for tuning). Then I use the “hough_circle_peaks” function to specify that it should only retrieve the 4 circles with the strongest responses. Additionally, I specify a minimum distance between 2 detected circles in the filtering process (a sketch of this detection step follows the list below), which serves 2 purposes:

1.) to ensure that duplicate circles aren’t detected.

2.) To prevent the algorithm from detecting 2 circles per ring: 1 for the inner and 1 for the outer radius of the rubber rings.
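A minimal sketch of this detection and filtering step is below; the radius bounds, minimum distances, and sigma value are illustrative placeholders rather than the tuned values:

import cv2
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_circle, hough_circle_peaks

def find_drum_rings(frame, min_r=120, max_r=260, min_dist=150, sigma=2.0):
    # Detect the 4 strongest circles in a single frame of the feed.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = canny(gray, sigma=sigma)                           # sigma tunes the edge detection sensitivity
    radii = np.arange(min_r, max_r, 2)                         # candidate radii to search over
    hspaces = hough_circle(edges, radii)
    _, cx, cy, r = hough_circle_peaks(
        hspaces, radii,
        total_num_peaks=4,                                     # keep only the 4 strongest circles
        min_xdistance=min_dist, min_ydistance=min_dist)        # suppress duplicates and inner/outer edges
    return {i + 1: {'x': float(x), 'y': float(y), 'r': float(rad)}
            for i, (x, y, rad) in enumerate(zip(cx, cy, r))}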

Once these 4 circles are found, I then scale them based on the ratios outlined in the design report to add the equivalent of a 30mm margin around each ring. For testing purposes I then draw the detected rings and scaled rings back on the frame and display the image. The results are shown below:

The process involved tuning the values for the minimum/maximum radii, the minimum distance between detected circles, and the sigma value for the edge detection sensitivity. The results of the function are stored in a shared variable “detectedRings” which is of the form:

{1: {'x': 980.0, 'y': 687.0, 'r': 166.8}, 2: {'x': 658.0, 'y': 848.0, 'r': 194.18}, 3: {'x': 819.0, 'y': 365.0, 'r': 214.14}, 4: {'x': 1220.0, 'y': 287.0, 'r': 231.84}}

where the key indicates which drum the values correspond to.

Fixing the file storage in the “receive_drum_config” endpoint:
When we receive a drum set configuration, we always store the sound files in the ‘sounds’ directory under the names “drum_{i}.wav”, where i is an index 1-4 (corresponding to the drums). The issue, however, was that when we received a new drum configuration, we were just adding 4 more files with the same names to the directory, which is incorrect because a.) there should only ever be 4 sounds in the local directory at any given time, and b.) this would cause confusion when trying to reference a given sound as a result of duplicate names. To resolve this, whenever we receive a new configuration I first clear all files from the sounds directory before adding the new sound files. This was a relatively simple, but crucial, fix for the functionality of the endpoint.
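A minimal sketch of the fix, assuming Flask-style uploaded file objects and a hypothetical index-to-file mapping:

import os

SOUNDS_DIR = "sounds"

def save_drum_config(files):
    # Wipe the previous configuration so the directory only ever holds the 4 current sounds,
    # then save the new files under their per-drum names.
    os.makedirs(SOUNDS_DIR, exist_ok=True)
    for name in os.listdir(SOUNDS_DIR):
        os.remove(os.path.join(SOUNDS_DIR, name))
    for i, f in files.items():                                 # files: {index 1-4: uploaded file object}
        f.save(os.path.join(SOUNDS_DIR, f"drum_{i}.wav"))      # FileStorage-style .save()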

Updates to the central controller:
Given that the controller needs to monitor accelerometer data for 2 drumsticks independently, we need to run 2 concurrent instances of the controller module. I changed the controller.py file to do exactly this: spawn 2 threads running the controller code, each with a different color parameter of either red or green. These colors represent the colors of the drumstick tips in our project and will be used to apply a mask during the object tracking/detection process during the playing session. Additionally, for testing purposes I added a variation without the threading implementation so we can run tests on one independent drumstick.

Overall, this week was successful and I stayed on track with the schedule. In the coming week I plan on helping Elliot integrate his BLE code into the controller so that we can start testing with an integrated system. I also plan on working on optimizing the latency of the audio playback module more, since while it’s not horrible, it could definitely be a bit better. I think utilizing some sort of mixing library may be the solution here, since part of the delay we’re facing now is due to the duration of a sound limiting how fast we can play subsequent sounds.

Elliot’s Status Report for 10/19

For this week’s tasks, I put my efforts towards developing the client and server code to transmit accelerometer data over BLE. The firmware for this project’s microcontrollers will be oriented around an Arduino-based framework, providing us access to abstracted libraries for I2C and serial debugging as well as a straightforward IDE to compile, flash, and monitor our code. Because I prefer to work with VS Code over the native Arduino platform, I used a third-party extension, PlatformIO, for embedded development on Arduino boards.

I first set up a C++ file to represent the server initialization onboard the ESP32. The code is structured with the standard Arduino setup and loop, with added callback functions declared using the BLEServer library to handle connection status. In initialization, I set the serial baud rate to the UART standard of 115200 in order to allow USB communication to an output monitor. Using this output, I was able to find the MAC address of the microcontroller by printing it with a method from the BLEDevice library. I found that typecasting between the Arduino String type, the C++ std::string, and the C char array was a bit convoluted, which is something I will keep in mind in case we decide to append timestamps with the ESPs rather than the host controller. I then created the generic service for accelerometer data and added a characteristic to store the intended floating point value–the UUIDs used in these two operations were defined globally and found from sections 3.4 and 3.8 of the SIG group’s official identifiers found here:  https://www.bluetooth.com/wp-content/uploads/Files/Specification/HTML/Assigned_Numbers/out/en/Assigned_Numbers.pdf?v=1729378675069. The board then starts advertising and loops on the condition of a valid connection.

Output of MAC address and successful init

I also created the client-side code, which connects to the device address using bleak. For this stage of development, my goal was simply to get some form of communication between the two devices, so I opted for a simple polling loop with the asyncio library. I did this by reading straight from the GATT characteristic and unpacking the bytes into a comprehensible float. For future improvements to latency, I plan to have the server notify the host controller as opposed to the current blocking behavior. For testing my current program, the loop in the flashed code sets the characteristic to an arbitrary value and increments it at a one second interval, which the client then reads directly; a rough sketch of this polling loop is shown below.
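The sketch assumes the characteristic holds a little-endian 32-bit float; the device address and characteristic UUID here are placeholders:

import asyncio
import struct
from bleak import BleakClient

ESP_ADDRESS = "AA:BB:CC:DD:EE:FF"                             # placeholder for the MAC printed over serial
ACCEL_CHAR_UUID = "00002713-0000-1000-8000-00805f9b34fb"      # placeholder characteristic UUID

async def poll_accelerometer():
    # Simple polling loop: read the GATT characteristic and unpack the bytes into a float.
    async with BleakClient(ESP_ADDRESS) as client:
        while True:
            raw = await client.read_gatt_char(ACCEL_CHAR_UUID)
            value = struct.unpack("<f", raw[:4])[0]           # assumes a little-endian 32-bit float
            print(f"accelerometer value: {value:.2f}")
            await asyncio.sleep(1.0)                          # matches the 1 s increment in the test firmware

asyncio.run(poll_accelerometer())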

Output of fake data over BLE

This code is a good step forward, but I am a bit behind currently, considering I have not yet soldered the accelerometers to the ESP boards. Moving into the next week, my goal is to lower the latency as much as possible and start incorporating the MPU6050s to get a better idea of what the overall output lag will be. Specifically, this week I will:

  1. Clean up the Bluetooth program to make it modular for integration with Ben and Belle’s work while also ensuring versatility across different devices.
  2. Make an RTT test to get a baseline delay metric for the BLE module.
  3. Connect accelerometers to the ESP32s and start collecting data.
  4. Work on the system controller and the multithreaded code. There will likely be concurrency conflicts even with the system broken into separate threads, meaning that getting an estimate of the delay is our most important objective.