Ben Solo’s Status Report for 9/28

This week I focused predominantly on setting up a working (locally hosted) webapp with integrated backend and database, in addition to a local server that is capable of receiving and storing drum set configurations. I will explain the functionality of both below.

The webapp (image below):
The webapp allows the user to completely customize their drum set. First, they can upload any sound files they want (.mp3, .mp4, or .wav). These sound files are stored both in the MySQL database and in a server-side upload directory. The database holds metadata about each sound file, such as the user it corresponds to, the file name, the name the user gave the file, and the static URL used to fetch the actual audio data from the upload directory. The user can then construct a custom drum set either by searching for sound files they’ve uploaded and dragging them onto the drum ring they want to play that sound, or by searching for a saved drum set. Once they have a drum set they like, they can save it so they can quickly switch back to sets they previously built and liked, or click the “Use this drum set” button, which sends the current drum set configuration to the locally running user server. The webapp supports quick searching of sounds and saved drum sets and auto-populates the drum set display when the user chooses a specific saved set. The app currently runs on localhost:5000 but will eventually be deployed and hosted remotely.
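To make the upload flow concrete, here is a rough sketch of what the upload route could look like. The route name, folder path, and the commented-out Sound model are illustrative placeholders, not the actual implementation.

```python
# Hypothetical sketch of the sound upload flow (names are illustrative).
import os
from flask import Flask, request, url_for
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_FOLDER = "static/uploads"
ALLOWED_EXTENSIONS = {".mp3", ".mp4", ".wav"}

@app.route("/upload", methods=["POST"])
def upload_sound():
    f = request.files["file"]
    ext = os.path.splitext(f.filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return {"error": "unsupported file type"}, 400
    os.makedirs(UPLOAD_FOLDER, exist_ok=True)
    filename = secure_filename(f.filename)
    f.save(os.path.join(UPLOAD_FOLDER, filename))
    # Metadata (owner, file name, user-given name, static URL) goes into MySQL;
    # the actual schema/ORM model is project-specific.
    static_url = url_for("static", filename=f"uploads/{filename}")
    # db.session.add(Sound(user_id=..., filename=filename,
    #                      display_name=request.form["name"], url=static_url))
    # db.session.commit()
    return {"url": static_url}, 201
```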

The local server:
Though currently very basic, the local server is configured to run on the user’s port 8000. This is important because it defines where the webapp should send drum set configurations. The endpoint is CORS-enabled so that data can be sent safely across origins. Once a drum set configuration is received, the endpoint saves each sound in the configuration under a new file name of the form “drum_x.{file extension}”, where x is the index of the drum the sound corresponds to and {file extension} is .mp3, .mp4, or .wav. These files are downloaded into a local directory called sounds, which is created if it doesn’t already exist. This allows for very simple playback using a library like pygame.
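A minimal sketch of the receiving endpoint is shown below; the route name, JSON fields, and download approach are assumptions for illustration rather than the exact code.

```python
# Rough sketch of the local receiver (port, route, and field names are assumptions).
import os
import requests
from flask import Flask, request
from flask_cors import CORS  # pip install flask-cors

app = Flask(__name__)
CORS(app)  # allow the webapp (a different origin) to POST here
SOUNDS_DIR = "sounds"

@app.route("/configure", methods=["POST"])
def receive_configuration():
    os.makedirs(SOUNDS_DIR, exist_ok=True)  # create sounds/ if it doesn't exist
    config = request.get_json()  # assumed shape: {"drums": [{"drum": 0, "url": "...", "ext": ".wav"}, ...]}
    for entry in config["drums"]:
        audio = requests.get(entry["url"]).content  # fetch the file from the webapp
        path = os.path.join(SOUNDS_DIR, f"drum_{entry['drum']}{entry['ext']}")
        with open(path, "wb") as out:
            out.write(audio)
    return {"status": "ok"}

if __name__ == "__main__":
    app.run(port=8000)
```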

In addition to these two components, I worked on the design presentation slides and helped develop some new use case requirements, on which we based our design requirements. Namely, I came up with the requirement that the machined drumsticks weigh under 200g because, as a drummer myself, I know that increasing the weight much beyond that of a standard drumstick (~113g) would make playing DrumLite feel awkward and heavy. Furthermore, we developed the use case requirement that 95% of the time a drum ring is hit, the correct sound plays. This comes down to being confident that we detected the correct drum from the video footage, which can be difficult. To do this, we came up with the design requirement of applying an exponential weighting over the predicted drumstick locations across frames. By applying a higher weight to the more recent frames of video, we think we can reach a higher level of confidence about which drum was actually hit, and subsequently play the correct sound. This is a standard practice for problems where multiple frames need to be analyzed accurately in a short amount of time.
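As a sketch of the idea (the final scheme may differ), the per-frame drum predictions could be combined with exponentially decaying weights, using the α = 0.8 value from our design requirements:

```python
# Sketch of exponentially weighted voting over per-frame drum predictions.
# predictions is ordered oldest -> newest; alpha = 0.8 per our design requirement.
def weighted_drum_vote(predictions, alpha=0.8):
    scores = {}
    n = len(predictions)
    for i, drum_id in enumerate(predictions):
        weight = alpha ** (n - 1 - i)  # most recent frame gets weight 1.0
        scores[drum_id] = scores.get(drum_id, 0.0) + weight
    return max(scores, key=scores.get)

# Example: the more recent frames favored drum 2, so it wins the vote.
print(weighted_drum_vote([1, 1, 2, 2, 2]))  # -> 2
```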

Lastly, I worked on a new diagram (also shown below) of how the basic workflow would look. It serves mainly as a graphic for the design presentation, but it conveys a good amount of information about how data flows while using DrumLite. My contributions to the project are coming along well and are on schedule. The plan was to get the webapp and local server working quickly so that we’d have a code base to integrate with once our parts are delivered and we can start building out the code needed for image and accelerometer data processing.

In the next week my main focus will be creating a way to trigger a sequence of locally stored sounds within the local code base. I want to build a preliminary interface where a user inputs a sequence of drum IDs and delays and the corresponding sounds are played sequentially. This interface will be useful because, once the accelerometer data processing and computer vision modules are working, we’d extract drum IDs from the video feed and pass them into the sound-playing interface in the same way as in the simulation described above. The times at which to play the sounds (represented by delays in the simulation) would come from the accelerometer readings. Additionally, while it may be a reach task, I’d like to find a way to take the object detection script Belle wrote and use it to trigger the sound-playing module mentioned above.
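A first pass at this interface might look something like the sketch below. It assumes the drum_x file naming convention from the local server and WAV playback (pygame.mixer.Sound reliably loads WAV/OGG; MP3 files may need pygame.mixer.music instead).

```python
# Preliminary sketch of the playback interface: given a list of
# (drum_id, delay_in_seconds) pairs, play the matching sound from sounds/.
import glob
import time
import pygame

pygame.mixer.init()

def play_sequence(sequence, sounds_dir="sounds"):
    for drum_id, delay in sequence:
        time.sleep(delay)
        matches = glob.glob(f"{sounds_dir}/drum_{drum_id}.*")
        if matches:
            pygame.mixer.Sound(matches[0]).play()  # non-blocking playback

# Example: play drum 0, then drum 2 half a second later, then drum 1.
play_sequence([(0, 0.0), (2, 0.5), (1, 0.25)])
```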

(link to image here)


(link to flow chart here)

Team Status Report for 9/28

This week our team focused heavily on preparing for the design presentation. This meant not only continuing to build up our ideas and concepts for DrumLite, but also starting to develop some base code to be used for initial experimentation and proof of concept, as well as for testing purposes down the line. We initially struggled with coming up with design requirements based on our use case requirements. We couldn’t really understand the difference between the two at first, but came to the conclusion that while use case requirements are somewhat black-boxed and focus on how the product needs to behave/function, design requirements should focus on what needs to be met implementation-wise in order to achieve the use case requirements.

We then developed our design requirements and directly related them to our use case requirements. In doing so, we also added 3 new use case requirements, indicated with * below. These were as follows:
1.) (use case) Drum ring detection accuracy within ≤30 mm of actual placement.
(design) Dynamically scale the detected pixel radii to match the actual ring diameters + 30 mm of margin.

2.) (use case) Audio response within ≤100 ms of drumstick impact.
(design) BLE <30 ms, accelerometer sampling at 1000 Hz, CV processing under 60 ms (~15 fps).

3.) (use case) Minimum layout area of 1295 cm² (37 cm x 35 cm).
(design) For any given frame, the OpenCV algorithm will perform a proximity check to ensure the correct choice across a set of adjacent rings.

4.) (use case*) Play the sound of the correct drum ≥95% of the time.
(design) Average stick location across all processed frames, with exponential weighting α = 0.8 on processed frames.

5.) (use case*) BLE transmission reliability within 10 ft of the laptop.
(design) ≤2% BLE packet loss within a 10 ft radius of the laptop.

6.) (use case*) Machined drumsticks under 200 g in weight.
(design) ESP32: 31 g, MPU-6050: 2 g, drumstick: 113.4 g -> ensure connective components are below 53.6 g.

Individually, we each focused on our own subdivisions of the project in order to either establish some groundwork or better understand the technologies we’re working with.

Ben: Worked on creating a functioning, locally hosted webapp with an integrated UI that allows for drum set configuration/customization, sound file and drum set configuration storage (MySQL), and an endpoint that interacts with a user’s locally running server. He also got a local server working (also using Flask) to receive and locally store sound files for quick playback. Both parts are fully operational and successfully communicate with one another. The next tasks will be to integrate the stored sound files with the other code for image and accelerometer processing that also runs locally. Finally, the webapp will need to be deployed and hosted. Risks for this component of the project center around how deploying the webapp will affect its ability to communicate with the local server, as currently both run on localhost.

Belle: Focused on developing a testing setup for our object detection system. To determine the average amount of time required for OpenCV object detection to identify the location of a drumstick tip, Belle created a MATLAB script in which she drew 4 rings of varying size and moved a red dot (simulating the tip of the drumstick) around them. We will use a screen capture of this animation to determine a.) the amount of time it takes to process a frame, and b.) subsequently the number of frames we can afford to pass to the object detection module after detecting a hit. Currently, she has a Python script using OpenCV that successfully identifies the location of the dot as it moves around within the 4 rings. The next step is to come up with timing metrics for object detection per frame.
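One possible way to collect those per-frame timing metrics is sketched below; the capture filename and Hough parameters are placeholders, not the values Belle is using.

```python
# Minimal sketch for timing the detection step on each frame of the capture.
import time
import cv2

cap = cv2.VideoCapture("dot_simulation_capture.mp4")  # placeholder filename
times = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    start = time.perf_counter()
    cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=50,
                     param1=100, param2=30, minRadius=3, maxRadius=30)
    times.append(time.perf_counter() - start)
cap.release()
if times:
    print(f"avg detection time: {1000 * sum(times) / len(times):.2f} ms/frame")
```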

Elliot: Elliot spent much of his time preparing for the design presentation, fleshing out our ideas fully and figuring out how best to explain them. In addition, he worked out how we will communicate with the MCU and accelerometer mounted on the drumsticks via Python. He also looked into BluetoothSerial for interfacing with the ESP32 and confirmed that we can use BLE for sending the accelerometer data and USB for flashing the microcontroller. Finally, he identified a BLE simulator which we plan to use both for testing purposes and for preliminary research. This simulator accurately simulates how the ESP32 will behave, including latency, packet loss, and so forth.
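As a rough illustration of the laptop-side Python communication, a BLE listener might look like the sketch below; the bleak library, device address, and characteristic UUID are assumptions rather than finalized choices.

```python
# Hedged sketch of a laptop-side BLE listener using the bleak library.
import asyncio
from bleak import BleakClient

ESP32_ADDRESS = "AA:BB:CC:DD:EE:FF"                        # placeholder MAC address
ACCEL_CHAR_UUID = "0000ffe1-0000-1000-8000-00805f9b34fb"   # placeholder UUID

def handle_accel(_, data: bytearray):
    # Each notification would carry a packed accelerometer sample + timestamp.
    print("received", data.hex())

async def main():
    async with BleakClient(ESP32_ADDRESS) as client:
        await client.start_notify(ACCEL_CHAR_UUID, handle_accel)
        await asyncio.sleep(30)  # listen for 30 seconds
        await client.stop_notify(ACCEL_CHAR_UUID)

asyncio.run(main())
```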

In regard to the questions pertaining to the safety, social, and economic factors of our project, our responses are below (A was written by Belle, B by Elliot, C by Ben).

Part A: Our project takes both public health and safety into consideration, particularly with regard to the cost and hazards of traditional drum sets. By focusing on mobility, the design enables users to play anywhere as long as they have a surface for the drum rings – effectively removing the space limitations often encountered with standard drum sets, and offering a more affordable alternative that lowers financial barriers to musical engagement. This flexibility empowers individuals to engage with their music in diverse environments, fostering a sense of freedom and creativity without having to worry about transporting heavy equipment or space constraints. Additionally, the lightweight, compact nature of the rings ensures that users can play without concerns of drums falling, causing injury, or damaging surrounding objects. This design significantly enhances user safety and well-being, and promotes an experience where physical well-being and ease of use are key.

Part B: Our CV-based drum set makes music creation more accessible and inclusive. This project caters to individuals who may not have access to physical drum sets due to space constraints, enabling them to engage in music activities without needing traditional instruments. The solution promotes the importance of music as a means of social connection, self-expression, and well-being. The project also helps foster inclusivity by being adaptable to different sound sensitivities, as you can adjust the sound played from each drum. By reducing the barrier to entry for using drum equipment, we aim to introduce music creation to new audiences.

Part C: As stated in our use case, drum sets, whether electronic or acoustic, are very expensive (easily upwards of $400). This greatly limits the number of people who are able to engage in playing the drums, simply because of the high cost barrier. We have calculated the net cost of our product, which sits right around $150, nearly a quarter of the price of what an already cheap drum set would cost. The reason for our low cost is that the components of a physical drum set are much more expensive. Between a large metal frame, actual drums/drum pads, custom speakers, and a brain to control volume and customization, the cost of an electronic drum set skyrockets. Our project leverages the fact that we don’t need drum pads, sensors, a frame, or a brain to work; it just needs our machined sticks, a webcam, and access to the webapp. This provides access to drum sets to many individuals who would otherwise not be able to play the drums.

We are currently on schedule and hope to receive our ordered parts soon so we can start testing/experimenting with the actual project components as opposed to the various simulators we’ve been using thus far. Below are a few images showing some of the progress we’ve made this week:

(The webapp UI –>  link to view image here)

(The dot simulation)

Elliot’s Status Report for 9/28

This week, I was tasked with establishing a basis for communicating with the ESP32 boards and their corresponding MPU-6050 accelerometers. I spent some time looking into the BLE stack to determine the complexity needed for our GATT services and characteristics; the available options were the native ESP-IDF (Espressif IoT Development Framework), the Arduino IDE, and a MicroPython-based firmware. I concluded that while the ESP-IDF would give us the most control over the pipeline we implement, our main purpose is simply to transmit the accelerometer data and its timestamp, so the service complexity does not call for any fine-tuning. Between the Arduino framework and MicroPython, it would be best to use a compiled language rather than an interpreted one for the sake of lower latency. Therefore, I started developing some of the C++ code we’ll eventually flash to our microcontrollers; to test functionality, I worked on an ESP simulator on Wokwi to set up a Bluetooth connection and send accelerometer data to notified clients. Some libraries necessary for the Arduino framework include Wire.h for I2C, BLEUtils for initializing the advertising and notifications, and the MPU6050 device driver.

I also practiced my presentation approach for Monday, where I’ll be talking about how our Bluetooth, computer vision, and web server modules interact. I spent time identifying why we chose the components we did, as well as developing concrete requirements that link back to our product use case requirements.

For next week, my plan is to:

  1. Hopefully receive our hardware and begin testing the accelerometer thresholds. I’ll set up communication over Bluetooth to relay data, and based on our drum hits, we’ll then look at what signal spikes indicate an adequate hit.
  2. Test Bluetooth latency and packet loss. Once we have the ESP32s wired up, I can insert packet misses to determine adequate rates, as well as measure the latency of our transmissions with timestamps (a rough sketch of this analysis is shown below). This is especially important to us given that we’ll be using two transmitting devices in a low-latency environment.
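The sketch below shows one way we might compute packet loss and latency from logged notifications. It assumes each packet carries a sequence number and a sender timestamp in milliseconds, and that the two clocks are synchronized or offset-corrected; both are simplifying assumptions, not the final test plan.

```python
# Rough sketch of post-processing for the BLE latency / packet-loss test.
def summarize_link(packets):
    """packets: list of (seq, sent_ms, received_ms) tuples, in arrival order."""
    expected = packets[-1][0] - packets[0][0] + 1        # packets that should have arrived
    loss_rate = 1 - len(packets) / expected
    latencies = [rx - tx for _, tx, rx in packets]       # assumes synchronized clocks
    return loss_rate, sum(latencies) / len(latencies)

# Example: sequence number 2 was dropped, so loss is 25%.
loss, avg_latency = summarize_link([(0, 1000, 1018), (1, 1010, 1027), (3, 1030, 1049)])
print(f"packet loss: {loss:.1%}, avg latency: {avg_latency:.1f} ms")
```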

Belle’s Status Report for 9/28

This past week, my time was mainly spent on creating a test to mimic a simplified version of our project. In MATLAB, I made a short video of a small red dot moving in a somewhat-square path over 4 colored rings (a still frame of the video is shown below, as I am not sure how to upload a gif here).

still frame of the moving dot animation

 

This is supposed to vaguely emulate the behavior of the tip of a drumstick (which we plan to paint red or some other bright color) moving over the drum rings. It is not exact, but the main goal was to just make the dot move around so that I could figure out how to detect it using CV later on. I also made the proportions approximately equal to the real drum rings we will be using.

Then, in VSCode, I wrote a short program using HoughCircles and other NumPy and OpenCV functions to read in and process the video, then output a version where the red dot is detected in every frame. Said “detection” is indicated by drawing a small neon blue dot over the targeted red one. One can also pause the video by pressing the spacebar to step through and analyze a given frame, or press ‘q’ to close/force-quit the output window.
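For context, a simplified sketch of what this kind of detect-and-overlay loop generally looks like is below. This is not the actual script; the color thresholds, Hough parameters, and file name are placeholders.

```python
# Simplified sketch of a detect-and-overlay loop with pause/quit controls.
import cv2
import numpy as np

cap = cv2.VideoCapture("dot_simulation.mp4")   # placeholder filename
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 120, 120), (10, 255, 255))   # isolate the red dot
    blurred = cv2.GaussianBlur(mask, (9, 9), 2)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1.2, minDist=50,
                               param1=100, param2=15, minRadius=3, maxRadius=30)
    if circles is not None:
        x, y, _ = np.uint16(np.around(circles))[0][0]
        cv2.circle(frame, (int(x), int(y)), 4, (255, 0, 0), -1)   # blue marker dot
    cv2.imshow("detection", frame)
    key = cv2.waitKey(30) & 0xFF
    if key == ord(" "):       # spacebar pauses until the next keypress
        cv2.waitKey(0)
    elif key == ord("q"):     # 'q' closes the window
        break
cap.release()
cv2.destroyAllWindows()
```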

Since the main task for this past week was to work on the computer vision code to detect rings, I would say that I am on track.

In the next week, I would like to measure how long it takes for the red dot to actually be detected in each frame, which will give us a better idea of what latency ranges to expect when processing the live video feed from the camera in the real-world implementation. I also want to get started on the sliding window that will hold a preset number of the most recent frames from the live video feed (a rough sketch is shown below). Eventually, locating the drumstick tip in each of these frames will help determine which drum sound to play when an accelerometer spike is detected (by making a hit-like motion with the drumsticks).
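An early sketch of the sliding window idea, with a placeholder window size (the real value will come from the timing measurements):

```python
# Sliding window of the most recent frames from the live feed.
from collections import deque

WINDOW_SIZE = 10                      # placeholder; set from per-frame timing results
recent_frames = deque(maxlen=WINDOW_SIZE)

def on_new_frame(frame):
    recent_frames.append(frame)       # the oldest frame is dropped automatically

def on_hit_detected():
    # When the accelerometer reports a spike, only these buffered frames are
    # passed to drumstick-tip detection to decide which drum was struck.
    return list(recent_frames)
```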

 

Belle’s Status Report for 9/21

This past week, I discussed the purpose of a few components with Professor Tamal Mukherjee, mainly including how we plan to mount the camera that will have a top-down view of the drum rings and thus acquire data needed for CV processing. I also began to look at the pinout of the ESP32 microcontroller to determine which registers would be most relevant when interfacing with the MPU 6050 accelerometer, as well as found a few relevant OpenCV libraries and documentation that could be useful for the aforementioned processing. We did not have too much planned out for last week on our Gantt chart besides starting to research and potentially implement Computer Vision code, so I believe we are on schedule.

To remain on schedule, this upcoming week I plan to put more time into narrowing down which OpenCV libraries are most relevant. I will also begin writing code to experiment with specific color and shape detection functions and upload it to the group repository. This can potentially be accomplished by generating images of my own with varying levels of noise (to simulate potentially blurry frames from the webcam) and ring sizes, and trying to detect those rings as well as filter out particular colors (a rough sketch of this idea is shown below). I hope this process will help us determine color ranges for detecting the rings and drumstick tips in the camera’s video frames, as we want to avoid having different lighting conditions affect the functionality of our project. For example, since the drumstick tips are relatively spherical, the light and shadow cast on their edges and highest point will have different color values than the color we paint them.
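A small sketch of how such noisy test frames could be generated; all of the values here are placeholders for experimentation, not final parameters.

```python
# Generate synthetic test frames: a single red ring with varying Gaussian noise.
import cv2
import numpy as np

def make_test_frame(radius, noise_sigma, size=(480, 640)):
    frame = np.full((*size, 3), 255, dtype=np.uint8)                          # white background
    cv2.circle(frame, (size[1] // 2, size[0] // 2), radius, (0, 0, 255), 3)   # red ring (BGR)
    noise = np.random.normal(0, noise_sigma, frame.shape)
    return np.clip(frame.astype(np.float32) + noise, 0, 255).astype(np.uint8)

# Write frames at a few noise levels to test detection robustness.
for sigma in (5, 15, 30):
    cv2.imwrite(f"test_ring_sigma{sigma}.png", make_test_frame(radius=60, noise_sigma=sigma))
```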

Ben Solo’s Status Report for 9/21

My time this week was split between preparing for the proposal presentation I delivered on Monday, determining design/implementation requirements, and figuring out exactly what components we needed to order, making sure to identify how components would interact with one another.
The proposal was early on in the week, so most of the preparation for it was done last week and over the last weekend. Sunday night and early on Monday, I spent the majority of my time preparing to deliver the presentation and making sure to drive our use case requirements and their justifications home.

The majority of the work I did this week was related to outlining project implementation specifics and ordering the parts we needed. Elliot and I compiled a list of parts, the quantity needed, and fallback options in the case that our project eventually needs to be implemented in a way other than how we are currently planning to. Namely, we are in the process of acquiring a depth sensing camera (from the ECE inventory) in the case that either we can’t rely on real-time data being transmitted via the accelerometer/ESP32 system and synchronized with the video feed, or that we can’t accurately determine the locations of the rings using the standard webcam. The ordering process required us to figure out basic I/O for all the components, especially the accelerometer and ESP32 [link]  so we could make sure to order the correct connectors.

Having a better understanding of what our components and I/O would look like, I created an initial block diagram with a higher level of detail than the vague one we presented in our proposal. While we still need to add some specifics to the diagram, such as exactly what technologies we want to use for the computer vision modules, it represents a solid base for conceptually understanding the interconnectivity of the whole project and how each component will interact with the others. It is displayed below.

Aside from this design/implementation work, I’ve recently started looking into how I would build the REST API that needs to run locally on the user’s computer. Basically, the endpoint will run on a local Flask server, much like the webapp, which will run on a remotely hosted Flask server. The user will then specify the IP address of their locally running server in the webapp so it can dynamically send the drum set configurations/sounds to the correct server. Using a combination of origin checking (to ensure the POST request is coming from the correct webserver) and CORS (to handle the webapp and API running on different ports), the API will continuously run locally until it receives a valid configuration request (a rough sketch of this setup is shown below). Upon doing so, the drum set model will be updated locally so the custom sound files are easily referenceable while DrumLite is in use.
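A rough sketch of that origin check + CORS setup; the allowed origin, route name, and port are assumptions for illustration.

```python
# Sketch of the local API with CORS plus a simple origin check on POSTs.
from flask import Flask, request, abort
from flask_cors import CORS

ALLOWED_ORIGIN = "http://localhost:5000"    # later: the deployed webapp's domain

app = Flask(__name__)
CORS(app, origins=[ALLOWED_ORIGIN])         # CORS headers for cross-port requests

@app.route("/configure", methods=["POST"])
def configure():
    if request.headers.get("Origin") != ALLOWED_ORIGIN:
        abort(403)                          # reject POSTs from unknown origins
    config = request.get_json()
    # ...update the local drum set model with the received configuration...
    return {"status": "received"}

if __name__ == "__main__":
    app.run(port=8000)
```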

Our project seems to be moving at a steady pace and I haven’t noticed any reason to modify our timeline. In the coming week, I plan to update our webapp to reflect the fact that we are now using rings to create the layout of the drum set as opposed to specifying the layout in the webapp, as well as get a very basic version of the API endpoint running locally. Essentially, I want a proof of concept showing that we can send requests from the webapp and receive them locally. I’ve already shared the repo we created for the project with Elliot and Belle, but we need to decide how we plan to organize/separate the code for the webapp vs. the locally running code. I plan on looking into what the norm for such an application is and subsequently deciding whether another repo should be created to house our computer vision, accelerometer data processing, and audio playback modules.

Elliot’s Status Report for 9/21

This past week, I helped finalize the parts list we would request for purchase and for reservation from existing inventory. I spoke with Professor Mukherjee and our team’s TA, Tjun Jet, about some of the components and their purposes. I also worked on the slide deck for the upcoming design presentation and covered a few solutions regarding the interconnectivity of our components; I began researching how to interface with the accelerometers using the ESP32 MCUs, as well as the Bluetooth stack we’ll be using to relay data back to the host device. I also helped establish our fallback plans in case any given module from our block diagram does not work as expected. Overall, the team is currently aligned with the schedule we laid out in our Gantt chart, but there is still a considerable amount of research needed before I can confidently outline our technical design to the class. This upcoming week, I plan to do the following in preparation for the design submission:

  1. Look into our options for BLE abstraction libraries to easily communicate with the microcontroller. My hope is that Python will have an existing API available for the ESP32, but if not, I am prepared to read the documentation for our device’s Bluetooth module and initialize the advertising, connection, and packet transactions manually.
  2. Similarly, I need to find a way to accept accelerometer data through the MCU. I looked at a few datasheets for the ESP, and it doesn’t use the Cortex M-series microprocessors I’m most familiar with, so I’m not sure if I’ll have to manually write a device driver for the I2C peripheral to communicate with the MPU-6050. Again, hopefully Python has some level of abstraction available for us to use.
  3. Belle will mostly be working on the repository code for OpenCV, but I plan to also help with the CV code so that I can get a full understanding of our algorithm’s capabilities before we actually enter testing. It will also help me familiarize myself with the repo we’ll be working in, where we plan to put together Ben’s front- and backend implementations, the CV code, and the Bluetooth interface for collecting accelerometer data.
  4. Prepare slides and practice verbal delivery. Once all of the above details are established, I can describe our technical strategies in greater detail and identify appropriate visuals to include in order to communicate our project vision clearly.

Team Status Report for 9/21

This week our team took a closer look at planning and strategizing around the implementation process for building DrumLite. In preparation for the Design Presentation, we focused our energy on identifying design requirements and how they connect to the defined use case requirements as well as starting to nail down the specifics of how our components will interact.

We identified 4 main risks we foresee in the near future:
1.) Interfacing between the MPU-6050 accelerometer and the ESP32 Bluetooth transmitter
2.) Processing the data relayed from the accelerometer through the ESP32
3.) Inability to accurately detect the rings that define the drums’ locations using OpenCV.
4.) Not being able to detect objects within frames fast or accurately enough given the field of view of the camera.

For each, we came up with mitigation strategies, all of which are outlined below:
1.) In the case that we can’t interface between the two devices mounted on the drumstick, our contingency plan is to completely back away from the use of the accelerometer and pivot to a new design using strictly computer vision. It is for this reason that we are currently in the process of ordering both a camera with depth sensing and a standard webcam. The idea is that if we need to, we can resort to detecting a hit by determining the distance of the stick from the camera, knowing the distance of the camera from the desk. We don’t view this as a probable outcome, as there is a lot of existing documentation we’ve seen involving the interconnectivity of the MPU-6050 and ESP32.

2.) If we can’t figure out how to process the data relayed by the ESP32 via Bluetooth, our idea is to fall back on wiring. In this case, the wireless aspect of the project would be eliminated, but this would still stay within the parameters outlined by the use case, namely portability and versatility.

3.) If we are unable to accurately determine the location of the rings using standard object detection, such as the HoughCircles function we were planning on using, the plan is to fall back on a convolutional neural network. Our concern is that using a model will introduce a lot of latency into the system, which again goes against our use case requirements.

4.) In the case that we can’t detect the location of the stick’s tips accurately or fast enough we plan on enforcing a fixed camera height and angle, as well as a much smaller maximum layout size for the drum set. By enforcing that the camera needs to be directly above and downwards facing, capturing the exact shape, size, and location of the drum rings will be much easier and standardized.

In addition to this planning, we’ve also placed our order for all the hardware components we’ll need, all of which were found on Amazon at a reasonable price ($264). We’ve decided that in the coming week Elliot will work on figuring out the interconnectivity of the accelerometer, the ESP32, and the code required to receive and process the accelerometer data; Belle will look into basic utilities in OpenCV and the effectiveness of the HoughCircles function we plan on using for detecting the rings; and Ben will be responsible for looking into how to create a REST API that runs locally and interfaces with a webserver for sending and receiving custom drum set layouts.