Team Status Report for 4/12

This week, our team made progress on finalizing and debugging our subsystems as well as starting integration. Lucas added audio playback to the game loop and worked on integrating his components with Yuhe’s main menu. Yuhe worked on the beat map editor, adding a waveform viewer and interactions for editing notes. Yuhe is also working on migrating the game to a Windows system in order to solve the audio card reading issues that arose when using the Linux virtual environment. Michelle continued testing and refining her rhythm analysis algorithm, moving to a new method that yields higher accuracy, as shown below in a test on a piano piece.

After integrating the subsystems, we will run integration tests to ensure all the components are communicating with each other correctly. There are several metrics we will need to focus on, including beat map accuracy, audio and falling tile synchronization, gameplay input latency, persistent storage validation, frame rate stability, and error handling. Both beat map alignment and input latency should be under 20ms to ensure a seamless game experience. The rhythm analysis should capture at least 95% of the notes and have no false positives. Error handling should cover issues such as unexpected file formats, file sizes that are too large, and invalid file name inputs.

For validation of the use case requirements, we will do iterative user testing and collect qualitative feedback about the ease of use, the difficulty of the game, the accuracy of the rhythm synchronization, and the overall experience. During user testing, users will upload songs of their choice, play the game with the automatically generated beat map, and try out the beat map editor as well. We want to validate that the whole flow is intuitive and user-friendly.

Michelle’s Status Report for 4/12

This week, I continued testing and refining the rhythm analysis algorithm. I tested a second version of the algorithm that more heavily weights the standard deviation of the onset strengths in determining whether to count a peak as a note. This version is much more accurate across various tempos, as shown in the figure below. These are the results of testing a self-made single-clef piano composition. The first version had more false positives at very slow tempos and more false negatives at very fast tempos, with the missed notes typically being 32nd notes or some 16th notes. The second version, tested on the same piece, performs much better, only missing a few 32nd notes at very fast tempos.
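A minimal sketch of this kind of standard-deviation-weighted thresholding, assuming librosa's onset strength envelope (the weight k and the simple local-maximum rule here are placeholders, not the exact values or logic in the actual algorithm):

    import numpy as np
    import librosa

    def detect_notes_v2(audio_path, k=1.5):
        """Sketch: count an onset-strength peak as a note only if it exceeds
        mean + k * std of the envelope (k is a hypothetical weight)."""
        y, sr = librosa.load(audio_path)
        env = librosa.onset.onset_strength(y=y, sr=sr)
        times = librosa.times_like(env, sr=sr)

        threshold = env.mean() + k * env.std()

        # A frame is a candidate peak if it is a local maximum of the envelope.
        is_peak = (env[1:-1] > env[:-2]) & (env[1:-1] >= env[2:])
        peak_idx = np.where(is_peak)[0] + 1

        return [float(times[i]) for i in peak_idx if env[i] >= threshold]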

The verification methodology involves creating compositions in MuseScore and generating audio files to test the processing algorithm on. This way, I have an objective truth for the tempo and rhythm and can easily manipulate variables such as the instrument, dynamics, and time signature and see how these affect the accuracy. I also test the algorithm using real songs, which often have more noise and more blended-sounding notes. Using a Python program, I can run the analysis on a song I uploaded and play back the song while showing an animation that blinks on the extracted timestamps, recording any missed or added notes. To verify that my subsystem meets the design requirements, the algorithm must capture at least 95% of the notes, without adding any extra notes, for single-instrument songs between 50 and 160 BPM.

Comparing results of V1 and V2 on a piano composition created in MuseScore
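To check the 95%-capture and zero-false-positive requirement against the MuseScore ground truth, the comparison amounts to a simple matching routine like this sketch (the 20ms matching tolerance is an assumed value):

    import numpy as np

    def score_detection(detected, ground_truth, tol=0.02):
        """Sketch: match detected note times to ground-truth times within a
        tolerance window and report the capture rate and false positives."""
        detected = np.asarray(sorted(detected))
        ground_truth = np.asarray(sorted(ground_truth))

        matched = set()
        false_positives = 0
        for t in detected:
            diffs = np.abs(ground_truth - t)
            i = int(np.argmin(diffs))
            if diffs[i] <= tol and i not in matched:
                matched.add(i)          # counts as a captured note
            else:
                false_positives += 1    # extra or duplicate detection

        capture_rate = len(matched) / len(ground_truth)
        return capture_rate, false_positives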

I also tested an algorithm that uses a local adaptive threshold instead of a global threshold. This version uses a sliding window to compare onset strengths more locally, which allows the algorithm to adapt over the course of a piece, especially when there are changes in dynamics. The tradeoff is that it can be more susceptible to noise.
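A rough sketch of such a local adaptive threshold, again assuming librosa's onset envelope and frame times; the window size (roughly one second of frames at librosa's default hop length) and the weight k are illustrative values, not the tuned ones:

    import numpy as np

    def adaptive_peaks(env, times, window=43, k=1.5):
        """Sketch: compare each onset-strength frame to the statistics of a
        local sliding window instead of the whole piece."""
        notes = []
        half = window // 2
        for i in range(1, len(env) - 1):
            lo, hi = max(0, i - half), min(len(env), i + half + 1)
            local = env[lo:hi]
            threshold = local.mean() + k * local.std()
            # A local maximum that clears the local threshold counts as a note.
            if env[i] > env[i - 1] and env[i] >= env[i + 1] and env[i] >= threshold:
                notes.append(float(times[i]))
        return notes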

I am on track with the project schedule. I think the current version is sufficient for the MVP of this subsystem, so further work will just be more extensive testing and stretch goals for more complex music. I have begun creating more compositions with even more complex rhythms, including time signature changes, which I plan to test V2 on next week. I also will test the algorithm on pieces with drastic dynamic changes. I plan to experiment more with the minimum note length as well: since V2 produces fewer false positives, I may be able to decrease it from the current 100ms to accommodate more complex pieces. Additionally, I want to test out a version that uses median absolute deviation instead of standard deviation, which is less sensitive to extreme peaks, to see if it outperforms V2.
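For reference, the median-absolute-deviation threshold I plan to try would look roughly like this (k again being a placeholder weight); it could drop into V2 in place of the mean + k * std rule:

    import numpy as np

    def mad_threshold(env, k=1.5):
        """Sketch: a median-absolute-deviation threshold, less sensitive to
        extreme peaks than mean + k * std."""
        med = np.median(env)
        mad = np.median(np.abs(env - med))
        return med + k * mad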

Michelle’s Status Report for 3/29

This week, I continued finetuning the audio processing algorithm. I continued testing with piano and guitar and also started testing voice and bowed instruments. These are harder to extract the rhythm from since the articulation can be a lot more legato. If we used pitch information, it may be possible to distinguish note onsets in slurs, for example, but this is most likely out of scope for our project.

Also, there was a flaw in calculating the minimum note length from the estimated tempo: for a song most people would consider 60 BPM, librosa would sometimes estimate 120 BPM, which is technically equivalent, but the calculated minimum note length would then be much smaller and produce a lot of “double notes”, i.e. note detections directly after one another that actually come from one sustained note. For the game experience, I believe it is better to have more false negatives than false positives, so I think a fixed minimum note length will be a better generalization. A threshold of 0.1 seconds seems to work well.
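The fixed-minimum-gap filtering amounts to something like this sketch, using the 0.1 second threshold mentioned above:

    def drop_double_notes(note_times, min_gap=0.1):
        """Sketch: enforce a fixed minimum note length (100 ms) so that one
        sustained note does not produce back-to-back detections."""
        filtered = []
        for t in sorted(note_times):
            if not filtered or t - filtered[-1] >= min_gap:
                filtered.append(t)
        return filtered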

Additionally, in preparation for integrating the music processing with the game, I added some more information to the JSON output that bridges the two parts. Based on the number of notes for a given timestamp, lane numbers are randomly chosen to determine which lanes the tiles will fall from.

Example JSON output
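A hedged sketch of the lane assignment step; the field names below are illustrative, not the actual schema of the JSON output:

    import json
    import random

    def to_beatmap_json(notes, num_lanes=4):
        """Sketch: for each (timestamp, note_count) pair, pick distinct random
        lanes for its notes."""
        beatmap = []
        for timestamp, note_count in notes:
            lanes = random.sample(range(num_lanes), k=min(note_count, num_lanes))
            beatmap.append({"time": round(timestamp, 3), "lanes": sorted(lanes)})
        return json.dumps({"notes": beatmap}, indent=2)

    # Example: one note at 0.50 s, then a two-note chord at 1.25 s
    print(to_beatmap_json([(0.5, 1), (1.25, 2)]))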

My progress is on schedule. Next week, I plan to finalize my work on processing the rhythm of single-instrument tracks and meet with my teammates to integrate all of our subsystems together.

Michelle’s Status Report for 3/22

This week I continued testing my algorithm on monophonic instrumental and vocal songs with fixed or varying tempo. I ran into some upper limits with SFML in terms of how many sounds it can keep track of at a time. For longer audio files, when running the test, both the background music and the generated clicks on note onsets play perfectly for about thirty seconds before the sound starts to glitch, then goes silent and produces this error:

It seems that there is an upper bound on the number of SFML sounds that can be active at a time, and after running valgrind it looks like there are some memory leak issues too. I am still debugging this issue, trying to clear click sounds as soon as they are done playing and implementing suggestions from forums. However, this is only a problem with testing, as I am trying to play probably hundreds of metronome clicks in succession; it will not be a problem in the actual game, since we will only be playing the song and maybe a few sound effects. If the issue persists, it might be worthwhile to switch to a visual test, which would be closer to the gameplay experience anyway.

Next week I plan to try to get the test working again, try out a visual test method, and work with my team members on integration of all parts. Additionally, after having a discussion with my team members, we think it may be best to leave more advanced analysis of multi-instrumental songs as a stretch goal and focus on the accuracy of monophonic songs for now.

Team Status Report for 3/15

This week, we each made a lot of progress on our subsystems and started the integration process. Yuhe finished building a lightweight game engine that will suit our purposes much better than Unity, and implemented advanced UI components, a C++ to Python bridge, and a C++ test for rhythm detection verification using SFML. Lucas worked on rewriting the gameplay code he wrote for Unity to work with the new engine, and was able to get a barebones version of the game working. Michelle worked on rhythm detection for monophonic time-varying tempo songs, which is quite accurate, and started testing fixed-tempo multi-instrumental songs, which still needs more work.

Beat Detection for Time-Varying Tempo
Core Game Loop in New Game Engine

There have been no major design changes in the past week. The most significant risk to the success of our project at this time is probably the unpredictability of the audio that the user will upload. Our design will mitigate this risk by only allowing certain file types and sizes and by surfacing a user error if no tempo can be detected (e.g. the user uploaded an audio file that is not a song).
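As a rough illustration of this mitigation, a sketch of the upload checks; the allowed extensions and the size cap below are placeholders, not final values:

    import os
    import librosa

    ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg"}   # placeholder whitelist
    MAX_SIZE_BYTES = 20 * 1024 * 1024               # placeholder 20 MB cap

    def validate_upload(path):
        """Sketch: reject unsupported types/sizes and surface an error when
        no tempo can be detected in the uploaded audio."""
        ext = os.path.splitext(path)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            return f"Unsupported file type: {ext}"
        if os.path.getsize(path) > MAX_SIZE_BYTES:
            return "File is too large"

        y, sr = librosa.load(path)
        tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
        if float(tempo) <= 0:
            return "Could not detect a tempo in this file"
        return None  # valid upload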

Next steps include finishing the transition to the new game engine, refining the rhythm detection of multi-instrumental songs, and implementing an in-game beatmap editor. With integration off to a good start, the team is making solid progress towards the MVP.

Michelle’s Status Report for 3/15

I started out this week by exploring how to leverage Librosa to analyze the beat of songs that have time-varying tempo. These are the results of processing an iPhone recording of high school students at chamber music camp performing Dvorak Piano Quintet No. 2, Movement III, using a standard deviation of 4 BPM:

When running a test that simultaneously plays the piece and a click on each estimated beat, the beats sound mostly accurate but not perfect. I then moved on to adding note onset detection in order to determine the rhythm of the piece. My current algorithm selects timestamps where the onset strength is above the 85th percentile. It then removes any timestamps that are within a 32nd note of each other, calculated based on the overall tempo. This works very well for monophonic songs that have some variation in tempo. For multi-instrumental tracks, it tends to detect the rhythm of the drums if present, since these have the clearest onsets, along with some of the rhythm of the other instruments or voices.
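A simplified sketch of this percentile-threshold approach, assuming librosa for the onset envelope and tempo estimate (the local-maximum test is illustrative):

    import numpy as np
    import librosa

    def detect_notes_v1(audio_path, percentile=85):
        """Sketch: keep onset-strength peaks above the 85th percentile, then
        drop any that fall within a 32nd note of a previous one, based on
        the overall tempo."""
        y, sr = librosa.load(audio_path)
        env = librosa.onset.onset_strength(y=y, sr=sr)
        times = librosa.times_like(env, sr=sr)

        tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
        thirty_second = 60.0 / float(tempo) / 8.0   # one 32nd note, in seconds

        threshold = np.percentile(env, percentile)
        notes = []
        for i in range(1, len(env) - 1):
            if env[i] > env[i - 1] and env[i] >= env[i + 1] and env[i] >= threshold:
                if not notes or times[i] - notes[-1] >= thirty_second:
                    notes.append(float(times[i]))
        return notes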

I also worked on setting up my development environment for the new game engine Yuhe built. Next week I plan to continue integrating the audio analysis with the game. I also plan to adjust the rhythm algorithm to dynamically calculate the 32nd note threshold based on the time-varying tempo, and to experiment with different values for the standard deviation when calculating the time-varying tempo. Finally, I would like to look into possible ways we can improve rhythm detection for multi-instrumental songs.

Michelle’s Status Report for 3/8

This week, I worked on creating an algorithm for determining the number of notes to be generated based on the onset strength of the beat. Onset strength at time t is determined by the mean over frequency bins f of max(0, S[f, t] − ref[f, t − lag]), where ref is S after local max filtering along the frequency axis and S is the log-power Mel spectrogram.

Since a higher onset strength implies a more intense beat, it can be better represented in the game by a chord. Likewise, a weaker onset strength would generate a rest or a single note. Generally we want more single notes than anything else, with three-note chords being rarer than two-note chords. The percentile cutoffs can easily be adjusted later on during user testing to figure out the best balance.
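A sketch of how this mapping could look, with illustrative percentile cutoffs meant to be tuned during user testing:

    import numpy as np

    def notes_per_onset(strengths, p_single=50, p_double=85, p_triple=97):
        """Sketch: map onset strength to a note count using percentile cutoffs
        over the piece's onset strengths."""
        single = np.percentile(strengths, p_single)
        double = np.percentile(strengths, p_double)
        triple = np.percentile(strengths, p_triple)

        counts = []
        for s in strengths:
            if s >= triple:
                counts.append(3)   # three-note chord (rarest)
            elif s >= double:
                counts.append(2)   # two-note chord
            elif s >= single:
                counts.append(1)   # single note
            else:
                counts.append(0)   # rest
        return counts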

My progress is on schedule. Next week, I plan to refactor my explorations with Librosa into modular functions to be easily integrated with the game. I will also be transitioning from working on audio analysis to working on the UI of the game.

Michelle’s Status Report for 2/22

This week, I continued working on validation of fixed-tempo audio analysis. The verification method I created last week, playing metronome clicks on beat timestamps while the song plays, was not ideal because of multi-threading timing issues, and then because of the human error introduced when I removed the threading and started the song manually, trying to begin at the same time.

This week, I created an alternate version that uses matplotlib to animate a blinking circle on the timestamps while playing the song using multi-threading. The visual alignment is also more representative of the gameplay. I used 30 FPS since that is the planned frame rate of the game. Here is a short video of a test as an example: https://youtu.be/54ToPpPSpGs
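A rough sketch of this kind of visual test; the playback call via sounddevice is an assumption, and the drawing details are illustrative rather than the exact animation used:

    import librosa
    import matplotlib.pyplot as plt
    import matplotlib.animation as animation
    import sounddevice as sd   # assumed playback library

    def blink_test(audio_path, note_times, fps=30, blink_len=0.1):
        """Sketch: blink a circle at each detected note time while the song
        plays, at the game's planned 30 FPS."""
        y, sr = librosa.load(audio_path)

        fig, ax = plt.subplots()
        circle = ax.scatter([0.5], [0.5], s=4000, color="red")
        ax.set_xlim(0, 1)
        ax.set_ylim(0, 1)
        ax.axis("off")

        def update(frame):
            t = frame / fps
            visible = any(nt <= t < nt + blink_len for nt in note_times)
            circle.set_alpha(1.0 if visible else 0.0)
            return (circle,)

        frames = int(len(y) / sr * fps)
        anim = animation.FuncAnimation(fig, update, frames=frames,
                                       interval=1000 / fps, blit=True)
        sd.play(y, sr)   # non-blocking playback
        plt.show()
        return anim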

When testing tempo error on the self-composed audio library, where we know the ground truth tempo and beat timestamps, faster songs of 120 BPM or greater had a tempo error of about 21ms, which is just outside our tolerance of 20ms. When I tested fast songs with the visual animation verification method, the error was not really perceptible to me. Thus, I think fixing this marginal error is not a high priority, and it might be justified to relax the beat alignment error tolerance slightly, at least for the MVP. Further user testing after integration will be needed to confirm this.

My progress is on track for our schedule. Next week I plan to wrap up fixed-tempo beat analysis and move on to basic intensity analysis, which will be used to determine how many notes should be generated per beat. This is a higher priority than varying-tempo beat analysis. Testing with a wide variety of songs will be needed to finetune our algorithm for calculating the number of notes per level for the most satisfying gaming experience.

Team Status Report for 2/15

This week, we began implementing the Rhythm Genesis game in Unity, including the User Interface and the core game loop, and also continued work on tempo and beat tracking analysis, calculating the current beat alignment error.

  1. Yuhe worked on the User Interface Layer of the game, implementing the main menu and song selection.
  2. Lucas focused on making the core game loop, implementing the logic for the falling tiles.
  3. Michelle worked on verification methods for beat alignment error in audio analysis.

One challenge we are currently facing is figuring out the best method of version control. We initially tried using GitHub, but this did not work out since Unity projects are so large. We are now using Unity’s built-in Plastic SCM, which is not very easy to use. Another challenge is that we are discovering that faster tempos experience beat alignment error outside of our acceptance criteria, so we will need to spend some more time finetuning how we detect beat timestamps, especially for fast songs. As of now there are no schedule changes, as the team is on track with our milestones.

A.  Written by Yuhe Ma

Although video games may not directly affect public health or safety, Rhythm Genesis may benefit its users’ mental well-being and cognitive health. Rhythm games are known to help improve hand-eye coordination, reaction time, and focus. Our game offers an engaging, music-driven experience that enhances people’s dexterity and rhythm skills, which can be useful for both entertainment and rehabilitation. Music itself is known to reduce stress and boost mood, and by letting users upload their own songs, Rhythm Genesis creates a personalized, immersive experience that promotes relaxation and enjoyment. From a welfare standpoint, Rhythm Genesis makes rhythm gaming more accessible by offering a free, customizable alternative to mainstream games that lock users into pre-set tracks or costly DLCs. This lowers the barrier to entry, allowing more people to enjoy rhythm-based gameplay. By supporting user-generated content, our game encourages creativity and community interaction, helping players develop musical skills and express themselves. In this way, Rhythm Genesis is not only a game but also a tool for cognitive engagement, stress relief, and self-expression.

B. Written by Lucas Storm

While at the end of the day Rhythm Genesis is just a video game, there are certainly things to consider pertaining to social factors. Video games provide people across cultural and social backgrounds a place to connect – whether that be via the game itself or just finding common ground thanks to sharing an interest – which in my opinion is a very valuable thing. Rhythm Genesis, though not a game that will likely incorporate online multiplayer, will still allow those who are passionate about music and rhythm games to connect with each other through their common interest and connect with their favorite songs and artists through the gameplay.

C. Written by Michelle Bryson

As a stretch goal, our team plans to publish Rhythm Genesis on Steam where users will be able to download and play the game for free. Steam is a widely used game distribution service, so the game will be accessible to a wide range of users globally. Additionally, the game will be designed so that it can be played with only a laptop keyboard. We may consider adding functionality for game controllers, but we will still maintain the full experience with only a keyboard, allowing the game to be as accessible and affordable as possible.

Michelle’s Status Report for 2/15

This week I worked on building a verification process for determining beat alignment error. I created some songs as tests so that I know the ground truth tempo and can calculate the ground truth beat timestamps with some math. Then, I compared these with what the beat tracking observed. I focused on tempo-invariant songs. For most tempos, the beat alignment error fell within the acceptance criteria of 20ms. For tempos of about 120 BPM or greater, the beat alignment error was 21+ milliseconds. Further work is needed to finetune that.
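The beat alignment error itself is just the offset between each detected beat and the nearest ground-truth beat; a minimal sketch of that calculation:

    import numpy as np

    def beat_alignment_error(detected_beats, true_beats):
        """Sketch: mean absolute offset (in ms) between each detected beat
        and the nearest ground-truth beat time."""
        detected = np.asarray(detected_beats)
        true = np.asarray(true_beats)
        errors = [np.min(np.abs(true - t)) for t in detected]
        return 1000.0 * float(np.mean(errors))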

I also created another method of testing to use for real songs where the exact tempo and beat alignments are unknown. I built a Python program that extracts beat timestamps from an audio file and then plays back the audio while playing a click sound at the observed beat timestamps. I initially used threading to play the song and the metronome clicks at the same time, but this introduced some lagging glitches in the song. I removed the threading and just played the generated metronome while separately starting the song myself, attempting to minimize the human error in starting the song at exactly the right time. With this method, the timestamps sounded highly accurate and stayed on beat throughout.
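One way to sidestep the start-time problem entirely would be to render the clicks into the same audio buffer offline rather than playing two streams; a sketch of that idea, assuming librosa.clicks and soundfile:

    import librosa
    import soundfile as sf

    def make_click_test(audio_path, out_path="click_test.wav"):
        """Sketch: write the song plus a click at each detected beat into one
        file, so no threading or manual start timing is needed."""
        y, sr = librosa.load(audio_path)
        tempo, beat_times = librosa.beat.beat_track(y=y, sr=sr, units="time")
        clicks = librosa.clicks(times=beat_times, sr=sr, length=len(y))
        sf.write(out_path, y + clicks, sr)
        return beat_times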

The audio analysis portion of the project is on schedule. Next week, I want to see if I can find some way to reduce the beat alignment error for songs that are above 120 BPM.