Status Update 11/25

Michael – Worked with Chris to bring him up to speed on how the webapp works and how to modify the user interface. I wrote a REST API endpoint for Chris to access chord data to display in the user interface. I started working on incorporating the MIDI keyboard, but I did not have MIDI cables yet, so I could not test the code. I plan to focus on that over the next couple of days and believe I should be able to start testing our machine learning with MIDI keyboard input by Wednesday.
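For reference, a minimal sketch of what such an endpoint could look like, assuming a Flask backend; the route and the helper function are placeholders rather than the actual implementation.

```python
# Minimal sketch of a chord-data endpoint, assuming a Flask backend.
# The route and get_chords_for_song are placeholders, not the real code.
from flask import Flask, jsonify

app = Flask(__name__)

def get_chords_for_song(song_id):
    # Placeholder: in the real app this would return the model's predicted
    # chord label for each measure of the stored song.
    return ["C", "G", "Am", "F"]

@app.route("/api/songs/<song_id>/chords")
def chords(song_id):
    return jsonify({"song": song_id, "chords": get_chords_for_song(song_id)})
```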

Chris – At the start of the week I spent some time familiarizing myself with the user interface part of the code base. Later I met with Michael, who answered my questions and got me up to speed. Because the past week was Thanksgiving, I did not get very far on the code. In the coming week, my sole focus will be on implementing the user interface, and I will keep updating the interface design, since the functionality may change as time progresses.

Aayush – This week I finished processing the Beatles dataset, with 19 songs available for the training set. I have also started collecting and processing songs from other artists, currently Tom Petty and Bryan Adams. Our predictions for Tom Petty songs are far better than for Beatles songs, so collecting that training data is moving much faster. The current bottleneck is converting multi-track MIDI files to a single track that contains only the melody (a rough sketch of this conversion appears at the end of this update); I plan to get Chris' help with this part to speed up the process. The past week was mostly spent on manual work, and given that we have 194 songs in the original dataset, I believe we need at least 100 parsed songs to effectively mix and match in order to create the 3 final networks that we plan to demo. The total dataset will consist of –

  1. The Wikifonia songs
  2. Beatles + Tom Petty + Bryan Adams (50 songs / 2000 measures) (currently have 25)
  3. A third genre, probably more traditional rock such as RHCP, Deep Purple, or Def Leppard. (50 songs / 2000 measures)

This week I plan to finish collecting songs in category 2 and train a new network with them. Since we already have the code in place to parse MIDI files, I need to work with Michael to integrate it so that it can be used in the training phase.
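Since music21 is already part of our stack for key recognition, the multi-track-to-melody step could look roughly like the sketch below; the assumption that the melody sits in one manually chosen part, as well as the file names, are mine.

```python
# Sketch: keep only the melody part of a multi-track MIDI file and write it
# back out as a single-track file. Assumes the melody lives in one part whose
# index is chosen by hand; file names are placeholders.
from music21 import converter

def extract_melody(in_path, out_path, part_index=0):
    score = converter.parse(in_path)      # parse all tracks/parts
    melody = score.parts[part_index]      # the manually identified melody part
    melody.write("midi", fp=out_path)     # write the melody as its own MIDI file
    return melody

# e.g. extract_melody("free_fallin_full.mid", "free_fallin_melody.mid", part_index=1)
```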

Status Update 11/18

Michael – The core functionality of our final webapp is working now. The user can upload a MIDI file without chords, generate chords, play the original melody and the version with chords in the browser at the same time, and download the file with chords at any point. Goals for this week include working with the MIDI keyboard and helping Chris incorporate what he has been doing into the application. Expanding the application to offer multiple chord progressions once Aayush provides the additional models should also be fairly trivial.


Chris – For the past week I have been working on the interface design for the web app, and I have gotten some pretty good results. After a few iterations of sketching and prototyping, I have landed on a visual structure and a corresponding color scheme. Improvements of the new design over the original fall into two categories: user experience and visuals. In terms of user experience, I decided to split the upload interaction and the chord generation into two separate interfaces, because when users first land on our website the page should be a clear one with a description of what our web app does and a very clear call-to-action button (the Upload Melody button). The second improvement is dedicating the left part of the screen to clearer navigation; the content of the navigation menu is still to be decided as we add or alter features of our product. In terms of visuals, there are a number of changes, the most important of which are using blocks to create a more user-friendly visual hierarchy and displaying chords while the music is being played, which should be the central feature of our app. Other visual improvements include better color and typography choices to strengthen the visual hierarchy, better organization of the functional buttons, and a more aesthetic rendering of the music. Finally, I also made a logo for our app. For the coming week, I will be working with Michael to implement the design as faithfully as possible. Some designs will be changed or dropped, as always, and I will continue to improve the design as we go.

Aayush – I have been planning the final steps we need to take in order to prepare a test dataset for human evaluation. My initial goal was to use our chord prediction to ease the process of labelling the type of music we wanted (Beatles pop and rock). There were a few issues with the Beatles songs that made this more difficult than anticipated:

  1. Beatles songs turned out to be far more complicated than I had imagined, both in their chord changes and in often having more than one chord per bar.
  2. Songs often contain off-scale chords, which naturally makes prediction much tougher.

Nonetheless, we have managed to collect 20-25 single-track Beatles melodies. I have processed 8 of these melodies (around 400 bars of music): 5 required major changes to our predicted output (> 60% of chords changed), while the other 3 required very minor changes (< 25% of chords changed) to produce a chord progression good enough to add to the training set. By changes I mean predicted chords that I manually replaced.
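For reference, the "percent of chords changed" figures are just the fraction of predicted chords I ended up replacing by hand; a small sketch of that calculation, with made-up sequences:

```python
# Sketch: fraction of predicted chords that were manually replaced.
# The example sequences below are made up.
def fraction_changed(predicted, corrected):
    assert len(predicted) == len(corrected)
    changed = sum(p != c for p, c in zip(predicted, corrected))
    return changed / len(predicted)

predicted = ["C", "C", "G", "Am", "F", "C", "G", "C"]
corrected = ["C", "Em", "G", "Am", "F", "Dm", "G", "C"]
print(f"{fraction_changed(predicted, corrected):.0%} of chords changed")  # 25%
```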

Moreover, for less simple melodies (usually, but not always, faster ones), I noticed the output being extremely repetitive at random points, for example four C chords in a row where that clearly should not happen. While training, I went with a less complicated model because test accuracy would not increase with model complexity. However, as I mentioned earlier, the predicted chords were much more varied with increasing model complexity. (For any complexity, the graph was almost identical apart from the number of epochs required to train.)

We were always aware that accuracy was a poor measure, yet defining another loss function seemed impractical given the lack of any good measure for test predictions. To tackle these issues, I think the following strategy would be most effective –

  1. Add modern pop songs, which have much simpler chords, to our dataset. My aim is to make these the vast majority of the dataset so that the network prefers simpler output. Adding songs from the Wikifonia dataset and our collection of Beatles songs can then give the network a small tendency to take risks.
  2. Retune the network from scratch after accumulating the final dataset. It took me a long time, thorough testing, and detailed analysis of the output scores to finally notice the above issues with the network, which suggests the baseline model already produces fairly pleasant sounds for the most part.

I will continue to test, fix predictions, and accumulate valuable training data for the rest of the week. With the evaluation looming, I have also chalked out a rough plan for the testing phase. For both the rating test and the confusion matrix we will use –

  1. 5 curated songs where our network performs well. (2 of the Beatles songs could be used here, but I feel the retrained network can perform much better.)
  2. 5 random songs chosen from our collection of pop songs.
  3. 5 composed melodies, not taken from original songs.

We will also collect data separately about which of the 3 groups performed best and worst.
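As a placeholder for the testing phase, a minimal sketch of tallying listener responses into a confusion matrix; the labels used here ("original" vs. "generated") and the sample responses are my own illustration, and the real labels will follow the design doc.

```python
# Sketch: tally (true label, listener guess) pairs into a confusion matrix.
# The labels and the sample responses are placeholders, not real data.
from collections import Counter

LABELS = ["original", "generated"]                 # assumed distinction

responses = [("generated", "original"),            # invented example data
             ("generated", "generated"),
             ("original", "original"),
             ("original", "generated")]

counts = Counter(responses)
for true_label in LABELS:
    row = {guess: counts[(true_label, guess)] for guess in LABELS}
    print(true_label, row)
```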

Status Update 11/11

Chris – The first half of the past week was devoted to preparing for the mid-point demo. The three of us spent some time together working out problems we ran into when integrating our previous work. Specifically, I worked with Michael to fix some problems with the key-finding algorithm when integrating it with the machine learning module. Besides that, I also spent some time preparing music for the demo; we decided to use some Beatles songs for this purpose. Most of the sources we found online contain multiple tracks, so I had to manually identify which track carries the melody and strip out the rest. Additionally, to get a better sense of music-software user interfaces, I did some competitive analysis of existing music-making software, including MusicLab and Soundation. For the coming week, my goal is to design a new user interface for the web app to replace the current one and to work with Michael to implement it.

Michael – Besides working with Chris on what was mentioned above, I spent significant time fixing the bug with some of the more complex MIDI files that was discussed at the midpoint demo. It ended up being an issue with extended periods of rest, and it is now resolved. The goal for this week is to get the MIDI file with chords to play in the browser and to start looking into using the MIDI keyboard.
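The fix is not worth a full write-up, but the gist was guarding the per-measure feature extraction against measures that contain only rests; a hedged sketch of the idea, assuming the 12-bin note-count representation described in the 11/5 update (the exact handling here is illustrative, not the code as written):

```python
# Sketch: handle rest-only measures explicitly so long stretches of rest do not
# break the output step. Assumes each measure is reduced to a 12-bin pitch-class
# count vector, as described in the 11/5 update; the handling is illustrative.
def measure_features(measures):
    features = []
    for notes in measures:                  # each measure: list of MIDI pitch numbers
        if not notes:
            # Rest-only measure: emit an explicit all-zero vector instead of
            # letting downstream code assume at least one note per measure.
            features.append([0] * 12)
            continue
        counts = [0] * 12
        for pitch in notes:
            counts[pitch % 12] += 1
        features.append(counts)
    return features

print(measure_features([[60, 64, 67], [], [62, 65, 69]]))
```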

Aayush – Spent the first half of the week working on the demo. I was tuning the algorithm, passing in Beatles songs, and evaluating the chords generated. I found that approximately 60% of the chords were very similar to the original, and around 75% sounded like what I would subjectively consider acceptable output from the algorithm. During this I also ran into the bug mentioned above for the first time: MIDI files with multiple bars of rest would crash our output save. My plan for the next week is to accumulate data and gather statistics about our chord progressions. I will be trying to answer questions like –

a) What percentage of the predicted chords needed to be changed?

b) What percentage sounded good but differed from the original?

c) Collect a list of 10-15 songs which we can use for evaluation (evaluation methods are described in the design doc).

I talked to Chris about starting the frontend for our testing mechanism while I collect the preliminary data (point c above). We can then have a decent testing mechanism to use for evaluation and further improvement.

Status Update 11/5

Michael – Aayush and I spent a lot of time integrating the machine learning with the webapp. This raised a lot of unexpected and frustrating dependency issues that kept the code from building. Right now the web app runs on Aayush's computer with no issues. The webapp currently takes in a MIDI file and outputs a MIDI file with the appropriate chords added based on the machine learning algorithm. I also fixed a bug with the chord output, making sure each chord falls exactly at the start of its measure. My goals for next week include working more on the user interface, such as being able to play the chord-added MIDI file in the browser. Aayush and I have also talked about adding a feature where you can see the probabilities of different chords for each measure and select which chord you would like from the browser.

Aayush – Worked with Michael on the integration of the webapp and the machine learning. We still need to add a couple of things to the webapp before the demo, such as handling time signatures and integrating the key recognition Chris worked on (this part is nearly done). My focus this week will be to help Michael implement the feature described above. The goal is to reduce the choice for each measure to a small number of possible chords, and we believe the algorithm does this reasonably well. We can then add songs, play the MIDI, and tweak the chords if they don't sound right; if we like the result, we save the labels and the input. This way we plan to make many datasets, including datasets separated by genre and by person (i.e. a model trained by only one person in the group), so that we can incorporate individual musical taste as well. We can measure whether there are differences in the output of models trained by different users on the same songs and see if this is a direction worth pursuing. We will use a similar strategy for genres, which we will start with first, as per the initial plan, which was also strongly suggested by the professors.
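For the chord-selection feature, the idea is to surface only the few most probable chords for each measure; a minimal sketch, assuming the network gives a probability per chord label per measure (the labels and numbers below are invented):

```python
# Sketch: reduce the model's per-measure chord probabilities to the top-k
# candidates the browser would show. The labels and probabilities are invented.
def top_chords(probabilities, k=3):
    # probabilities: dict mapping chord label -> predicted probability for one measure
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

measure_probs = {"C": 0.52, "Am": 0.21, "F": 0.14, "G": 0.08, "Dm": 0.05}
print(top_chords(measure_probs))   # [('C', 0.52), ('Am', 0.21), ('F', 0.14)]
```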

Chris – This week I worked with Michael to incorporate the key recognition component I have been working on into the web app he has been building. The bulk of the work was focused on feeding the data parsed from the MIDI input, in the right format, to the key recognition module, which is implemented with the MIT Music21 library. One thing we had to deal with is that, for the machine learning part, the input format for each measure is the frequency of each note in the measure, regardless of which note comes first; for key recognition to work properly, however, the sequence of notes is necessary. For the coming week, I will be working with Michael on the front-end of the web app in a few different areas, the first being improving the UI/UX of the overall web app experience and hiding technical details that might not be very useful to users. The second task is to implement the rating and evaluation part of the interface.
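To make the distinction concrete, here is a small sketch of the two representations, assuming a measure's notes are available as an ordered list of pitch names; the key detection uses music21's stream analysis, and the example melody is made up.

```python
# Sketch: the two per-measure representations discussed above.
# The example notes are made up.
from collections import Counter
from music21 import note, stream

measure_notes = ["C4", "E4", "G4", "E4", "C4"]     # ordered, as played

# 1) Order-insensitive note counts, the format the network consumes.
note_counts = Counter(measure_notes)                # Counter({'C4': 2, 'E4': 2, 'G4': 1})

# 2) Order-preserving music21 stream, which key recognition needs.
melody = stream.Stream()
for name in measure_notes:
    melody.append(note.Note(name))
detected_key = melody.analyze("key")                # e.g. C major

print(note_counts, detected_key)
```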