Luke’s Status Report for 04/27/24

This week, as discussed, I focused mainly on tying up some loose ends in my parts of the system and also switched gears to focus on validation testing. I also spent a lot of time working on the final presentation that we gave this week.

I primarily worked on robustness improvements, such as handling failure cases. This includes smoothly handling the situation where a user input is not found in Spotify resources (i.e., the semantic match fails), which we now handle by removing the item from the queue and having the user try again. This makes the overall user experience better because, while our semantic match handles legitimate input very accurately, there is always the chance the user tries to queue something that isn’t even a song or something along those lines. In addition to that, I helped Matt and Tommy with the rest of the system integration.

In terms of unit and validation tests, I focused on a few things. For one, we needed an accurate metric for semantic matching accuracy, which I measured on Sunday prior to the presentation by running 30 test cases, each carefully structured to test a common situation arising from the use or misuse of the system by users.

In addition to this, I worked on a survey to ask users how our system recommendations compare to Spotify’s traditional recommendations. I am still collecting the results but will finish that process tomorrow.

In terms of schedule, we are approaching the final few days of the project where we will continue to collect results from our validation tests and will put the finishing touches on the robustness of the system.

Luke’s Status Report for 04/20/24

As mentioned in my last report, my main focus this week was the integration of our backend-managed queue with the actual Spotify queue. This consisted of writing a queue request scheduler that uses the length of the currently playing songs to ensure the state of our queue stays consistent with the Spotify queue in terms of timing. I used futures and Java’s thread-safe scheduler from the concurrent package to implement this efficiently. With this in place, all of our veto and queue-ordering functionality works with no problems. This was one of the last big steps we needed to take, so it is great we were able to get it done so efficiently.
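A stripped-down sketch of the idea is below. It assumes a placeholder playNextFromQueue callback and that the remaining time of the current track comes from the playback state reported by the API; this is not our exact class, just the shape of the scheduler.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class QueueRequestScheduler {

    // Single-threaded scheduler so queue pushes never race with each other.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Schedules the next push to the Spotify queue based on how much time is
     * left in the currently playing track (duration_ms minus playback progress).
     */
    public void scheduleNextQueuePush(long remainingMs, Runnable playNextFromQueue) {
        // Push slightly before the current song ends so playback stays seamless.
        long delayMs = Math.max(0, remainingMs - 2_000);
        scheduler.schedule(playNextFromQueue, delayMs, TimeUnit.MILLISECONDS);
    }
}
```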

I also worked on various other tasks this week. I wrote some verification tests for my subsystems, primarily involving the accuracy of the semantic matching algorithm. In addition to this, I worked more on the authorization driver by replacing our current method of using a web driver with refresh tokens instead, which makes our lives a lot easier and limits the communication between the Pis.
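For reference, the refresh flow itself is just one call to Spotify’s token endpoint. A minimal sketch is below; the client ID, client secret, and refresh token are placeholders, and parsing the access_token out of the JSON response is left out.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public final class TokenRefresher {

    // Exchanges a long-lived refresh token for a fresh one-hour access token.
    // The returned JSON body contains an "access_token" field to parse out.
    static String refreshAccessToken(String clientId, String clientSecret, String refreshToken)
            throws Exception {
        String basicAuth = Base64.getEncoder()
                .encodeToString((clientId + ":" + clientSecret).getBytes());

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://accounts.spotify.com/api/token"))
                .header("Authorization", "Basic " + basicAuth)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "grant_type=refresh_token&refresh_token=" + refreshToken))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body(); // JSON containing the new access_token
    }
}
```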

Further, I worked on the endless queue functionality. This basically means that when it is time for the scheduler to queue whichever song is at the top of the queue, if there are currently no songs in the queue, the system will queue a recommendation based on the previous song that played. However, generating the recommendation and communicating between the Pis takes a couple of seconds, so we implemented this as a prefetching process: the system prefetches a recommendation while the queue is empty, and when it is time to actually queue a song, if the queue is still empty we use the prefetched recommendation and prefetch another. This was great progress and now makes our system fully functional as originally planned.
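Roughly, the prefetching logic looks like the sketch below. The recommendFrom function is a stand-in for the call over to the recommender Pi, and the class layout is simplified from what we actually run.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public final class EndlessQueue {

    private final Deque<String> queue = new ArrayDeque<>();   // track IDs queued by users
    private volatile String prefetchedRec = null;              // recommendation fetched ahead of time
    private final Function<String, String> recommendFrom;      // placeholder for the recommender Pi call

    public EndlessQueue(Function<String, String> recommendFrom) {
        this.recommendFrom = recommendFrom;
    }

    /** Called by the scheduler when it is time to queue the next song. */
    public synchronized String nextTrack(String lastPlayedTrackId) {
        if (!queue.isEmpty()) {
            return queue.poll();                 // users' songs always take priority
        }
        // Use the prefetched recommendation if we have one; otherwise fetch synchronously.
        String rec = (prefetchedRec != null) ? prefetchedRec : recommendFrom.apply(lastPlayedTrackId);
        prefetchedRec = null;
        prefetch(rec);                           // immediately start fetching the next fallback
        return rec;
    }

    /** Fetch a recommendation in the background so an empty queue never stalls playback. */
    private void prefetch(String seedTrackId) {
        CompletableFuture.supplyAsync(() -> recommendFrom.apply(seedTrackId))
                .thenAccept(rec -> prefetchedRec = rec);
    }
}
```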

We are on schedule. This week was massively important in terms of the progress we made, so we just need to finish strong over these last two weeks. Next week I will focus mainly on small robustness improvements and on verification and validation testing.

What new tools or new knowledge did you find it necessary to learn to be able to accomplish these tasks? What learning strategies did you use to acquire this new knowledge?

I definitely learned a lot about new classes and packages that Java supports, such as the java.util.concurrent package and some of the ML-related classes that we use for the semantic matching. My core learning strategies involved reading lots of code documentation and examples – this helped me understand the use cases of these tools and then figure out how to incorporate them into our own project.

Luke’s Status Report for 04/06/24

This week, I took a bit of a detour from my original plans and worked more on the authorization model. Basically, to access Spotify’s resources you need a valid access token, but this token expires after an hour. So if we want to be able to host larger events, we need to automatically regenerate access tokens for use on both the core and recommender Pis. This involved some more automation with Selenium, but the main task was querying and sending the access tokens across the Pis, which required some more intra-Pi communication code and integration into the rest of the subsystems. Overall, this process went well and our testing of it was successful. Now we can operate Music Mirror for an indefinite period of time without having to worry about losing authorization to Spotify’s resources.

With this in mind, I did start to work on the timing systems that we will use on the core Pi by introducing timestamp timers. This infrastructure will be used with the Spotify queue integration, which I will work on more this upcoming week.

Verification: I started to write a comprehensive test suite for the semantic match. One of the big requirements for accuracy is being able to translate user input into accessible Spotify resources. I’ve developed multiple solutions for this, but we need to be able to test them efficiently. So, I created a set of test cases for common situations we’ll run into and began comparing the 1-gram frequency map with the embedding transformer solution to see which is more accurate.
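A minimal sketch of how such a comparison can be scored is below. The TestCase record and the matcher function type are hypothetical stand-ins: each strategy (1-gram frequency map, embedding model) would be wrapped as a function from the raw user input to the track ID it resolves to.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public final class MatcherAccuracy {

    // One hand-written test case: raw user input and the Spotify track ID we expect.
    record TestCase(String userInput, String expectedTrackId) {}

    /** Fraction of test cases a given matching strategy resolves to the expected track. */
    static double accuracy(Function<String, Optional<String>> matcher, List<TestCase> cases) {
        long correct = cases.stream()
                .filter(tc -> matcher.apply(tc.userInput())
                        .map(id -> id.equals(tc.expectedTrackId()))
                        .orElse(false))
                .count();
        return (double) correct / cases.size();
    }
}
```

Running both strategies over the same case list gives a direct, apples-to-apples accuracy number for each.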

Validation: In terms of the overall system, we started testing with more live concurrent users and also did some robustness checks, such as spamming session requests to see how the system would handle it. To our satisfaction, it was actually quite robust under this abuse, but we’d still like to incorporate some sort of limit on the requests a single user can make to ensure nobody monopolizes the queue.

Overall, we are still on schedule but need to keep working hard for these last few weeks. I plan to work on the queue integration further in the upcoming days before carnival.

Luke’s Status Report for 3/30/24

As discussed last week, the primary goal for this week was to integrate all of our subsystems. With that being said, I focused on the intra-Pi communication, which bridges the gap between the core logic and the recommendation code I’ve written over the past couple of weeks. This involved some more work with websockets, but after a little while we were able to connect the two Pis using unified data transmission objects (MessageRequest and MessageResponse classes).

This process involved a lot of small, subtle debugging. For example, I caught a few minor bugs in my recommendation code, particularly some edge cases with the session recs. We had issues if fewer than 3 songs had been queued, and also if the songs all had 0 likes for the session; that would cause the weighted centroid computation to be incorrect, so I had to fix it.

After we fixed those bugs, we did some fun testing. I had a bunch of my teammates from soccer go to our website, concurrently queue songs, and play around with the recommendation functionality. It was really cool to see everything working well. This also exposed a few more minor bugs that I worked on.

I would say that we are on schedule, but there is still a lot to be done. The integration process revealed many robustness challenges we are going to have to address, mainly with the management of the queue.

Next week I am going to work on the integration between our backend-managed queue and the actual Spotify queue. This is going to be a little tough because we need to implement some timing of the songs by querying the duration of each song. This will be my main priority, and if I have any extra time I will work more on the semantic matching accuracy.

Luke’s Status Report for 3/23/24

As stated in last week’s report, I spent a lot more time on the recommendation system this week. I worked to improve the recommendations from a single song, and then also implemented the session recommendations as well. I will describe both in detail:

Recommendation From Song:

In the last post, I described how I generate a seed and then use Spotify’s recommendation endpoint to get a list of recommendations. But now, we want to improve further on Spotify’s recs to ensure the user gets the best possible recommendation. So this is where I had some fun. At this point we have two things: an input song, and a list of recommended songs that Spotify returns. We want to determine which of these songs to return, ideally the one most similar to the input song. Also recall that, as mentioned before, we have many song characteristics available to make this decision. So, I narrowed the parameters to the values that are actually meaningful when comparing two songs, which leaves the following 9 characteristics: acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, tempo, and valence. Now, let’s treat each song as a point in a 9-dimensional vector space. The problem described above then simplifies to choosing the recommendation that minimizes the L2 distance to the original input song (or another distance metric). I implemented exactly this process in code, and now we can successfully refine Spotify’s initial recommendations further, with an output that more closely matches the input.
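A condensed version of this refinement step is below, with the nine audio features packed into plain double arrays. This is a sketch of the technique rather than our exact class layout.

```java
import java.util.List;

public final class RecommendationRefiner {

    /**
     * Picks the candidate whose 9-dimensional feature vector
     * (acousticness, danceability, energy, instrumentalness, liveness,
     * loudness, speechiness, tempo, valence) is closest to the input song.
     */
    static int closestToInput(double[] inputFeatures, List<double[]> candidateFeatures) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < candidateFeatures.size(); i++) {
            double dist = l2Distance(inputFeatures, candidateFeatures.get(i));
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best; // index of the recommendation to return
    }

    /** Euclidean (L2) distance between two feature vectors of equal length. */
    static double l2Distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

One practical note: tempo and loudness live on much larger numeric ranges than the 0–1 attributes, so the features need to be normalized to comparable scales before the distance is meaningful.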

Session Recommendation:

Now, we also want to support a generic ‘session’ recommendation – a recommendation that takes all of the songs that have been played into account. As discussed before, we can do this in a smarter way because we have access to user feedback: the likes and dislikes a song has received at the event. So the problem simplifies to: given a map of played songs and their number of likes, generate a song recommendation. Mainly, how can we translate this information into a seed to send to Spotify’s recommendation endpoint? This boils down to two things: compiling the seed songs and artists, and choosing the song characteristic values for the seed. For the former, we sort the session songs by their likes, then choose the song_ids of the top 3 songs and the artists of the top 2 songs. This gives us the 5 needed song and artist params for the seed. Where things get more interesting is selecting the numerical values for the song characteristics in the seed. An initial idea is taking the plain average of the input songs (i.e., their centroid) in the 9-dimensional vector space I talked about earlier. But this produces dull values. Think about the case of 3 session songs, each drastically different: if we just average these songs’ characteristic vectors, we get song attributes that don’t resemble any of the songs at all and are just a bland combination of them. So instead, we compute the weighted centroid – a weighted average where each weight is the number of likes the corresponding song received. Then, we use this resulting weighted centroid as the input to our seed generator. This works well, which is great.
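The weighted-centroid computation itself is short; here is a sketch under the same 9-feature representation, including a guard for the all-zero-likes edge case mentioned in the 03/30 report above. Method and class names are illustrative, not the real ones.

```java
import java.util.List;

public final class SessionSeedBuilder {

    /**
     * Computes the like-weighted centroid of the session songs' feature vectors.
     * Each row of songFeatures is a 9-dimensional audio-feature vector and
     * likes[i] is how many likes that song received during the session.
     * Falls back to uniform weights if nothing has been liked yet.
     */
    static double[] weightedCentroid(List<double[]> songFeatures, int[] likes) {
        int dims = songFeatures.get(0).length;
        double totalWeight = 0;
        for (int w : likes) totalWeight += w;

        double[] centroid = new double[dims];
        for (int i = 0; i < songFeatures.size(); i++) {
            // With zero total likes, weight every song equally instead of dividing by zero.
            double weight = (totalWeight > 0) ? likes[i] : 1.0;
            double[] features = songFeatures.get(i);
            for (int d = 0; d < dims; d++) {
                centroid[d] += weight * features[d];
            }
        }
        double denom = (totalWeight > 0) ? totalWeight : songFeatures.size();
        for (int d = 0; d < dims; d++) centroid[d] /= denom;
        return centroid;
    }
}
```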

The next interesting question in this realm is doing the same thing we talked about for the single-song rec: once we get recommendation results from Spotify, how do we further refine them to get the best possible recommendation? This is not urgent, but it’s super interesting, so I’ll spend time on it in the coming weeks. What I really want to do is k-means cluster the session songs and then choose a recommendation result that minimizes the L2 distance from any of the cluster centroids, but that’s a bit over the top. We’ll see.

Also, note that the team as a whole did a lot of integration together this week. We integrated the queueing and recommendation features with the actual frontend and backend internal queue components, so now an early version of Music Mirror is fully functional. You can go on our site from multiple sessions and queue songs, which are then played automatically. In fact, I’m using Music Mirror right now as I’m writing this.

This is really good progress ahead of the demo. The next step is to continue to integrate all of these parts within the context of the broader system. Immediately, we will be integrating all of this code with the two Pis and managing the communication between the two.

In all, the team is in a good spot and this has been another great week for us in terms of progress towards our MVP.

Luke’s Status Report for 03/16/24

Well, I did a lot of coding this week. I worked to integrate all of the submodules I’ve been building into a full pipeline. Now, I am happy to say that the system can go from any song request by name and artist to beginning playback of that song on any speaker connected to our wifi streamer. This is huge progress and puts us in a great spot to continue working towards our interim demo. In addition to this, I created the pipeline to go from a song recommendation request to playing the resulting song on the speaker, streamed from Spotify. I will explain these two pipelines and the submodules they use to give a sense of the work I put into this.

Recommendation:

As seen above, using the SeedGenerator I’ve discussed in previous posts, a recommendation seed is created and fed to the model, which then returns recommended songs that match this seed. This includes their song IDs, which are then used to send an add-to-queue request and a start-playback request to the Spotify player. The Web API then uses Spotify Connect to stream the resulting song via wifi to our WiiM music streamer, which connects via aux to an output speaker. This full pipeline allows us to make a recommendation request and have the result begin to play on the speaker with no human interaction. It’s awesome. I can’t wait to work on the model with the further ideas I’ve mentioned in previous write-ups and the design report.
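The tail end of that pipeline is just two Web API calls. A rough sketch with java.net.http is below; the access token and track ID are placeholders and error handling is omitted.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public final class PlaybackClient {

    private static final HttpClient HTTP = HttpClient.newHttpClient();

    /** Adds a track to the active Spotify player's queue, then starts/resumes playback. */
    static void queueAndPlay(String accessToken, String trackId) throws Exception {
        // Add-to-queue endpoint: POST /v1/me/player/queue?uri=spotify:track:{id}
        HttpRequest queueReq = HttpRequest.newBuilder()
                .uri(URI.create("https://api.spotify.com/v1/me/player/queue?uri=spotify%3Atrack%3A" + trackId))
                .header("Authorization", "Bearer " + accessToken)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HTTP.send(queueReq, HttpResponse.BodyHandlers.discarding());

        // Start playback on the active device (our WiiM streamer via Spotify Connect).
        HttpRequest playReq = HttpRequest.newBuilder()
                .uri(URI.create("https://api.spotify.com/v1/me/player/play"))
                .header("Authorization", "Bearer " + accessToken)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
        HTTP.send(playReq, HttpResponse.BodyHandlers.discarding());
    }
}
```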

Queue by Song/Artist:

This workflow is the one I am more excited about, because it uses the semantic match model I’ve been discussing. Here’s how it works. You can make any request to queue by providing a song name and the corresponding artist; keep in mind that this can include misspellings to some degree. This song name and artist combination is then encoded into a Spotify API search request, which provides song results that resemble the request as closely as possible. I then iterate through these results and, for each result, use its song name and artist name to construct a string combining the two. So now imagine that we have two strings: “song_request_name : song_request_artist” and “search_result_song_name : search_result_artist”. Using the MiniLM-L6 transformer model, I map these strings to embedding vectors and then compute the cosine similarity between the resulting embeddings. If the cosine similarity meets a certain threshold (between 0.85 and 0.90), I determine that the search result correctly matches the user input. I then grab the song_id from this result and queue the song just as described before in the recommendation system. Doing this has prompted more ideas. For one, I have 3 different functional cosine similarity computations that I have been comparing results between: one using the embedding model I described, another using full words as tokens and constructing frequency maps of the words, and lastly a 1-gram token approach, where I build a frequency map character by character to create the vectors and then apply cosine similarity to that. The embedding model is the most robust, but the 1-gram token approach also has fantastic performance and may be worth using over the embedding because it is so lightweight. Another thing I have been exploring is using Hamming distance.
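Stripped of the model-loading details, the matching decision looks roughly like the sketch below. The embed function is a placeholder for whatever wrapper produces the MiniLM-L6 embedding vector, and the SearchResult record is illustrative.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public final class SemanticMatcher {

    private static final double MATCH_THRESHOLD = 0.85;   // 0.85-0.90 works well in practice

    /** A search result's "song : artist" display string plus its Spotify track ID. */
    record SearchResult(String songAndArtist, String trackId) {}

    /**
     * Returns the first search result whose embedding is close enough to the
     * user's "song : artist" request string, or empty if nothing clears the threshold.
     */
    static Optional<String> match(String requestString,
                                  List<SearchResult> searchResults,
                                  Function<String, double[]> embed) {
        double[] requestVec = embed.apply(requestString);
        for (SearchResult result : searchResults) {
            double similarity = cosineSimilarity(requestVec, embed.apply(result.songAndArtist()));
            if (similarity >= MATCH_THRESHOLD) {
                return Optional.of(result.trackId());
            }
        }
        return Optional.empty();   // caller removes the request and asks the user to retry
    }

    /** Cosine similarity between two equal-length embedding vectors. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```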

This is really good progress ahead of the demo. The next step is to continue to integrate all of these parts within the context of the broader system. Immediately, I will be working further on the recommendation system and building the sampling module to weight the parameters used to generate the recommendation seed.

In all, the team is in a good spot and this has been a great week for us in terms of progress towards our MVP.

Luke’s Status Report for 03/09/24

This week, I spent my time on a variety of different tasks. Primarily, I worked with my group to complete our design review report. We spent a lot of time finalizing design decisions and trying to convey them in the most accurate and concise way possible. However, a lot of this was just writing the report, so I won’t go into it in much depth here.

Onto the more interesting stuff: I spent some time thinking about ways our recommendation model can outperform Spotify’s vanilla model. The big takeaway was essentially using Spotify as a backbone model, but introducing a second tier that more carefully generates the seeds fed into the rec endpoint. Basically, the Spotify rec model takes in a “seed” to generate a rec, which consists of a bunch of different parameters such as song name, artist, or more nuanced concepts like tone, BPM, or energy level. So how we construct this seed becomes just as important as the actual model itself. With that in mind, I decided to sample the parameters for this seed generation using our real-time user feedback. This includes users’ thumbs-up or thumbs-down feedback on the songs in the queue via the web app, which gives an accurate representation of how to weight songs based on how much the audience is enjoying them. For example, if a user wants a song similar to the ones that have been played, we can choose which songs to use in this seed generation based on the feedback received from the users. More technically, we can take a weighted average of the song params where the most emphasis is placed on the most-liked songs. I will continue to flesh out these concepts and actually implement them in code in the coming week.

Additionally, I spent time this week on the semantic match algorithm, which is pretty critical for being able to find songs from Spotify’s web API resources. Basically, I implemented cosine similarity between two strings constructed from a song name, its artist, and its album. For example, if a user requested Come Together by The Beatles, which is on Abbey Road, the string that would be used is “Come Together The Beatles Abbey Road”. That way, we can compare this with a search result such as “Come Together (Remastered) The Beatles Abbey Road” or “Come Together The Beatles Best of the Beatles”.

I won’t get too deep into the technical terms, but we basically construct vectors of the word frequencies in these input strings and then take the dot product of the vectors, normalized by their lengths. Here is some code and results. Below is how we create the frequency map for the input strings.
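The original snippet isn’t reproduced here, but a minimal sketch of the approach looks like this (method names are illustrative, not the actual code):

```java
import java.util.HashMap;
import java.util.Map;

public final class WordFrequencySimilarity {

    /** Builds a word -> count map from a "song artist album" string. */
    static Map<String, Integer> buildFrequencyMap(String input) {
        Map<String, Integer> freq = new HashMap<>();
        for (String word : input.toLowerCase().split("\\s+")) {
            freq.merge(word, 1, Integer::sum);
        }
        return freq;
    }

    /** Cosine similarity between two frequency maps: dot product normalized by vector lengths. */
    static double cosineSimilarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = Math.sqrt(a.values().stream().mapToDouble(v -> v * v).sum());
        double normB = Math.sqrt(b.values().stream().mapToDouble(v -> v * v).sum());
        return dot / (normA * normB);
    }
}
```

For instance, comparing the frequency maps of “Come Together The Beatles Abbey Road” and “Come Together (Remastered) The Beatles Abbey Road” scores high because most words overlap, while a heavily misspelled request scores much lower.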

Now here are some examples. Note that a value of 1 represents maximum similarity.

This accuracy should be usable for our purposes; however, I want to improve it. Because I am tokenizing by full words, it is not as accurate at detecting typos and one- or two-character mistakes by a user, which is illustrated in the last example. Ideally, we want to be more robust to this.

In terms of schedule, we are in a great spot but we have a lot of coding and work to do this week. Now that the design doc stuff is done, we can focus a lot more on actually implementing our ideas.

Next week, I will continue with the semantic match, ML model, and will complete the pipeline between Spotify Connect and a wifi-connected speaker so we can actually play music via an API call.

Luke’s Status Report for 2/24/24

As mentioned last week, I spent the majority of this week working on the ML module. The main goal was to be able to generate song recommendations from a generated seed containing info about a song. I was able to build a Java module that communicates with the proper Spotify endpoints to do exactly this. Below is an example of the generated song recommendations for an input seed that represents: artist = The Beatles, song = Help!, genre = rock. This is the full output.

As we can see, it works amazingly well, which is very cool to see. The generated songs from such a simple query are already very relevant to the input song. This shows very promising results for how much we will be able to fine-tune our results with more complex input seeds based on the listening session data we obtain. The next step will be to create a cleaner class for building seeds to input into the model.
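For reference, a seed ultimately just becomes query parameters on Spotify’s recommendations endpoint (GET /v1/recommendations, which allows up to 5 seed values total across artists, tracks, and genres). A simplified sketch of what such a builder might produce is below; the IDs are placeholders, and the real class would also carry target_* audio-feature values.

```java
// Hypothetical seed builder: produces the query string for
// GET https://api.spotify.com/v1/recommendations
public final class Seed {
    private final String artistIds;  // comma-separated Spotify artist IDs
    private final String trackIds;   // comma-separated Spotify track IDs
    private final String genres;     // comma-separated genre names, e.g. "rock"

    public Seed(String artistIds, String trackIds, String genres) {
        this.artistIds = artistIds;
        this.trackIds = trackIds;
        this.genres = genres;
    }

    public String toQueryString() {
        return "seed_artists=" + artistIds
             + "&seed_tracks=" + trackIds
             + "&seed_genres=" + genres
             + "&limit=10";
    }
}

// Example for artist = The Beatles, song = Help!, genre = rock (placeholder IDs):
// new Seed("<beatles_artist_id>", "<help_track_id>", "rock").toQueryString();
```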

In addition to this coding, I helped my team prepare for the design presentation by working on the slides and contributing some important pieces. Mainly, I created the block diagram, which took a lot of thought and effort.

As we can see, our system design is really coming together, which is fantastic to see.

In terms of schedule, I would say that we did a great job in catching up on schedule this week and getting some important things done. Once we are done with the design presentation, we will be able to really buckle down and grind out a lot of the critical models for the design.

Next week, I plan to continue with the ML development and hope to integrate communication between the two Pis, because this will be important for the actual lifecycle of adding a recommended song to the queue.

Team Status Report 2/17/24

What are the most significant risks that could jeopardize the success of the project?

  • Not being able to properly control the lights with our Pi. This covers both physically connecting the lights and controlling them through a program, and making them display what we want them to.
  • Communication between different parts of the project. We still need to make sure that all of our parts can communicate efficiently.
  • Note that authorization with the Spotify API without a GUI was a significant challenge last week, but we have seemingly solved it.

 

How are these risks being managed?

  • For the first one: we have been researching ways that other people have controlled DMX lights and have found some pretty good resources.
  • For the second one: we can pick up the Raspberry Pis, so next week (after we finish the design presentation) we can actually test whether the Pi can communicate with the Spotify API as well as with the other Pi and the frontend.

 

What contingency plans are ready?

  • If things go really badly with the lights (which I don’t think they will), we could always connect small LEDs to breadboards and make a light show that way, since it is much easier to just connect the LEDs straight to the Raspberry Pi pins.

 

Were any changes made to the existing design for the system (requirements, block diagram, system spec, etc)? Why was this change necessary, what cost does this change incur, and how will these costs be mitigated going forward? 

  • No major changes were made to the system at this point: no new User input features, modules added/reassigned/reorganized, and no new pieces of hardware added.

 

  • Part A was written by Matt Hegi, B by Luke Marolda, C by Thomas Lee
  • Part A:
    Our project mostly targets health in a psychological sense. Everyone gets a chance to have their song heard without much effort, it promotes collective enjoyment amongst a crowd, and the music reflects the views of the majority, which means that most people will be happy. Our project does not introduce, nor help avoid, things that cause physical harm; at most, someone who previously had to push through a crowd to request a song from the DJ no longer has to. Our product does not apply to welfare very much, since music is not a basic need.
  • Part B:
    Our project aims to bring groups together by allowing everyone to contribute to what is being played. This allows for a diverse representation of music preferences, acknowledging the varied social and cultural background of the guests.
  • Part C:
    Our project aims to be a cost effective way to enhance events. It is a relatively cheap one-time cost that is meant to permanently replace costly DJs. Our product aims to do what DJs do, but more personalized to each customer, more targeted toward the crowd, and without bias.

 

Luke’s Status Report for 2/17/24

This week, I focused a bit more on the Spotify web player and authorization protocol rather than the ML model. The architecture of our server interactions with Spotify was more complex than initially expected, so I had to come up with unique solutions that would work on a RasPi. The resulting solution was as follows: launching a NanoHTTPD server to send and receive requests, and then using Selenium WebDriver to authorize with the Spotify authentication endpoint. The tricky part is that typically with Spotify web apps, multiple users log into your app with their own Spotify credentials via a web browser. But we want to house this infrastructure on a Pi, with no graphical interface for the login credentials, since we will only be using a single Spotify account for the system. Thus, I needed to use the WebDriver to automatically interact with the redirect URI that Spotify responds to an auth request with. This involves entering credentials, logging in, and then grabbing the session params after the successful login to retrieve the authorization code that will be used to make future Spotify API requests that need user scoping. Below you can see this in action as we start the app:

Then, a Chrome WebDriver instance is launched, directed to the Spotify login page.

The credentials are then automatically entered. For this example, I am using my account but in the future we will have a premium account solely for the Music Mirror system.

And finally, the driver is automatically closed once the authorization code has been received.
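To make the flow above concrete, here is a simplified sketch of the automated login using the selenium-java library. The element IDs for Spotify’s login form and the code-extraction details are assumptions for illustration; the real driver has more waiting and error handling.

```java
import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public final class SpotifyAuthDriver {

    /** Walks the Spotify login page and returns the authorization code from the redirect URI. */
    static String fetchAuthorizationCode(String authUrl, String username, String password) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get(authUrl);

            // Element IDs below are assumptions about Spotify's login form, not guaranteed.
            driver.findElement(By.id("login-username")).sendKeys(username);
            driver.findElement(By.id("login-password")).sendKeys(password);
            driver.findElement(By.id("login-button")).click();

            // Wait until Spotify redirects back to our URI with ?code=... attached.
            new WebDriverWait(driver, Duration.ofSeconds(30))
                    .until(ExpectedConditions.urlContains("code="));

            String redirected = driver.getCurrentUrl();
            return redirected.substring(redirected.indexOf("code=") + "code=".length());
        } finally {
            driver.quit();  // close the browser once the code has been captured
        }
    }
}
```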

In addition to this, I did more work on our solution approach to prepare for this week’s presentation.

We are mostly on schedule, but I am a bit behind on the ML module development as this authorization solution took longer than expected to devise. However, we are on track in all other areas and should be making great progress next week.

Speaking of which, I plan to use this working auth infrastructure to build the ML rec model wrapper that will communicate with our users’ input and the Spotify API to create song recs to add to the system queue. I also plan to solidify any other solution design decisions regarding the ML modules for the system.