Team Status Report for 3/1/2025

Sensor Adhesive

We had previously thought that the electrode gel, which is applied to the neck sensors before the sensors are fixed to the neck, might function as an adhesive. However, the electrode gel only serves as an aid to conductivity and doesn’t have adhesive properties. Established methods for attaching the sensors include a neck band, tape, and simply having the user hold the sensors in place by hand. Holding the sensors is obviously cumbersome and violates our use-case requirement of comfort. Discussing this issue with our vocalists, we confirmed that a neck band would also be uncomfortable and could impede movements necessary for singing freely. As a result, we are purchasing tape specifically designed to adhere to skin, which will hopefully maximize both comfort and secure sensor placement.

Schedule for SoM Meetings

We determined a schedule for our meetings with our School of Music partners following break. We’ll start with a week of practicing using the electroglottograph with vocalist users, then begin gathering weekly warmup data so that our final presentation can include five weeks of data over time for two vocalists. In week 10, we’ll also start experimenting with repertoire recording, likely with piano accompaniment.

Future Test Users

While we have the opportunity to work with two vocalists in the School of Music (a soprano and a mezzo-soprano), our hope is to ultimately test our product with a larger number of vocalists to get more meaningful data for our user feedback survey. There’s also the question of whether we’d be interested in expanding the target audience of the product slightly to include singers who are trained but aren’t necessarily vocal majors or opera singers. Even though our current use case is more specific, other singers could still offer feedback on things like the ease of setting up the device. Both possibilities depend largely on whether more vocalists are willing to sign up for the music course we’re currently a part of.

Data Privacy

One issue we were considering is that of securing user data, since some users might consider their CQ data or vocal recordings to be private. However, with the advice of Tom and Fiona, we’ve concluded that this actually falls outside of our use case requirements: like any recording software, this application is meant to be used on personal devices, or in a controlled lab setting, and all the data is stored locally. As a result, we will not be worrying about the encryption of user data for our application.

Product Solution Meeting a Specified Need

Section A was written by Melina, Section B by Tyler, and Section C by Susanna.

Section A

Our product solution considers global factors by including users who are not in academia or do not consider themselves technologically savvy. Although our product utilizes specialized hardware and software, our solution includes a dedicated setup page that aims to facilitate the use of these technologies for users who are assumed to have no prior experience with them. The feature pages of the app will also include more information about how to interpret the CQ, both in the context of a controlled-exercise time-series analysis and in the context of a distinct piece of repertoire. We have also considered that access to our product beyond Pittsburgh is limited by the need to purchase EGG hardware and VoceVista software; our product solution makes use of a shared lab setup that could be duplicated with these purchases in any other local region.

Section B

Some of the cultural factors our product solution takes into account are how opera singers normally sing and what the accepted practice is. We spent a lot of time researching the vocal pedagogy of opera singers to ensure that the data we output does not contradict what the user’s vocal coach instructs them to do. On top of that, we have taken into account that it is usually taboo to try to get a singer to change their singing form, and have decided to simply present the information in a useful way so that the opera singer can decide whether or not to make changes, rather than having the app itself suggest form changes.

Section C

The physical components used in this product (electroglottograph, connector cables, microphone, etc.) were created by extracting materials from natural sources. While the overall goal of the project is not directly related to environmental concerns, the product’s overall impact can be minimized by using durable and reusable components where possible. Notably, we found a preexisting electroglottograph to borrow rather than buying or building our own. This certainly saved us considerable cost and effort, but it also significantly reduced the amount of new material that went into building the project. While we did need to purchase a new microphone, we chose a high-quality model that can be reused in future projects.

Susanna’s Status Report for 3/1/2025

Well, I significantly overestimated what my bandwidth would be this week before spring break. I spent a lot of time working on our design report, and given that my spring break travel started on Thursday, I didn’t end up having time for much else. 

MVC vs MVVM

One thing that I did more research into this week was the model for our software architecture. Originally, I had conceived of our backend as an MVC (Model-View-Controller) architecture, due to my familiarity with this paradigm from my experience developing web applications. However, looking into it a bit more, it turns out that the standard for desktop applications is MVVM (Model-View-ViewModel).

Basically, MVVM introduces a complete separation of concerns between the GUI (View) and the database and program logic (Model), with the ViewModel acting as a mediator. This will make it easy for us to develop these elements separately. Plus, it’s a reactive architecture: the ViewModel automatically responds to changes in the Model, and likewise, the View responds to changes in the ViewModel, which will be useful for real-time updating, like the animation of the cursor that scrolls over the music. MVC is a similar paradigm, but it is less strictly enforced and more suited to web development. Of course, for either paradigm, it’s up to us how strict we’re going to be, and there are always options to customize. Tentatively, I think this will be a helpful general framework for us to follow.
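To make this concrete, here’s a minimal sketch of the pattern in plain Python. The class and method names are hypothetical placeholders, not our actual design:

```python
# Minimal MVVM sketch (hypothetical names): the Model knows nothing about
# the GUI, the ViewModel exposes observable state, and the View only
# renders whatever the ViewModel currently holds.

class Model:
    """Owns the data and program logic (e.g., recordings on disk)."""
    def __init__(self):
        self.recordings = []

    def add_recording(self, name):
        self.recordings.append(name)
        return list(self.recordings)


class ViewModel:
    """Mediates between Model and View; notifies observers on change."""
    def __init__(self, model):
        self.model = model
        self.recordings = []
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def save_recording(self, name):
        # Update the Model, then push the new state to any listening Views.
        self.recordings = self.model.add_recording(name)
        for callback in self._observers:
            callback(self.recordings)


class View:
    """GUI layer: reacts to ViewModel changes, never touches the Model."""
    def __init__(self, viewmodel):
        viewmodel.subscribe(self.render)

    def render(self, recordings):
        print("Recordings list now shows:", recordings)


vm = ViewModel(Model())
View(vm)
vm.save_recording("warmup_03_01.wav")   # the View re-renders automatically
```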

Last Week’s Goals Revisited

  • Write my parts for the Design Review report – COMPLETE
  • Research & make a decision about DearPyGUI (ASAP) – COMPLETE
    •  Given the sorts of data analytics we will need in the application, and our desire for a flexible and engaging user interface, we decided that DPG is our best option for our frontend framework
  • Better documentation for code – IN PROGRESS 
    • started README file, but haven’t tested it on other devices
  • Record Screen File Saving – NOT STARTED
  • (Optional) Playback Screen Upgrades – NOT STARTED

Goals For Next Week

  • Better documentation for code
    • Finish README file for setting up dependencies on different OS
  • Re-implement basic menu, record screen, and playback in DearPyGui
  • Record Screen
    • User can choose to either save/discard a recording instead of auto-saving
    • User can customize naming a recording upon saving
  • (Optional) Playback Screen
    • Add cursor that moves while playing
    • Let user drag cursor to various points in the playback
    • Add separation of signal into rows + scrolling

 

Susanna’s Status Report for 2/22/25

Week Goals Revisited

  • Implement the basic MVC application with homepage
    • COMPLETE – It’s not much to look at, but the homepage exists:

  • Create GitHub repo
    • IN PROGRESS – I created a repo, but still need to update the README with instructions for using the app on different operating systems (and testing this with my teammates to make sure it actually works)
  • Create baseline audio recording page functionality
    • COMPLETE – I created a simple page that can record an audio signal (a rough sketch of the approach follows this list). However, given that we now know the microphone signal will be going through the EGG directly, I don’t think this version will be used in the final product; instead, the audio signal will likely come from VoceVista. Still, it’s useful for testing now, and the record page could remain useful for giving the singer visual/auditory cues to walk through assigned warmups.
  • Create baseline audio playback page functionality 
    • COMPLETE (ish?) – I didn’t exactly give a definition for “baseline” here, but I do have a basic version of the page, including a visual of the amplitude of the audio waveform. There are still some things left to do, though: I’ll need to split the audio signal into rows, allowing the user to scroll through a longer recording, and make the syncing more seamless once we add the EGG signal. I also haven’t yet gotten the cursor (showing where the user is in playback) to work.

  • Help Melina with creating wireframes for application
    • COMPLETE – We have designed basic versions of the main application views, which was very helpful for clarifying some things in the design: namely, the separation of recording pages from analytics pages. We’re also planning to have separate recording views for recording repertoire (the original vision) and for recording guided warmups (which end up being more useful for formally tracking CQ over time). Below, for example, is a view of the general idea for the CQ warmup page.
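Separately from the wireframes, the recording logic behind the record page mentioned above is roughly the following – a sketch of the approach rather than our actual repo code, assuming the sounddevice and soundfile packages and placeholder constants:

```python
# Sketch of the record-page backend: grab N seconds from the default input
# device and write a WAV file. Sample rate, duration, and the file name
# are hypothetical placeholders.
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 44100   # Hz
DURATION = 5          # seconds

def record_clip(path="recording.wav", duration=DURATION):
    frames = sd.rec(int(duration * SAMPLE_RATE),
                    samplerate=SAMPLE_RATE, channels=1)
    sd.wait()                        # block until recording finishes
    sf.write(path, frames, SAMPLE_RATE)
    return path

if __name__ == "__main__":
    print("Saved", record_clip())
```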

Overall, I’m pleased with this progress, as it gives us some good ground for developing the app further, and I’m on track with what we have on the schedule.

DearPyGUI

As noted in our team update, we’re seriously considering switching frameworks from Tkinter to DearPyGUI for the sake of its more seamless mechanisms for data visualization. I’m a bit torn on this, because Tkinter has very good documentation and is more of a standard choice – plus, I’ve already spent some time learning about it and getting the application started. On the other hand, I don’t want to be knee-deep in a Tkinter application and suddenly realize that it doesn’t give us the functionality we need – and I’d love to have a more modern look and feel for our app, though that’s not a top priority.
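For a sense of what that “more seamless data visualization” would look like, here’s the kind of minimal plotting experiment I’m planning for the sample app – a sketch based on my reading of the DearPyGUI docs, not code we’ve committed to:

```python
# Quick feel for DearPyGUI's built-in plotting (a sketch, not app code):
# draw a stand-in waveform in a plot widget with labeled axes.
import math
import dearpygui.dearpygui as dpg

xs = [i / 100 for i in range(1000)]
ys = [math.sin(2 * math.pi * x) for x in xs]   # stand-in for an audio signal

dpg.create_context()
with dpg.window(label="Signal preview", width=600, height=400):
    with dpg.plot(label="Waveform", height=-1, width=-1):
        dpg.add_plot_axis(dpg.mvXAxis, label="time (s)")
        y_axis = dpg.add_plot_axis(dpg.mvYAxis, label="amplitude")
        dpg.add_line_series(xs, ys, label="audio", parent=y_axis)

dpg.create_viewport(title="DPG test", width=640, height=480)
dpg.setup_dearpygui()
dpg.show_viewport()
dpg.start_dearpygui()
dpg.destroy_context()
```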

Goals for Next Week

  • Write my parts for the Design Review report
  • Research & make a decision about DearPyGUI (ASAP)
    • Copy a sample DearPyGUI app and gauge its intuitiveness
    • Determine a more specific list of plots that we’d like to have for our final project, and see if there’s anything that DearPyGUI clearly does better
    • Rewrite what I’ve done so far if we do decide to switch
  • Better documentation for code
    • README file for setting up dependencies on different OS
    • Better code comments
    • Better MVP formalization
  • Record Screen
    • User can choose to either save/discard a recording instead of auto-saving
    • User can customize naming a recording upon saving
  • (Optional) Playback Screen (if we do switch to DearPyGUI, I don’t think I’ll have time for this)
    • Add cursor that moves while playing
    • Let user drag cursor to various points in the playback
    • Add separation of signal into rows + scrolling

Susanna’s Status Report for 2/15/2025

This week, I learned more about the EGG, its setup, and ground-truth research methods along with the rest of the team. I also continued to work on building my Tkinter skills and thinking about the frames needed for the desktop app.

School of Music Conversations

Before getting into technical stuff, I wanted to note some new thoughts from conversations with School of Music collaborators, particularly with regard to options for sheet music syncing and microphone recording. One thing that we hadn’t thought much about is the need for an accompanist to help our singers stay on key if we’re doing anything beyond basic warmups (where it would be relatively easy to play ascending/descending starting notes). Fortunately, there are accompanists in the SoM who would be willing to help with this. I suppose one possible complication could be getting a clear microphone signal when multiple sound sources are present. However, a strong possible advantage of this setup is that it might actually help us with sheet music syncing: if the accompanist uses a MIDI keyboard, the direct MIDI input will be comparatively easy to match to the MusicXML file. This would be a helpful way of syncing to sheet music without forcing our vocalists to stay on a super strict tempo.
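As a very rough illustration of the MIDI side of that idea – entirely hypothetical for now, assuming the accompaniment gets captured as a standard MIDI file and using the mido package:

```python
# Hypothetical sketch: pull note-on times (in seconds) out of a recorded
# accompaniment MIDI file; these could later be matched against measure
# offsets parsed from the MusicXML score.
import mido

def note_onsets(midi_path):
    onsets = []
    elapsed = 0.0
    for msg in mido.MidiFile(midi_path):   # iteration yields delta times in seconds
        elapsed += msg.time
        if msg.type == "note_on" and msg.velocity > 0:
            onsets.append((elapsed, msg.note))
    return onsets

# e.g., onsets[0] might be (0.52, 60): middle C played half a second in.
```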

Speaking of the tempo issue, our vocalists expressed mixed feelings about singing to a metronome. On the one hand, both of them appreciate, to an extent, the opportunity to do more practice with a metronome. However, they also worry about having the room to properly and fully form notes as they sing, which (as I understand it) could be impeded by an overly rushed or strict tempo. I think we could definitely sacrifice some level of matching accuracy to give some leeway here. It’s also worth noting that both vocalists strongly preferred the idea of an audible metronome to a visual one. Still, I’ve come to tentatively prefer the MIDI idea (if that ends up being possible), given that an accompanist will be necessary in any case, and adding an accompanist would make the metronome option more complicated besides.

That being said, we’ve decided that sheet music syncing is a stretch goal for now – even though, as a musician, I think this will be one of the most useful extra things we can do, and I’d definitely like to prioritize it if possible! However, it is secondary to simply recording and displaying the EGG/audio signals themselves, so I’m going to put a pin in this rabbit hole until I get more of the basics done.

Tkinter Learning

I’d never used Tkinter before, so I followed parts of this Real Python tutorial to get started, with a particular focus on frame widgets (which I plan to use for switching between windows in the application).

Based on my experience with building web applications, and given the number of frames in the application, I decided to follow a Model-View-Controller (MVC) paradigm for the desktop app to keep things organized. After some trial and error, I worked through this tutorial by Nazmul Ahsan to learn how this can be done in Tkinter.
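The core trick is stacking frames and raising whichever one is active; here’s a stripped-down sketch of the idea (the page names are placeholders, not our real views):

```python
# Frame-switching sketch in Tkinter: the App acts as the controller,
# owning one instance of each page frame and raising the active one.
import tkinter as tk

class HomePage(tk.Frame):
    def __init__(self, parent, controller):
        super().__init__(parent)
        tk.Label(self, text="Home").pack()
        tk.Button(self, text="Go to Record",
                  command=lambda: controller.show("RecordPage")).pack()

class RecordPage(tk.Frame):
    def __init__(self, parent, controller):
        super().__init__(parent)
        tk.Label(self, text="Record").pack()
        tk.Button(self, text="Back",
                  command=lambda: controller.show("HomePage")).pack()

class App(tk.Tk):
    def __init__(self):
        super().__init__()
        container = tk.Frame(self)
        container.pack(fill="both", expand=True)
        self.frames = {}
        for Page in (HomePage, RecordPage):
            frame = Page(container, self)
            self.frames[Page.__name__] = frame
            frame.grid(row=0, column=0, sticky="nsew")   # stack all pages
        self.show("HomePage")

    def show(self, name):
        self.frames[name].tkraise()   # bring the requested page to the front

if __name__ == "__main__":
    App().mainloop()
```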

Progress Status

I am still on schedule with my parts of the project according to the plan. That said, I haven’t really made progress on the code for the application itself, and I think I ought to have set aside more explicit time in the schedule for simply getting to know the technologies that I’m using for this project.

Goals for Next Week

  • Implement the basic MVC application with homepage
  • Create GitHub repo
  • Create baseline audio recording page functionality
  • Create baseline audio playback page functionality (with room for adding a future EGG signal)
  • Help Melina with creating wireframes for application

Team Status Report for 2/8/2025

This week, our main focus has been on trying to track down our EGG sensor. We got in touch with Professor Leah Helou at the University of Pittsburgh, but were unable to actually pick up the sensor. A new pickup has been scheduled for early this upcoming week.

We also did further reading on the laryngeal closed quotient, getting more clarity on how our app ought to measure CQ and give feedback. Some important methodology concerns came up in these readings as well, such as how to get a proper CQ measurement from sung notes that vary in frequency.
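For a sense of what the measurement might look like in code, here is a rough numpy sketch of a criterion-level CQ estimate – an assumption on our part rather than a settled method, with a placeholder threshold and no real cycle segmentation:

```python
# Rough criterion-level CQ sketch (assumed approach, not final code):
# within one glottal cycle, CQ is taken as the fraction of samples where
# the EGG signal sits above a threshold set at some percentage of that
# cycle's peak-to-peak amplitude.
import numpy as np

def closed_quotient(egg_cycle, criterion=0.3):
    """egg_cycle: 1-D array of EGG samples covering one glottal period."""
    lo, hi = egg_cycle.min(), egg_cycle.max()
    threshold = lo + criterion * (hi - lo)
    return np.mean(egg_cycle > threshold)   # fraction of the cycle "closed"

# Example on a synthetic cycle; a real pipeline would first segment the EGG
# signal into cycles, which is where varying pitch makes things tricky.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
print(closed_quotient(np.sin(t)))
```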

We also worked on making decisions about and setting up our technical stack. We’ve decided to use a Python backend due to the accessibility of signal-processing libraries like Librosa and music21. For the frontend, we will tentatively be using Tkinter for our GUI. See a more detailed schedule for the frontend development in Susanna’s update post (specifically with regard to the recording and replay pages).

On the signals processing side of things, we were hoping to use VoceVista to process the EGG signal. However, this is an expensive application, and we’re not sure if it’ll be possible for us to get a free trial to test it out.

Susanna’s Status Report for 2/8/25

While we’re still waiting to actually acquire the EGG sensor, my goal is to get a head start on planning out the recording feature and sheet music tracking. Ideally, the user will be able to input sheet music along with a recording of themselves singing, and when they analyze their feedback in the app, the two will sync up. While the sheet music element wasn’t initially something we deemed important, it became a priority because it would dramatically increase usability, allowing the user to visually interact with the app’s feedback rather than being forced to relisten to a potentially lengthy audio file to get context for the EGG signal they see. It’s also a significant way of differentiating our project from our main existing “competitor,” VoceVista, which doesn’t have this feature. Additionally, if for some reason the EGG falls through, this mechanism will still be useful if we end up measuring a different form of vocal health.

General Brainstorm

Here’s what I’m envisioning, roughly, for this particular view of the app. Keep in mind that we don’t yet know how the feedback is going to look, or how we want to display the EGG signal itself (is the waveform showing vocal fold contact actually useful? Maybe we just want to show the closed quotient over time?). But that’s not my problem right now. Also note that there ought to be an option for inputting and viewing an audio/EGG signal without a corresponding sheet music input, as that’s not always going to be possible or desirable.

The technical challenges here are far from trivial. Namely:

  1. How do we parse sheet music (what should be sung)?
  2. How do we parse the audio signal input (what’s actually being sung)?
  3. How do we sync these things up?

In researching how, exactly, this synchronization could be achieved, I found the prospect of a complicated system using pitch recognition and onset detection daunting (though we do hope to incorporate pitch detection later on as one of our analytic metrics). Something like fully converting the input signal into a MIDI file is difficult to achieve accurately even with established software – basically a project in its own right.

So here’s an idea: what if the whole synchronization is just based on tempo? If we know exactly how long each beat is, and can detect when the vocalist starts singing, we can match each measure to a specific timestamp in the audio recording without caring about the content of what’s actually being sung (see the sketch after this list). This means that all the app needs to know is:

  1. Tempo
    1. This should be adjustable by user
    2. Ideally, the user should be able to set which beats they want to hear the click on (e.g., full measure vs. every beat)
  2. Time signature
    1. Could be inputted by user, or extracted from the score
  3. Where measure divisions occur in the score 
    1. This should be detectable in a MusicXML formatted file (the common file format used by score editors, and probably what we’ll require as the format for the sheet music input)
    2. However, if this is the only thing that needs to be extracted from the score, reading just the measure divisions off of a PDF seems more doable than the amount of parsing originally assumed necessary (extracting notes, rhythms, etc)
      1. But let’s not overcomplicate!
  4. When the vocalist starts singing
    1. Onset detection
    2. Or, the recording page could already include the sheet music, provide a countdown, and scroll through the music as it’s being sung along to the click track. That could be cool, and a similar mechanism to what we’d be using for playback.
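Here’s the minimal arithmetic behind this idea, with made-up numbers just to show the shape of it:

```python
# Tempo-only sync sketch (hypothetical numbers): given tempo, time
# signature, and when the singer starts, every measure maps to a timestamp
# in the recording without analyzing the audio content at all.

def measure_timestamps(n_measures, tempo_bpm, beats_per_measure, start_sec=0.0):
    seconds_per_measure = beats_per_measure * 60.0 / tempo_bpm
    return [start_sec + i * seconds_per_measure for i in range(n_measures)]

# e.g., 8 measures of 4/4 at 90 bpm, singer enters 2 seconds in:
# measure 1 starts at 2.0 s, measure 2 at ~4.67 s, measure 3 at ~7.33 s, ...
print(measure_timestamps(8, 90, 4, start_sec=2.0))
```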

Potential difficulties/downsides:

  1. Irregularities in beats per measure
    1. Pickup notes that are less than the full measure
    2. Songs that change time signature partway through, repeats, D.C. al Coda, etc.
      1. We just won’t support these things, unless, maybe, time signature changes come up a lot
  2. A strict tempo reduces the vocalist’s expressive toolbox 
    1. Little to no rubato, no tempo changes
    2. Will potentially make vocalists self-conscious, not sing like they normally would
    3. How much of an issue is this? A question for our SoM friends
      1. Sacrifice some accuracy in the synchronization for the sake of a bit of leeway for vocalists? 
      2. If an individual really hates it, or it’s not feasible for a particular song, could just turn off sheet music sync for that specific recording. Not necessarily a deal breaker for the feature as a whole.
  3. Click track is distracting
    1. If it’s in an earpiece, could make it harder to hear oneself
      1. Using something blinking on screen, instead of an audio track, is another option
      2. And/or, will singing opera make it hard to hear the click track?
    2. If it’s played on speaker, could be distracting on playback, plus make the resulting audio signal more difficult to parse for things like pitch
      1. This could be mitigated somewhat by using a good microphone?
      2. Again, the visual signal is an option?
  4. Latency between producing click track and hearing it, particularly if using an earpiece

Despite these drawbacks, the overall simplicity of this system is a major upside, and even if this particular mechanism ends up insufficient in some way, I think it’ll provide a good foundation for our sheet music syncing. My main concerns are how this might limit vocalists’ expression, and how distracting a click track might be to the singer. These are things I’ll want to bring up at next week’s meeting with School of Music folks.

Workflow

In the meantime, this will be my order of operations:

  1. Finalize framework choice
    1. We’ve decided that we (almost certainly) will be building a desktop app with a Python backend (the complications of a webapp seem unnecessary for a project that only needs to run on a few devices in a lab setting, and Python has very accessible/expansive libraries for signal processing)
    2. The frontend, however, is still up in the air – this is going to be my main domain, so it needs to be something that I feel comfortable learning relatively quickly
  2. Set up basic “Hello World”
    1. Figure out a basic page format
      1. Header
      2. Menu
  3. Create fundamental music view
    1. Both the recording and playback pages will use the same foundational format
    2. Parse & display the MusicXML file input
      1. This could be very complicated. I have no idea.
    3. Separate music into lines, with room for modularity (to insert EGG signal data etc)
    4. Figure out where measure separations are
      1. Measures will likely all need to be equal widths for the sake of signal matching, ultimately? Not sure if this is crucial yet
    5. Create cursor that moves over score in time, left to right, when given tempo
      1. On the record page, tempo will be given explicitly by the user. On the playback page, the tempo will be part of the recording’s metadata
      2. How the cursor aligns with the measures will depend on the MusicXML file format and what is possible. We could just figure out the amount of space between measure separation lines and move the cursor smoothly across, given the amount of time per measure (see the cursor sketch after this list)
    6. Deal with scrolling top-to-bottom on page
  4. Create record page
    1. Set tempo
    2. Record audio 
      1. While music scrolls by, or by itself
    3. Countdown
    4. Backing click track
  5. Create playback page
    1. Audio sync with measures + playback
    2. Play/pause button
    3. Drag cursor
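As a toy proof of concept for the moving cursor (step 3.5 above), something like the following Tkinter sketch should work – the tempo, measure width, and plain-line “measures” are all placeholders:

```python
# Scrolling-cursor sketch in Tkinter: move a vertical line across a row of
# equal-width measures at a rate derived from the tempo, updating every
# 30 ms via after(). All numbers here are hypothetical.
import tkinter as tk

TEMPO_BPM = 90
BEATS_PER_MEASURE = 4
MEASURE_WIDTH_PX = 150          # assumes equal-width measures
TICK_MS = 30

root = tk.Tk()
canvas = tk.Canvas(root, width=600, height=120, bg="white")
canvas.pack()
for i in range(1, 4):           # fake measure separation lines
    canvas.create_line(i * MEASURE_WIDTH_PX, 0, i * MEASURE_WIDTH_PX, 120)
cursor = canvas.create_line(0, 0, 0, 120, fill="red", width=2)

px_per_second = MEASURE_WIDTH_PX / (BEATS_PER_MEASURE * 60.0 / TEMPO_BPM)

def advance():
    canvas.move(cursor, px_per_second * TICK_MS / 1000.0, 0)
    if canvas.coords(cursor)[0] < 600:   # stop at the right edge
        root.after(TICK_MS, advance)

advance()
root.mainloop()
```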

Picking Framework & Technologies

I considered several options for a frontend framework for the Python desktop app. I was initially interested in using Electron.js, as this would work well with my existing knowledge of HTML/CSS/JavaScript and would allow for a high level of customizability. However, Electron.js has very high overhead and would introduce the complication of essentially running an entire web browser instance in order to run the app. I also considered PyQt, but I haven’t had experience with that framework, and it seemed like its idiosyncrasies would mean a pretty steep learning curve. So, I’ve decided to proceed with Tkinter, the standard Python GUI toolkit – in my small experiments, it has seemed straightforward so far, and since I’m starting out with some of the more complicated views of my GUI, I think I’ll be able to tell fairly quickly whether or not it’ll be sufficient for my purposes.

Finally, I wanted to learn how to parse and display MusicXML files (such as those exported from MuseScore) in a Python program. I’ve started by learning how to use music21, a package developed in collaboration with the MIT music department that offers a wide range of functionality for dealing with MusicXML files. I’ve only really had time to get the library set up, so I have yet to do much with it directly. Till next week!
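For the curious, this is the sort of thing I’m hoping music21 will make easy – a sketch based on its documentation that I haven’t actually run yet, with a hypothetical file name:

```python
# music21 sketch: pull the time signature and measure offsets out of a
# MusicXML score, which is exactly the information the tempo-based
# syncing idea needs.
from music21 import converter, meter

score = converter.parse("aria.musicxml")
part = score.parts[0]                       # the vocal line

ts = part.recurse().getElementsByClass(meter.TimeSignature)[0]
print("Time signature:", ts.ratioString)    # e.g. "4/4"

for m in part.getElementsByClass("Measure"):
    # m.offset is measured in quarter notes from the start of the part.
    print(f"measure {m.number} starts at offset {m.offset}")
```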