Melina’s Status Report for 2/15/2025

EGG Hunt

  • Picked up all hardware components associated with the EGG EG2-PCX from Prof. Helou, along with the corresponding digital user manual
  • This includes a simple microphone, which we plan to replace with a more sophisticated one

Ground-Truth Research

  • Annotated reading “Results from a pilot longitudinal study of electrolaryngographically derived closed quotient for adult male singers in training” (David Howard)
  • Annotated reading “The Physiology of Vocal Damping: Historical Overview and Descriptive Insights in Professional Singers” (Marco Fantini)
  • Annotated reading “Musical Theater and Opera Singing—Why So Different? A Study of Subglottal Pressure, Voice Source, and Formant Frequency Characteristics” (Eva Björkner)
  • Concluded that identifying a “proper” CQ would not be the most appropriate solution approach, since such a metric would require data on years of singing experience and genre. That data is limited at this time and would not be reliable for the purposes of our project
    • Because of this we have adjusted our scheduled tasks to reflect this change
      • Understand/identify improper CQ CANCELLED – we want to move away from defining a universal truth for what a proper CQ is, focusing instead on helping vocalists track and understand their formant tuning
      • Inform about change in CQ ← changed from detect/warn about improper CQ
  • Proposed that our project solution approach should shift from warning of “improper” CQ to providing two more flexible analysis tools for tracking CQ over time
    • Analysis tool 1: Allowing user to view their CQ at a given moment for a specific recording playback
    • Analysis tool 2: Providing user with a visual representation of an evaluation of their CQ range over time
      • This would be approached by asking the user to record a controlled vocal exercise, such as an octave warmup that covers their tessitura (comfortable vocal range)
      • The CQ ranges for these controlled (constant) exercises can be summarized visually over time as suggested by David Howard’s graph of idealized changes in CQ
    • Proposed our use case and solution approach should shift to focus on advanced vocalists in one genre: opera singers
      • CQ can be significantly more difficult to measure for untrained singers; in fact, David Howard had to completely discard some data samples from untrained singers due to unreliable CQ measurements
      • Unreliable CQ measurements are detrimental to our project, as an incorrect analysis could mislead a vocalist to make unhealthy decisions
      • CQ has also been found to vary significantly with genre, and as of now, we only have guaranteed access to opera singers
    • Created a ground-truth metric
      • EGG passes calibration test with the laryngeal simulator before and after usage
        • This is a handy calibration hardware component that came with the EGG itself, thanks to Prof. Helou
      • Use built-in Larynx Tracking & Signal-Strength indicators
      • CQ measurement must be at least 20% to be considered detected
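The detection rule in that last bullet is simple enough to sketch in Python (our chosen backend language). The function names and list-based interface here are hypothetical, but the 20% cutoff is the metric described above:

```python
def cq_detected(cq_percent, threshold=20.0):
    """A CQ measurement only counts as 'detected' if it is at least
    20%, per our ground-truth metric."""
    return cq_percent >= threshold

def filter_reliable(cq_series, threshold=20.0):
    """Keep only the CQ samples (in percent) that pass the detection
    threshold, so unreliable readings never reach the analysis tools."""
    return [cq for cq in cq_series if cq_detected(cq, threshold)]
```

Keeping the reliability gate separate from the analysis tools means the threshold can be tuned later without touching any visualization code.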

Schedule

  • COMPLETED Acquired sensor
  • COMPLETED Create a “Ground-Truth” metric

My scheduled tasks are on time and have been slightly adjusted as described above (reiterated below):

  • Understand/identify improper CQ CANCELLED – we want to move away from defining a universal truth for what a proper CQ is, focusing instead on helping vocalists track and understand their formant tuning
  • Inform about change in CQ ← changed from detect/warn about improper CQ

Next Steps

My goals for this next week include the following deliverables:

  • Present Design Review Slides
  • Draft Design Review Report with the team
  • Ensure the team has a repository set up along with agreement on code style/organizational expectations
  • Drafting wireframes for the frontend with feedback from Susanna
  • Draft code for pitch analysis

Tyler’s Status Report for 2/15/2025

So far I have been able to get the electroglottograph to work, as well as learn about the larynx simulator attachment we received. For VoceVista, however, I am a little worried that if the trial version runs out before I can input the software key, I will lose any files I have uploaded into VoceVista. PhaseComp is my backup plan for continuing to use the electroglottograph between the end of the trial and the purchase of VoceVista.

The larynx simulator is able to mimic larynx vibrations and is quite useful for verifying that our electroglottograph is producing proper data.

Some things I want to focus on are getting feedback on the recordings I upload and understanding how to physically manipulate the closed quotient. I have meetings scheduled with opera singers as well, to test out placements of the sensors to get the best data. Another thing I am not sure of right now is how the microphone is incorporated with the electroglottograph and why it needs to be connected to it.

So far I am at least on track, if not slightly ahead of schedule, as I keep experimenting with VoceVista as well as the electroglottograph.

Tyler’s Status Report for 2/8/2025

Voce Vista Usage

  • Voce Vista is relatively expensive software; I emailed them through their student discount application and hopefully will be approved to purchase a student-discounted version for our project
  • Researched how Voce Vista displays its analysis
    • In Voce Vista, CQ is displayed as a percentage in the analysis panel
    • The CQ is calculated based on the proportion of the glottal cycle where the vocal folds are closed
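A minimal sketch of that calculation, assuming a simple criterion-level method (VoceVista's exact algorithm isn't documented here, so the 25% criterion level and the function name are assumptions of mine):

```python
import numpy as np

def closed_quotient(cycle, criterion=0.25):
    """Estimate CQ for a single glottal cycle of an EGG waveform.

    Criterion-level method: the folds are treated as 'closed' while the
    EGG amplitude sits above a level set `criterion` of the way between
    the cycle's minimum and maximum.
    """
    cycle = np.asarray(cycle, dtype=float)
    level = cycle.min() + criterion * (cycle.max() - cycle.min())
    return float(np.mean(cycle > level))  # fraction of the cycle 'closed'
```

For example, a stylized cycle that spends 40% of its period at high amplitude yields a CQ of 0.4, i.e. 40%.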

Purchasing components for EGG sensor

  • As we have not yet acquired the electroglottograph sensor, I have not been able to purchase the parts yet
  • Prof. Helou has a lot of the auxiliary items we will need; as soon as we acquire the EGG, we need to take inventory to see what else we will need

Next week’s tasks

  • After acquiring the electroglottograph, purchase software to use with the sensor (Voce Vista if they give us a student discount, otherwise PhaseComp)
  • Start measuring/utilizing Voce Vista to determine what type of data we can acquire from the EGG

Team Status Report for 2/8/2025

This week, our main focus has been on trying to track down our EGG sensor. We got in touch with Professor Leah Helou at the University of Pittsburgh but were unable to actually pick up the sensor. Pickup has been rescheduled for early this upcoming week.

We also did further reading about the laryngeal closed quotient, getting more clarity on how our app ought to measure CQ and give feedback. There were also some important methodology concerns discussed further through the readings, such as getting a proper CQ measurement from sung notes that vary in frequency.

We worked on making decisions about and setting up our technical stack. We’ve decided to use a Python backend due to the accessibility of signal-processing libraries like Librosa and music21. For the frontend, we will tentatively be using Tkinter for our GUI. See a more detailed schedule for the frontend development in Susanna’s update post (specifically with regard to the recording and replay pages).

On the signals processing side of things, we were hoping to use VoceVista to process the EGG signal. However, this is an expensive application, and we’re not sure if it’ll be possible for us to get a free trial to test it out.

Melina’s Status Report for 2/8/25

Ground-Truth Research

  • Found a collection of EGG-related research articles available for free through CMU account login at ScienceDirect
  • Annotated research article “Variation of electrolaryngographically derived closed quotient for trained and untrained adult female singers” (David Howard)
    • Found important data that will likely be useful for deriving a ground-truth metric of ideal CQs for men and women
    • Identified methodology concern about basing ideal CQs on the data presented in this paper
      • The researchers note that the ideal CQs they derive are based on singers who are already trained, and that more research should be done to confirm these ideal ranges
      • The researchers suggest following changes in CQ ranges for singers being trained over time. If, by the time those singers are considered “trained”, their CQ ranges match those of the trained singers in this study, that would suggest the proposed ideal CQ ranges are a good basis for ground truth
    • Identified methodology concern about CQ measurement for singing that includes varying frequencies
      • The laryngeal height changes significantly between low and high pitches, which can cause inaccurate CQ measurements from the sensors
      • Researchers in this paper addressed this by asking singers to adjust the placement of the sensors with their fingers as they changed pitch, but this would be a problem for us, since we are trying to develop a methodology that is comfortable and does not significantly impact singing physically or psychologically (i.e., making the singers overly self-conscious while they sing)
    • Identified statistically significant CQ trends observed in trained singers
      • Trained singers were significantly associated with higher CQs across the board; “Howard et al. (23) suggest that an increase in CQ with singing training/experience could be interpreted in terms of a more physically efficient voice usage”
      • Trained male singers tend to have a CQ range that remains constant
      • Trained female singers tend to have a CQ range that increases with frequency
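These trends suggest a simple check we could eventually run on our own recordings: fit a line to CQ against fundamental frequency and inspect its slope. This is an illustrative numpy sketch of that idea, not the statistical analysis used in the cited paper:

```python
import numpy as np

def cq_frequency_slope(freqs_hz, cq_values):
    """Fit a degree-1 polynomial to CQ vs. fundamental frequency and
    return the slope. A slope near zero matches the roughly constant CQ
    range reported for trained male singers; a positive slope matches
    the increase with frequency reported for trained female singers."""
    slope, _intercept = np.polyfit(freqs_hz, cq_values, 1)
    return float(slope)
```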

EGG Hunt

  • Met with Prof. Helou at Forbes Tower to pick up the EGG, but she had not actually had a chance to find it yet; it is most likely in her other office at PMC
  • Prof. Helou plans to pick up the EGG from her PMC office on Monday and drop it off at CMU campus either Monday or Tuesday
  • During my visit, we discussed the CQ measurement methodology concern that the research article “Variation of electrolaryngographically derived closed quotient for trained and untrained adult female singers” (Howard) brings up
    • Prof. Helou confirmed this is an important methodology concern that she encountered, where she also mentioned it was difficult to move the sensors with the changing laryngeal height
    • Prof. Helou suggested considering limiting the variation in frequency range for a given measurement, however this would also mean our app would no longer be geared towards providing visual feedback for any music the user chooses to sing, since the change in frequency of sung notes would have to be limited
    • As described in Ground-Truth Research, asking the user to support the sensors manually is another proposed solution, but other approaches should be researched and considered for appropriate and comfortable sensor calibration

Susanna’s Status Report for 2/8/25

While we’re still waiting to actually acquire the EGG sensor, my goal is to get a head start on planning out the recording feature + sheet music tracking. Ideally, the user will be able to input sheet music and a recording of them singing, and when they analyze their feedback in the app, the two will sync up. While the sheet music element wasn’t initially something we deemed important, it became a priority due to how this would dramatically increase usability, allowing the user to visually interact with the app’s feedback rather than be forced to relisten to a potentially lengthy audio file in order to get context for the EGG signal they see. It’s also a significant way of differentiating our project from our main existing “competitor,” VoceVista, which doesn’t have this feature. Additionally, if for some reason the EGG thing falls through, this mechanism will still be useful if we end up measuring a different form of vocal health.

General Brainstorm

Here’s what I’m envisioning, roughly, for this particular view of the app. Keep in mind that we don’t yet know how the feedback is going to look, or how we want to display the EGG signal itself (is the waveform showing vocal fold contact actually useful? Maybe we just want to show the closed quotient over time?). But that’s not my problem right now. Also note that there ought to be an option for inputting and viewing an audio/EGG signal without a corresponding sheet music input, as that’s not always going to be possible or desirable.

The technical challenges here are far from trivial. Namely:

  1. How do we parse sheet music (what should be sung)?
  2. How do we parse the audio signal input (what’s actually being sung)?
  3. How do we sync these things up?

In researching how, exactly, this synchronization could be achieved, I found the prospect of a complicated system using pitch recognition and onset detection daunting (though we do hope to incorporate pitch detection later on as one of our analytic metrics). Fully converting the input signal into a MIDI file is difficult to achieve accurately even with established software; it's basically a project in its own right.

So here’s an idea: what if the whole synchronization is just based on tempo? If we know exactly how long each beat is, and can detect when the vocalist starts singing, we can match each measure to a specific timestamp in the audio recording without caring about the content of what’s actually being sung. This means that all the app needs to know is:

  1. Tempo
    1. This should be adjustable by user
    2. Ideally, the user should be able to set which beats they want to hear the click on (eg, full measure vs every beat)
  2. Time signature
    1. Could be inputted by user, or extracted from the score
  3. Where measure divisions occur in the score 
    1. This should be detectable in a MusicXML formatted file (the common file format used by score editors, and probably what we’ll require as the format for the sheet music input)
    2. However, if this is the only thing that needs to be extracted from the score, reading just the measure divisions off of a PDF seems more doable than the amount of parsing originally assumed necessary (extracting notes, rhythms, etc)
      1. But let’s not overcomplicate!
  4. When the vocalist starts singing
    1. Onset detection
    2. Or, the recording page could already include the sheet music, provide a countdown, and scroll through the music as it’s being sung along to the click track. That could be cool, and a similar mechanism to what we’d be using for playback.
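Under the strict-tempo assumption, the whole synchronization reduces to a few lines. This is a sketch with names of my own choosing; `start_time` stands in for whatever onset detection (or the countdown) gives us:

```python
def measure_timestamps(n_measures, tempo_bpm, beats_per_measure, start_time=0.0):
    """Map each measure index to its start timestamp in the recording.

    Assumes a strict tempo and a constant time signature, so measure i
    begins at start_time + i * (beats_per_measure * 60 / tempo_bpm).
    """
    seconds_per_measure = beats_per_measure * 60.0 / tempo_bpm
    return [start_time + i * seconds_per_measure for i in range(n_measures)]
```

With a tempo of 120 BPM and 4 beats per measure, a new measure starts every 2 seconds, so the content of what's sung never needs to be analyzed for the sync itself.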

Potential difficulties/downsides:

  1. Irregularities in beats per measure
    1. Pickup notes that are less than the full measure
    2. Songs that change time signature partway through, repeats, DC Al Coda, etc
      1. We just won’t support these things, unless, maybe, time signature changes come up a lot
  2. A strict tempo reduces the vocalist’s expressive toolbox 
    1. Little to no rubato, no tempo changes
    2. Will potentially make vocalists self-conscious, not sing like they normally would
    3. How much of an issue is this? A question for our SoM friends
      1. Sacrifice some accuracy in the synchronization for the sake of a bit of leeway for vocalists? 
      2. If an individual really hates it, or it’s not feasible for a particular song, could just turn off sheet music sync for that specific recording. Not necessarily a deal breaker for the feature as a whole.
  3. Click track is distracting
    1. If it’s in an earpiece, could make it harder to hear oneself
      1. Using something blinking on screen, instead of an audio track, is another option
      2. And/or, will singing opera make it hard to hear the click track?
    2. If it’s played on speaker, could be distracting on playback, plus make the resulting audio signal more difficult to parse for things like pitch
      1. This could be mitigated somewhat by using a good microphone?
      2. Again, the visual signal is an option?
  4. Latency between producing click track and hearing it, particularly if using an earpiece
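For the audio-click option, a bare-bones track could be synthesized with numpy. This is a sketch under assumed parameters (44.1 kHz sample rate, 30 ms clicks at 1 kHz); the visual-blink alternative above sidesteps the audio entirely:

```python
import numpy as np

def click_track(n_beats, tempo_bpm, sr=44100, click_hz=1000.0, click_len=0.03):
    """Synthesize a click track: a short sine 'click' at every beat,
    returned as a numpy array of samples at sample rate `sr`."""
    beat_len = 60.0 / tempo_bpm
    total = np.zeros(int(sr * beat_len * n_beats))
    click = np.sin(2 * np.pi * click_hz * np.arange(int(sr * click_len)) / sr)
    for b in range(n_beats):
        start = int(b * beat_len * sr)
        end = min(start + len(click), len(total))  # guard against overrun
        total[start:end] += click[:end - start]
    return total
```

The array could then be written to a WAV file or streamed to an output device; either way, playing it through a speaker risks bleeding into the microphone signal, as noted above.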

Despite these drawbacks, the overall simplicity of this system is a major upside, and even if this particular mechanism ends up insufficient in some way, I think it’ll provide a good foundation for our sheet music syncing. My main concerns are how this might limit vocalists’ expression, and how distracting a click track might be to the singer. These are things that I will want to bring up at next week’s meeting with School of Music folks.

Workflow

In the meantime, this will be my order of operations:

  1. Finalize framework choice
    1. We’ve decided that we (almost certainly) will be building a desktop app with a Python backend (the complications of a webapp seem unnecessary for a project that only needs to run on a few devices in a lab setting, and Python has very accessible/expansive libraries for signal processing)
    2. The frontend, however, is still up in the air; this is going to be my main domain, so it needs to be something that I feel comfortable learning relatively quickly
  2. Set up basic “Hello World”
    1. Figure out a basic page format
      1. Header
      2. Menu
  3. Create fundamental music view
    1. Both the recording and playback pages will use the same foundational format
    2. Parse & display the MusicXML file input
      1. This could be very complicated. I have no idea.
    3. Separate music into lines, with room for modularity (to insert EGG signal data etc)
    4. Figure out where measure separations are
      1. Measures will likely need to all be equal widths for the sake of signal matching ultimately? Not sure if this is crucial yet
    5. Create cursor that moves over score in time, left to right, when given tempo
      1. On the record page, tempo will be given explicitly by the user. On the playback page, the tempo will be part of the recording’s metadata
      2. How the cursor aligns with the measures will depend on the MusicXML file format and what is possible. Could just figure out amount of space between measure separation lines + move cursor smoothly across, given the amount of time per measure
    6. Deal with scrolling top-to-bottom on page
  4. Create record page
    1. Set tempo
    2. Record audio 
      1. While music scrolls by, or by itself
    3. Countdown
    4. Backing click track
  5. Create playback page
    1. Audio sync with measures + playback
    2. Play/pause button
    3. Drag cursor
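The cursor math in step 5 boils down to a small pure function, assuming equal-width measures (the function and parameter names here are hypothetical, not settled API):

```python
def cursor_x(elapsed_s, tempo_bpm, beats_per_measure, measure_width_px, x0=0.0):
    """Compute the playback cursor's x position in pixels.

    The cursor moves linearly across equal-width measures; fractional
    measures map to fractional widths, so it glides smoothly rather
    than jumping at measure boundaries.
    """
    seconds_per_measure = beats_per_measure * 60.0 / tempo_bpm
    measures_elapsed = elapsed_s / seconds_per_measure
    return x0 + measures_elapsed * measure_width_px
```

On the record page this would be driven by a timer; on the playback page, by the audio position. Keeping it a pure function of elapsed time should make both uses (and dragging) easy to test.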

Picking Framework & Technologies

I considered several options for a frontend framework for the Python desktop app. I was initially interested in using Electron.js, as this would work well with my existing knowledge of HTML/CSS/JavaScript, and would allow for a high level of customizability. However, Electron.js has very high overhead, and would introduce the complications of basically having to run an entire web browser instance in order to run the app. I also considered using PyQt, but I haven’t had experience with this framework, and it seemed like its idiosyncrasies would mean a pretty steep learning curve. So, I’ve decided to proceed with using Tkinter, the standard Python GUI toolkit; in my small experiments, it has so far seemed straightforward, and since I’m starting out with some of the more complicated views of my GUI, I think I’ll be able to tell fairly quickly whether or not it’ll be sufficient for my purposes.

Finally, I wanted to learn how to parse and display MuseScore files in a Python program. I’ve started by learning how to use music21, a package developed in collaboration with the MIT music department, that offers a wide range of functionalities for dealing with MusicXML files. I’ve only really had the time to get this library set up, so I have yet to actually do much with it directly. Till next week!

Introduction and Project Summary

EGGceptional Vocals is a project designed to help vocalists and vocal coaches gain deeper insights into vocal technique using real-time data. Since it can be difficult to assess vocal form, we want to use technology to measure openness in singing form using tools such as an electroglottograph (EGG), displaying it in an easy-to-understand web application. The app will provide visual feedback, helping users track their vocal progress over time and tailor their technique based on their personal singing goals—whether they are opera singers, rock vocalists, or new beginners. By integrating hardware and software, EGGceptional Vocals offers a unique, non-invasive way to measure and improve vocal performance, making healthy singing more accessible to all.