Rahul’s Status Report for 2/25

I delivered my team’s design review presentation this week and, overall, conveyed the design requirements and status of our project well. Analyzing the feedback, I see we need to bring more of the use-case requirements to light. I will work with the team to ensure Nora emphasizes the use-case metrics in our final presentation.

Last week, I talked about the preliminary code skeleton, which I have since worked to expand. The PowerShell script that calls Audiveris now executes in the foreground so it can signal the completion of the OMR job. Following this, I modified the Python script to unzip the generated MXL file after the PowerShell job completes. For testing purposes, I also added a function that runs the whole pipeline: reading music, unzipping, passing the XML into Python data structures, and then playing the music data through the computer speakers. This was possible after doing more research into the music21 library and understanding its formats and syntax.

I have also gained a better understanding of how to create our note scheduling algorithm. Once music21 has loaded the MusicXML file into a stream, it separates the notes into parts; for piano music, these are the bass clef and the treble clef. Within each of these parts I can access an array of measures, each of which contains an array of notes, rests, or chords (which are themselves essentially arrays of notes). More work will have to be done during integration to set up our own player, which will line up each note with the appropriate solenoid, but otherwise whatever note(s) the application is sitting on in the music21 structure should direct the corresponding solenoid's GPIO signal to be on.
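To pin down the traversal order the scheduler will need, here is a minimal sketch of flattening that nested structure (parts containing measures containing notes/chords) into a single event list. Plain dicts and lists stand in for music21's objects here, and the note data is made up for illustration; the real code would walk `score.parts` and each part's measures instead.

```python
def flatten_score(parts):
    """Yield (part_index, offset_in_beats, pitches) events in order.
    A chord is a list of pitch names, a single note a one-element
    list, and a rest an empty list (rests are skipped, since no
    solenoid needs to fire for them)."""
    events = []
    for p_idx, part in enumerate(parts):
        offset = 0.0  # running offset in quarter-note beats
        for measure in part:
            for item in measure:
                pitches, dur = item["pitches"], item["quarterLength"]
                if pitches:
                    events.append((p_idx, offset, pitches))
                offset += dur
    return events

# Hypothetical two-measure treble part plus a bass part.
treble = [
    [{"pitches": ["C4"], "quarterLength": 1.0},
     {"pitches": [], "quarterLength": 1.0}],           # a rest
    [{"pitches": ["E4", "G4"], "quarterLength": 2.0}], # a chord
]
bass = [
    [{"pitches": ["C2"], "quarterLength": 2.0}],
    [{"pitches": ["G2"], "quarterLength": 2.0}],
]

print(flatten_score([treble, bass]))
# [(0, 0.0, ['C4']), (0, 2.0, ['E4', 'G4']), (1, 0.0, ['C2']), (1, 2.0, ['G2'])]
```

The key point is that offsets restart per part, so the two clefs have to be merged by offset before events can drive the solenoids.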

Our team has also migrated code to GitHub. All of my contributions are pushed to my forked copy of the team repo, which allows us to verify modifications by inspecting each other's commits before merging. Overall, I am on schedule. My task for the week was to “modify XML/MIDI output to integration specs,” and I accomplished this with my preliminary music21 code. Next up, I will diagram the front-end layout for the application. If time permits (though I will probably dedicate the rest of my time to writing the design report with my team), I should also research which framework will be best for implementing the application. At the moment, pygame seems like a reasonable choice for meeting our design requirements (especially the 150 ms latency target).



Nora’s Status Report for 2/18

This week I received the Raspberry Pi 4 from ECE Receiving. After picking up the RPi, I worked on setting it up so that we could control and program it. Since we didn’t have a wired keyboard, I installed the VNC Viewer application to connect to the RPi remotely using its IP address. This allowed us to open the RPi’s desktop and type inputs to it. After seeing the full capabilities of the RPi, we considered migrating all of the code, including the UI and OMR Python application, onto the RPi to reduce latency when starting and stopping. However, given our current toolchain, we require a Windows device to run Audiveris, so we decided the microseconds to milliseconds we would save from integrating everything was not worth the added effort of setting up a new environment and OS on the RPi.

On Saturday, I worked with Aden on testing the exploratory solenoids and transistors that we bought. I wrote a simple Python file using a basic GPIO library to set pins high so that we could test the power switching. The output voltage from the RPi was above the necessary threshold voltage of the MOSFET, so the power switching was quite seamless.

One challenge we encountered when testing the solenoids was that our initial code, using the sleep function from the time library, could only get the solenoids to depress and retract about four times a second (video linked here), which is below the six-times-a-second target frequency in our use-case requirements. Although the sleep function accepts fractional seconds, its limited timing resolution is one possible source of the bottleneck. I will be working on installing and using the pigpio library so that we can have microsecond delays instead of the limited sleep function. However, if the bottleneck on the timing ends up being in the hardware itself, then we will need to rescope that requirement and change the code that limits the max tempo/smallest note value to account for this frequency cap.
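For reference, the timing budget implied by the six-presses-per-second target is tight but not extreme. A quick sketch of the arithmetic, assuming a symmetric energize/release cycle (the 50/50 split is my assumption, not a measured duty cycle):

```python
import time

TARGET_HZ = 6               # presses per second from the use-case requirements
period_s = 1.0 / TARGET_HZ  # one full press + retract cycle: ~0.167 s
half_s = period_s / 2       # ~83 ms energized, ~83 ms released

# time.sleep accepts fractional seconds, but OS scheduling jitter and
# the Python-level overhead around each GPIO call add up; measuring
# the actual elapsed time shows sleep only guarantees a *minimum*.
start = time.perf_counter()
time.sleep(half_s)
elapsed = time.perf_counter() - start
print(round(half_s, 4), elapsed >= half_s)
```

If the measured overshoot per cycle is on the order of tens of milliseconds, that alone would explain dropping from six cycles per second to four, which is what pigpio's hardware-timed delays should eliminate.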

One big change we addressed in our team status report was the switch to the music21 library for parsing rather than a custom parsing process. This resulted in me being behind schedule, but the new Gantt chart developed for our Design Review Presentation accounts for this change.

Next week I will be looking more into using music21 to extract the notes from the data and converting them into GPIO high/low signals. I was able to convert the test XML file that Rahul generated into a Stream object, as shown in the image below. However, as you can see, there are several nested streams before the list of notes is accessible, so I will need to work on un-nesting the object if I want to be able to iterate through the notes correctly.

As for the actual scheduling process, I will try to explain the vision here. We can have a count that keeps track of the current “time” unit we are at in the piece, where an increment of 1 is 1 beat. This corresponds nicely to the offsets from music21 (i.e., the values in the {curly braces}). We can also calculate the duration of each beat in milliseconds as 60,000/tempo. We will then iterate through the notes; at a given time, if a note is being played, we set the GPIO pin associated with it high and set the rest of the pins low (which is easily accomplished with batch setting from the pigpio library). We will also need a mapping function that connects each note pitch to a specific GPIO pin.
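The scheme above can be sketched in a few lines. The pitch-to-pin mapping and the note list below are hypothetical placeholders, and the real version would replace the returned pin sets with pigpio batch writes; this only shows the beat counter and offset bookkeeping:

```python
def ms_per_beat(tempo_bpm):
    """Duration of one beat in milliseconds: 60,000 / tempo."""
    return 60_000 / tempo_bpm

# Hypothetical pitch -> GPIO pin mapping (real pins TBD in integration).
PIN_FOR_PITCH = {"C4": 17, "D4": 27, "E4": 22}

def build_schedule(notes):
    """notes: (offset_in_beats, pitch, duration_in_beats) tuples.
    Returns {beat: set of pins that should be HIGH on that beat};
    any pin absent from a beat's set would be driven LOW."""
    schedule = {}
    for offset, pitch, dur in notes:
        beat = offset
        while beat < offset + dur:
            schedule.setdefault(beat, set()).add(PIN_FOR_PITCH[pitch])
            beat += 1  # the count increments by 1 per beat
    return schedule

notes = [(0, "C4", 2), (1, "E4", 1), (2, "D4", 1)]
print(ms_per_beat(120))       # 500.0 ms per beat at 120 bpm
print(build_schedule(notes))  # {0: {17}, 1: {17, 22}, 2: {27}}
```

A player loop would then tick once per `ms_per_beat(tempo)` milliseconds, look up the current beat's pin set, and batch-write it.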

Overall, the classes I have taken that helped me this week are 18-220 and 18-349. Knowledge of transistors and inductors, as well as experience with the lab power supplies that I gained from 18-220, helped me when working on the circuitry. Embedded systems skills from 18-349 were very helpful when looking at documentation and datasheets for the microcontroller and circuit components, respectively.

Rahul’s Status Report for 2/18

Since our OMR solution will run on Windows, this week I put some work into setting up the shell scripts and the Python actions to call them. While our main hub application development won’t start for a few weeks, I still wanted to build a skeleton of functionality for calling the OMR without the default Audiveris GUI that could be modified later on. For this I had to learn some features of the Windows PowerShell (.ps1) scripting language by consulting Stack Overflow. Though the syntax is not as kind as bash, the operations remain the same, and I was able to have Python execute the script via the os module. I recalled some libraries from 15-112 for opening file paths through a GUI and decided to incorporate those into the skeleton code, as this will improve our UX come app design time.

I also have spent time preparing for the design review presentation next week, as I will be delivering the presentation on behalf of my team. In the effort to expand sections of our block diagram, I felt it best to segment our project into three phases: a transcription phase, a scheduling phase, and an execution phase.

As will appear in our presentation:

I hope this will provide our audience some clarity on the technical uncertainties of our project. By doing this, I uncovered that our note scheduling was defined rather weakly and deserves more planning time. As a group we knew that converting music scores to MusicXML format was the way to go and that the Raspberry Pi could go from there. After generating the XML with Audiveris and trying to move forward with its output, we realized how much extraneous information there is just in the readable XML. This led me to do some digging on open-source XML “condensing” code, so that the data could be organized into structures more easily accessible and operable by our (to-be-determined) mode of scheduling. Fortunately, I found that MIT has poured years of experience and expertise into developing music21, a Python module that imports music file formats into data structures that can be easily traversed or manipulated, exports different file types, and can play the imported source directly (plus, they have awesome documentation). Considering the Raspberry Pi will be switching the solenoids on and off from a Python script, I foresee having music21 preprocess the XML as an important intermediate step.

In terms of staying on schedule, I needed to configure the OMR to output XML/MIDI. I consider this accomplished, since MIDI turned out not to be strictly necessary (and I found there are many resources available for XML-to-MIDI conversion). Since music21 will be able to play back our XML, our sound-quality testing will be easier as well. Next week, I will work with Nora and Aden on formalizing our scheduling to determine most if not all of the necessary transformations of the transcribed XML. Hopefully, I may get to writing a portion of the corresponding code.



Team Status Report for 2/18

This past week, we ordered and received solenoids for testing. We also received a Raspberry Pi 4 from the 18-500 inventory. This allowed us to explore the parts and gather data for metrics for the upcoming Design Review Presentation. During our meeting with Professor Sullivan, we also received feedback on a set of additional requirements that we would need to include during the Design Review.

Principles of Engineering, Science, and Mathematics

Of the seven ABET principles of STEM, we believe we incorporated principles 3, 5, 6, and 7 this week.

Our rationale for these choices is as follows:

(3) Our work towards the upcoming design review presentation involves effectively engaging and communicating to our audience how feasible our project is turning out to be after having already put together some of the pieces.

(5) Every week, we make sure to meet up outside of class at least once to regroup on our work and help each other debug or discuss integration strategies for future development based on current progress/knowledge. This week was focused on how we might go about scheduling our newly arrived solenoids for key pressing.

(6) To make sure we are meeting the quantitative targets for our design, we must gather and analyze data that motivates our design decisions. This week specifically, we used laboratory power supplies and testing equipment to measure the voltage and current needed to power our solenoids. The data we collected from this experimentation helped us determine which solenoids out of the initial batch we would go with for our final implementation.

(7) Since we are engaging with a lot of new technology on both the hardware and software sides, it was crucial for us to acquire and apply new knowledge when fleshing out our design. For instance, Rahul needed to learn to write PowerShell scripts, and Nora and Aden worked out the threshold voltage needed to switch power to the solenoids.

Risk and Risk Mitigation

The main risk for our project is the issue of safely powering multiple solenoids at once. During our testing of the solenoids, we found that the 25 N solenoid required around 0.6 A of current at 10 V, which comes out to 6 W to power one solenoid. This was much less power than we initially expected; even in our worst-case scenario with five solenoids powered at once, the total average power will be far less than anticipated (about 30 W). However, we were only able to test one solenoid, so having all solenoids drawing current at the same time may cause problems if the additional load results in current that does not scale linearly (our power supply has a max of 5 A). To mitigate this risk, we are willing to decrease our max number of simultaneous solenoids to three.
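The figures above can be double-checked with a few lines; the linear-scaling caveat is exactly what the current-draw line flags:

```python
# Measured operating point for one 25 N solenoid: ~0.6 A at 10 V.
V, I = 10.0, 0.6
per_solenoid_w = V * I             # 6.0 W per solenoid

# Worst case: five keys pressed at once, *assuming* current scales
# linearly with the number of energized solenoids (untested so far).
worst_case_w = per_solenoid_w * 5  # 30.0 W total
worst_case_a = I * 5               # 3.0 A, under the 5 A supply limit

print(per_solenoid_w, worst_case_w, worst_case_a)
```

So the 5 A supply has roughly 2 A of headroom at five solenoids; if simultaneous load pushes per-solenoid current much above 1 A, dropping to three solenoids restores the margin.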

Of the initial order of solenoids, we noticed that one arrived broken, so we could not test it. These mechanical components accelerate rather quickly and thus could be susceptible to damage, or may cause harm in cases of malfunction. They also make a lot of noise, which may interfere with the sound of the actual piano. To address these issues we may need to add padding or modify our circuit to soften the impact.

Changes Made to Design

The main change we have made to the design is the choice to incorporate the music21 Python library to aid in parsing the music. We chose to go with this library instead of writing our own parser because the MusicXML file generated by Audiveris is quite bulky and contains a lot of extraneous text, whereas the music21 library has a lot of functionality that can aid in scheduling, which, as we have been warned, is a non-trivial task. While this comes at no extra cost budget-wise, it does require alterations to the schedule to include time for learning how to use the library and incorporating it into the project. Our updated Gantt chart is included in the Design Review Presentation slides.

Aden’s Status Report For 2/18

This week I completed everything I aimed to following proposal presentation week. I wanted to order a handful of solenoids to test before deciding which ones we will use to depress the piano keys. Additionally, I ordered a handful of MOSFETs designed to handle high voltage and current, which is perfect for our project since solenoids require a high voltage and draw about one ampere of current. Getting my hands on those parts as soon as possible was my top priority, and since they arrived within the week, we were able to test the solenoids with a power source, our Raspberry Pi, and the MOSFETs I ordered. All the parts worked exactly as planned, and now that we have tested the different solenoids, we have decided to go with the Adafruit 25 N solenoids.

Image of our single MOSFET and solenoid circuit:

As of this week I am on track according to the first iteration of our Gantt chart. I had hoped to get some parts ordered and tested by the end of the week, and that is exactly what I accomplished. Furthermore, I have helped a little with our design presentation and will be the audience for Rahul tomorrow when he practices for the presentation.

Next week, I hope to order the rest of our solenoids and MOSFETs so I can begin building what will hopefully be the final circuitry for our accompanyBot. Additionally, I would like to help Nora develop the Python code that will turn the MOSFETs on and cause the solenoids to move according to the XML file produced by the optical music parser. Lastly, if I have time, I would like to think about the final structure that will hold the circuitry and microcontroller, and possibly sketch a rough mockup of it.

Finally, I primarily used skills developed in 18-220 throughout the past week, specifically skills relating to principles 5 and 6 enumerated in our team report. 18-220 helped develop the analytical and teamwork skills that I utilized when working on the project and putting together a basic circuit to power the solenoids.

Rahul’s Status Report for 2/11

I did further research into alternative OMR technology earlier in the week, as I was having trouble building a custom version of Audiveris on my Mac. Since the default output is an MXL file and not XML, I wanted to edit the source code to build it to my needs. I figured out all the necessary modifications; however, I ran into a dependency issue. Audiveris requires an older version of an OCR (optical character recognition) library called Tesseract. Since my Mac has an Apple M2 chip, it was practically impossible to get hold of the older version of this software for the M2. This was as far as I could get:

I was able to download the relevant JAR files and made sure to specify the classpath to link them, but it seems that I also need dynamic-link files, which will be impossible to get.

This led me to look into an alternative Python-based OMR solution, Oemer, which essentially runs a pretrained ML model on a PDF of music. The simplicity of usage was great; however, runs take a few minutes to complete, and upon converting the XML back to PDF form I was very dissatisfied with its accuracy on the Charlie Brown example from the Team Status Report (roughly 50%).

Last week, I mentioned that Audiveris was able to run fairly well on Windows, though it was outputting MXL files, which was annoying as they read like compressed binaries. I eventually discovered that these MXL files are just zipped XML files, and unzipping a few kB per page would hardly be expensive for meeting the parsing-time requirements that we set.
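Since an MXL file is just a ZIP archive containing MusicXML, the unzip step is a few lines of stdlib Python. A minimal round-trip sketch (the member name `score.xml` and the XML content are made up for illustration; a real Audiveris MXL also carries a `META-INF/container.xml` entry naming its main score file):

```python
import io
import zipfile

xml_body = b'<?xml version="1.0"?><score-partwise/>'

# Build a fake .mxl in memory, standing in for Audiveris output.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("score.xml", xml_body)

# The "unzip" step: read the XML back out of the archive.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = zf.namelist()
    recovered = zf.read("score.xml")

print(names, recovered == xml_body)
```

On disk this is just `zipfile.ZipFile("out.mxl").extractall(...)`, which for a few kB per page is effectively free against our parsing-time budget.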

Eventually, I will write a shell script to run the OMR, callable by the GUI application from our proposal. The only thing to keep in mind is that it will have to use Windows commands (yikes). This is a sample of what the commands would look like.

Running the OMR:

This is able to generate the MXL in the default output directory (though there is another parameter that can be used to specify the directory). It also produces a log of what was executed:

If you check the timestamps of the log, you will see this took roughly 11 seconds to parse the single page, which is very reasonable and should not be too cumbersome for our end user.

Previously I was solely running the OMR from the Audiveris GUI, which, though pretty, would not be ideal for our pipeline app.

Audiveris GUI build:

Next week I will integrate the file generation and unzipping into a preliminary version of the script mentioned earlier. I also hope to test the OMR on more music scores to come up with a numeric metric for comparison against our goals. My current progress is good and on schedule.

Aden’s Status Report For 2/11

This week, I spent a significant amount of my time creating, editing, and preparing our proposal presentation. I presented on Monday and look forward to reviewing the feedback I will receive in the next few days. I felt like I did a decent job, but I know that with time and experience, the nerves will go away, and I will be able to present much more fluidly. My group and I also appreciated the questions from other groups and have taken them into consideration.

Aside from preparing for the presentation, I have also been looking into possible solenoids that I can use for the actuator part of our accompanyBot. We have selected two 25 N solenoids to test and one 5 N solenoid. After looking through previous projects with similar ambitions to ours, we have determined that getting solenoids with a higher force output, like 25 N, would likely be better when pressing down the piano key. Unfortunately, they are more expensive, so we would also like to order a 5 N solenoid to verify that they do not work as effectively before buying all 25 N solenoids.

Furthermore, I have done some preliminary research into how we will manage the power required for our system and what else will likely be necessary for the circuitry component of the project.

To conclude, I am currently on track with the Gantt chart presented in our project proposal. I hope to be more productive in the coming weeks as I aim to talk with our TA and order some parts at the start of next week. Hopefully, our parts ship fast, and I can begin testing and determining which solenoid we should use; however, if that is not the case, I plan on helping Nora develop and implement a plan for the microcontroller component of the project as we wait for parts.

Team Status Report for 2/11

This week, we presented our project proposal to faculty and fellow teams. We received some insightful questions that we should keep in mind for our design. For instance, one question asked how we would interpret key words in music such as “rubato” that are not associated with specific tempo values. 

Our project includes safety, cultural, and economic considerations. With regard to safety, our project will likely deal with a large amount of power due to the high current required by the solenoids; limiting this power is an important consideration for our safety as well as the user’s. Culturally, we recognize that our sheet music parsing is centered on Western styles of music, which limits the styles we can play. Economically, our project aims to lower the cost associated with piano accompaniment: since hiring a professional piano accompanist is expensive, we can provide a more affordable alternative.

We have also begun the process of looking ahead for parts that will be needed. Since we have uncertainties about power consumption, we will be ordering and testing different solenoids for our key press mechanism. The biggest risk associated with our project so far comes with the issue of powering our worst-case number of solenoids at once. To mitigate this risk, we have considered reducing the max number of keys pressed to three at once. This would reduce the current draw to below the max current of our existing power supply while still allowing us to have a polyphonic system that can cover a three-note chord.

Our identified OMR solution is looking pretty good so far. We have decided to go with Audiveris, and here is a sample of the transcription.

Original Music Pdf file:

Corresponding Transcribed XML:

Nora’s Status Report for 2/11

At the start of this past week I helped polish the Proposal Presentation slides in preparation for the week’s presentations. In particular, I created the Physical Implementation Model mockup as well as the block diagram. I also helped Aden prepare for the oral presentation.

Since my main responsibility area for the project is handling the microcontroller component, I have been brainstorming ways to organize the information from the XML so that the microcontroller can translate each individual note into scheduled signals sent to the transistors that switch our solenoids on or off. Currently, I am envisioning a dictionary-like structure whose keys are certain “time” points (where the length of a time unit is determined by the tempo from the parser) and whose values are lists of tasks to execute at the corresponding time (i.e., turn a specific solenoid on or off). When the microcontroller is playing a song, it will increment a counter for the time and retrieve the corresponding tasks at that time.
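The dictionary-like structure described above can be sketched in a few lines. The solenoid ids and the song data are hypothetical, and the "on"/"off" tasks would map to GPIO writes on the real system; this only shows the table layout and the counter-driven playback loop:

```python
def build_task_table(notes):
    """notes: (start_beat, duration_beats, solenoid_id) tuples.
    Returns {time_unit: [("on"|"off", solenoid_id), ...]}, i.e. the
    tasks to execute when the playback counter reaches that time."""
    table = {}
    for start, dur, sol in notes:
        table.setdefault(start, []).append(("on", sol))
        table.setdefault(start + dur, []).append(("off", sol))
    return table

# Hypothetical song: solenoid 3 held for beats 0-1, solenoid 5 on beat 1.
song = [(0, 2, 3), (1, 1, 5)]
table = build_task_table(song)

# Playback: increment a time counter and run that slot's tasks.
for t in range(4):
    for action, sol in table.get(t, []):
        print(t, action, sol)  # real version: drive the GPIO pin
```

One nice property of keying by time unit is that silent stretches cost nothing: the loop only does work at time units that actually have tasks.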

Since the XML file contains a lot of extraneous text, I will be working on providing Rahul with specifications for what should be kept in the file that is sent to the microcontroller. At a minimum, I will need the tempo value (or text description) of the piece and a sequence of notes with their pitches and durations.

We are planning on using a 30 V, 5 A DC power supply that I have on hand to power our solenoids. Based on some preliminary power calculations using data from last semester’s Talking Piano group (since we don’t yet have access to the physical components to test), we determined a maximum power of (12 V)(1.5 A)(5 solenoids) = 90 W for five solenoids turned on at one time. If the solenoids we end up buying draw comparable current, we may have to reduce the number of active solenoids to three at a time so that we stay below the 5 A limit of our power supply. To meet the 60 W average power requirement, I looked into using a PWM servo driver for switching the transistors so that we can lower the average power. However, since PWM inputs for solenoids are typically used for position control, we would have to make sure a key can still be pressed down fully.

I requested a Raspberry Pi 4 from the 18-500 parts inventory, so I am hoping to set it up and familiarize myself with the environment in the coming week.

While I have not written any code yet, I am ready to start implementing the data structures and initial algorithm for the scheduler. Thus I believe I am on schedule and should have a first-pass implementation completed by the next status report.

Rahul’s Status Report for 2/5

This week, I worked with Nora and Aden to finalize proposal slides. In particular, I contributed mostly to the technical challenges, UI application mockup, and software solution approach slides. I constructed my portion of the Gantt chart schedule and worked with Nora to settle dependencies in our timelines between production of XML data and processing of this data. I also wrangled with the initial setup for our website.

I have done some digging for good OMR tools and have played around with the following. The first project I configured on my Mac, but certain dependencies were unavailable. Another project was research-backed with a pretrained ML model; I was able to build and run it, but I found its accuracy to be around 70-80%, which is a little low for our project standards. Additionally, it was only capable of producing a monotone interpretation of a music score. A third project I looked at was Audiveris. After installing the relevant JDK toolkits, I was able to get it working on Windows. I think this may be the way to go: it seems to have a robust note-tracking algorithm, but it is only capable of converting scores to MXL format. Hence, I will need to do further research on how to convert this MXL format into XML or an equivalent.

I believe I am on schedule at the moment. I will have to drop the small possibility of manually generating and training an OMR machine learning model; just from the work and documentation I have read, such a task is a capstone project in itself. Ideally within a week’s time, I will have determined the best OMR solution for our project and have it up and running. To reiterate from the proposal, this solution will parse the notes into an XML/JSON-style structure with >95% accuracy.