Lizzy’s Status 2/16
This week, our team started truly building our project. I created the GitHub repository where most of the code will live, so we can share code back and forth. I also read a few more papers on typical OMR cases and the common algorithms for solving the most pressing problems.

From there, my main task for the week was determining the best software, modules, and language to use for the OMR side of our project. I am most comfortable with Python and it is often my language of choice, so since I will be building the majority of this part of the project, I started with options in that language. Because we are doing heavy image manipulation and visual pattern recognition, I knew we would need either Pillow (the Python Imaging Library fork) or OpenCV. Pillow is a little more barebones, and while it is easier to install and use, it may not be able to handle some of the more complex parts of the project. OpenCV has a steeper learning curve, but it is well documented, commonly used in applications similar to ours, and much better at the more complicated computations. OpenCV with Python also requires NumPy, so I installed that as well. Once I started playing with these modules, it was clear that OpenCV would do the job we need.

From there, I started working on the first steps of OMR. First, I worked on binarizing the image using an absolute threshold. Because our spec deals with ideal scans, we are assuming little to no noise, so I am taking a fairly high threshold: every pixel below it is turned to white, and the rest go to black. Next, I was able to find the horizontal staff lines in an image by using horizontal projection, that is, counting the black pixels in each row of the image. I find the row with the maximum number of black pixels and take 80% of that count as the threshold for deciding whether a row is part of a staff line.
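A minimal sketch of the binarization and horizontal-projection steps, using NumPy only, might look like the following. It is an illustration rather than the project's actual code: the function names and the threshold value of 200 are hypothetical, and it sidesteps the black/white convention by working with a boolean mask in which True marks a pixel treated as ink (assumed here to be the darker pixels).

```python
import numpy as np


def binarize(gray, threshold=200):
    """Return a boolean mask that is True where a pixel counts as ink.

    Assumption: ink is darker than paper, so any value below the
    (hypothetical) threshold is treated as ink.
    """
    return gray < threshold


def find_staff_rows(ink, fraction=0.8):
    """Return indices of rows whose ink count reaches `fraction` of the fullest row."""
    row_counts = ink.sum(axis=1)  # horizontal projection: ink pixels per row
    if row_counts.max() == 0:     # blank page: no staff lines to report
        return np.array([], dtype=int)
    cutoff = fraction * row_counts.max()
    return np.flatnonzero(row_counts >= cutoff)


# Tiny synthetic "scan": white page, three full-width dark rows, one short stem.
page = np.full((12, 20), 255, dtype=np.uint8)
page[3, :] = 0
page[4, :] = 0
page[8, :] = 0
page[5:8, 10] = 0

staff_rows = find_staff_rows(binarize(page))
```

On this toy page the full-width rows are selected and the rows that contain only the stem are not, because their ink counts fall far below 80% of the maximum.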
Furthermore, any consecutive rows that qualify as staff lines are grouped together and treated as one staff line. From here I was also able to remove the staff lines by checking whether the pixels directly above or below a staff line are black: if so, the pixel is left black; otherwise it is turned white. I also whited out any pixels significantly above the first staff line, which removes extraneous title information. See the images below.
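The grouping and removal steps described above could be sketched roughly as follows. Again this is an assumption-laden illustration, not the project's code: `group_rows` and `remove_staff_lines` are hypothetical names, the image is a boolean NumPy mask in which True marks ink, and "above or below" is taken to mean the single row immediately adjacent to the grouped line.

```python
import numpy as np


def group_rows(rows):
    """Merge consecutive row indices into inclusive (top, bottom) pairs."""
    groups = []
    for r in rows:
        r = int(r)
        if groups and r == groups[-1][1] + 1:
            groups[-1] = (groups[-1][0], r)  # extend the current staff line
        else:
            groups.append((r, r))            # start a new staff line
    return groups


def remove_staff_lines(ink, groups):
    """Erase staff-line pixels unless ink sits directly above or below the line."""
    out = ink.copy()
    height, width = ink.shape
    empty = np.zeros(width, dtype=bool)
    for top, bottom in groups:
        above = ink[top - 1] if top > 0 else empty
        below = ink[bottom + 1] if bottom + 1 < height else empty
        keep = above | below                 # columns where a symbol crosses the line
        out[top:bottom + 1, ~keep] = False   # clear every other column of the line
    return out


# Same toy layout as a mask: a two-row staff line, a one-row staff line, one stem.
ink = np.zeros((12, 20), dtype=bool)
ink[3, :] = True
ink[4, :] = True
ink[8, :] = True
ink[5:8, 10] = True

groups = group_rows([3, 4, 8])
cleaned = remove_staff_lines(ink, groups)
```

After removal, only the column where the stem touches the lines keeps its ink, so the stem survives as one unbroken vertical stroke.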
So far my progress is on schedule. Everything I hoped to accomplish this week was completed, and I am on my way to developing a sufficient OMR. Since I am not behind, I don’t need to take any actions to catch up. For the next week, I hope to find the connected components in the image and to create a data structure that can hold each note’s information, such as its position, where the note head is, and where the bounding box is. This will help later when determining pitch and intensity, without needing to do all the work while parsing through the image itself.
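One possible shape for that data structure, offered only as a sketch since the design has not been settled, is a pair of small dataclasses. The class and field names here are hypothetical, and coordinates are assumed to be pixel indices with inclusive bounds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BoundingBox:
    """Inclusive pixel bounds of one connected component."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left + 1

    @property
    def height(self) -> int:
        return self.bottom - self.top + 1


@dataclass
class NoteSymbol:
    """Information recorded once per note, so later stages need not rescan the image."""
    box: BoundingBox
    # (row, column) of the note head; None until the head has been located.
    head_center: Optional[Tuple[int, int]] = None


note = NoteSymbol(box=BoundingBox(left=10, top=4, right=15, bottom=20))
```

Keeping the bounding box separate from the note record would let the same box type be reused for non-note symbols found by the connected-component pass.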