Accomplishments this week

  • Finished the basic setup of the image-processing code
    • Wrote code to fetch the scores used for matching from the database; for now the files are stored locally in a folder that mimics the database
      • Used os.walk to traverse the static folder and collect the PDF files, ignoring every file whose name does not end in pdf
    • Converted the PDF scores to PNG to prepare them for matching against input scores
      • Tried the pdf2image library first; however, the PDF files downloaded from MuseScore do not have high enough resolution and seem to be corrupted, because they lack the xref table that the library requires. The library functions treat the files as malformed, so the library cannot be used.
      • Decided in the end to use the wand library to convert the PDF files to PNG, storing all the PNG files in a folder named after the piece of music. The disadvantage of this library is that the output files have lower resolution than the originals, which could make image matching harder
    • Read the input image from a specified input path; currently only one image in the input folder is supported, and a queue for all input images will be added later
      • Same process as retrieving files from the database; later on, only the input path in the configuration variables needs to change
  • Tried several image-matching algorithms
    • Structural Similarity Index (SSIM): computed between two images, and the difference between them is returned
      • Does not work well because the photo I took has a different shape from the images stored in the database; after reshaping the input file the music notes do not line up exactly, and the difference can reach 0.5 (the maximum being 1) when comparing the whole page
    • Mean Squared Error (MSE): computed between two images, and the difference is returned; its range is not limited to 0 to 1. A value of 0 means the images are identical, and larger values mean more difference between the two images
      • Does not work well for a reason similar to SSIM: matching scores depends on patterns more than on pixel intensities, and to this algorithm all music notes look alike, so it is difficult to determine which scores match. Because it is applied to the image globally, it is even less accurate than SSIM
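
To make the MSE problem described above concrete, here is a minimal sketch. It uses only NumPy rather than the cv2 routines from the project, and the synthetic "staff" images, the helper `make_staff`, and all sizes are assumptions made up for illustration, not real score data.

```python
import numpy as np


def mse(image_a, image_b):
    """Mean squared error between two equally sized grayscale images."""
    a = np.asarray(image_a, dtype=np.float64)
    b = np.asarray(image_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))


def make_staff(height=60, width=100, first_line=10, spacing=8):
    """Hypothetical stand-in for a score: white page with five black staff lines."""
    page = np.full((height, width), 255, dtype=np.uint8)
    for i in range(5):
        page[first_line + i * spacing, :] = 0
    return page


reference = make_staff()

# The same staff moved down by two rows, standing in for a misaligned photo.
shifted = np.roll(reference, 2, axis=0)

# A different "piece": perfectly aligned staff plus a small note-like blob.
different = reference.copy()
different[20:26, 40:46] = 0

print("identical:", mse(reference, reference))
print("same staff, shifted:", mse(reference, shifted))
print("different content, aligned:", mse(reference, different))
```

In this synthetic case the shifted copy of the same staff gets a larger error than the aligned page with different content, which is the opposite of what a matcher needs. It is only a toy illustration of why a global pixel metric is sensitive to alignment, not a reproduction of the actual experiment.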
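
The file-collection step described above (walking the static folder with os.walk and keeping only PDF files) can be sketched roughly as follows. The function name `collect_pdf_paths`, the folder name in the usage note, and the case-insensitive extension check are assumptions for illustration; the project's actual code may differ.

```python
import os


def collect_pdf_paths(root_dir):
    """Return the sorted paths of all files under root_dir whose names end in .pdf."""
    pdf_paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # Ignore everything that is not a PDF; matching case-insensitively
            # is an assumption, not something the original code is known to do.
            if name.lower().endswith(".pdf"):
                pdf_paths.append(os.path.join(dirpath, name))
    return sorted(pdf_paths)
```

Calling `collect_pdf_paths("static")` would then return every PDF below a hypothetical `static` folder. Since the input image is retrieved by the same process, the same helper could be pointed at the input path with a different extension check.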

Progress for schedule

  • Slightly behind schedule
    • Reason: I expected image matching to be easier, but cv2-based methods such as image differencing and MSE don’t work well on inputs like music scores, which contain many repeated notes
    • Ways to catch up: I have decided to try image processing on the title of the music score (which most scores have) and, if that fails, to turn to natural language processing, which I have done before and am confident will succeed

Deliverables I hope to accomplish next week

  • Finish matching an input music staff against the music staves in the database
