
Activity 9: Applications of Morphological Operations 2 of 3: Playing Musical Notes through Image Processing

For this activity, we will try to read sheet music via image processing and play the music using Scilab. Sheet music is essentially black and white, so it is easy to separate the notes from the background by thresholding and applying morphological operations.

The first task is to find a piece of sheet music. I used the song Clementine; its sheet music is shown in Figure 1.
Figure 1. Sheet music for Clementine

To make processing easier, we divide the sheet into separate lines. In this case we have four lines, as shown in Figure 2.
Figure 2. The four lines in the sheet music of Clementine, separated (lines 1 to 4 from top to bottom)

We then binarize the images with a threshold of 0.95 and invert them so that the background is black and the foreground is white. We disregard the area of the image where the clef is located. Using morphological operations, specifically CloseImage(), OpenImage() and ErodeImage(), we reduce the notes to small blobs. We then take the centroid of each blob to reduce the notes further to single points. Figure 3 shows the centroids of the note heads, each dilated with a circle of radius 2 for visibility.
  
    
  
Figure 3. Blobs representing the positions of the centroids of the notes' heads
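
As a rough illustration of the thresholding and centroid steps, here is a minimal Python sketch (not the Scilab code used in this activity). It assumes the image is a list of rows of grayscale values between 0 and 1, and it skips the morphological cleaning. The function names binarize_inverted and blob_centroids are made up for this example.

```python
from collections import deque


def binarize_inverted(gray, threshold=0.95):
    """Return 1 for ink (pixel not brighter than threshold), 0 for background."""
    return [[1 if px <= threshold else 0 for px in row] for row in gray]


def blob_centroids(binary):
    """Find 4-connected blobs by flood fill and return their (row, col) centroids."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Collect every pixel connected to this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The centroid is the mean row and mean column of the blob's pixels.
            mean_row = sum(p[0] for p in pixels) / len(pixels)
            mean_col = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((mean_row, mean_col))
    return centroids
```

Blobs are returned in the order their first pixel is met in a top-to-bottom, left-to-right scan, so for a real staff line the centroids would still need to be sorted by column to recover the order in which the notes are played.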

The lines of the staff are separated by 10 pixels, so successive pitches (alternating between a line and a space) are 5 pixels apart. We assign the pitch of each note by checking its row coordinate and determining the 5-pixel interval in which it falls. The code used to do this is shown in Figure 4; the intervals are listed in order of decreasing pitch, from G2 down to C1 (middle C). After doing this, we have the list of pitches for each note in each line.
Figure 4. Code used to determine the pitch of each note; y is the list of coordinates of each note along the y axis (row) and notes is a copy of y whose values are replaced by the corresponding pitch of each note
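
The interval check can be sketched in Python as follows. This is only an illustration, not the code in Figure 4. It assumes each octave label runs from C to B, so that there are twelve positions from G2 down to C1. The parameter y_top, the row of the G2 position, is a hypothetical value that would have to be read off the actual image.

```python
import math

# Pitch labels in order of decreasing pitch, following the post's naming
# (C1 is middle C). Assumption: each octave label runs from C to B.
PITCHES = ["G2", "F2", "E2", "D2", "C2", "B1", "A1", "G1", "F1", "E1", "D1", "C1"]


def pitch_from_row(y, y_top, step=5):
    """Map a centroid row y to a pitch label.

    y_top is the row where a G2 note head would be centered (hypothetical,
    measured per image); step is the spacing between adjacent pitches in pixels.
    """
    # Round to the nearest 5-pixel position; rows increase downward,
    # so a larger row means a lower pitch.
    index = int(math.floor((y - y_top) / step + 0.5))
    if not 0 <= index < len(PITCHES):
        raise ValueError("row %r is outside the G2 to C1 range" % (y,))
    return PITCHES[index]


def pitches_from_rows(ys, y_top, step=5):
    """Apply pitch_from_row to every centroid row in a staff line."""
    return [pitch_from_row(y, y_top, step) for y in ys]
```

Rounding to the nearest position gives each pitch a tolerance of about 2.5 pixels on either side, which absorbs small shifts in the centroids.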

The next consideration is the duration of each note. There are four different note types in this piece (half note, quarter note, eighth note, and sixteenth note) and one special character (the dot). These symbols have different areas, so to distinguish them we measured the areas of some sample notes and used them as thresholds. Of course, we cannot use the blob image previously used to determine pitch, since we need the notes to be whole. Applying only the OpenImage() and CloseImage() operations, we get the binarized images in Figure 5.
  
  
  
Figure 5. Binarized image of the notes

The presence of the barlines is not a concern since they can be filtered out when categorizing the notes using their areas. The code used to do this is shown in Figure 6.

Figure 6. Code used to distinguish the notes using their areas. notes is a list where the corresponding durations are saved (half note = 2, quarter note = 1, eighth note = 0.5, sixteenth note = 0.25 and dot = 1.5*preceding note's duration)
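
The area-based classification can be sketched in Python along these lines. This is an illustration only, not the code in Figure 6. The area bands in EXAMPLE_BANDS are made-up numbers; the real values have to be measured from sample notes in the binarized image. The sketch also assumes a dot lengthens the note before it by a factor of 1.5.

```python
# Hypothetical area bands in pixels: (inclusive lower, exclusive upper).
# These values are invented for illustration only.
EXAMPLE_BANDS = {
    "dot": (5, 30),
    "half": (100, 150),
    "quarter": (150, 200),
    "eighth": (200, 260),
    "sixteenth": (260, 320),
}

# Durations in beats, as listed in the caption of Figure 6.
BASE_DURATION = {"half": 2.0, "quarter": 1.0, "eighth": 0.5, "sixteenth": 0.25}


def classify(area, bands):
    """Return the symbol whose band contains this area, or None if no band matches."""
    for kind, (low, high) in bands.items():
        if low <= area < high:
            return kind
    return None


def durations_from_areas(areas, bands):
    """Convert blob areas (in left-to-right order) to a list of note durations."""
    durations = []
    for area in areas:
        kind = classify(area, bands)
        if kind is None:
            continue  # blob matches no band, e.g. a barline: filter it out
        if kind == "dot":
            if durations:
                durations[-1] *= 1.5  # a dot lengthens the preceding note
            continue
        durations.append(BASE_DURATION[kind])
    return durations
```

For example, with these invented bands the areas [170, 120, 10, 60, 230] would be read as a quarter note, a dotted half note, an ignored blob, and an eighth note.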

Finally, we have the notes for each line and their corresponding pitch and duration. These were saved in text files, then loaded in Scilab and passed to the function note to produce our sound file. :D
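
For readers without Scilab, the synthesis step can be approximated in Python as below. This is a sketch under assumptions, not the note function used here. It generates a plain sine wave per note and writes a 16-bit mono WAV file. The sampling rate, the tempo (seconds_per_beat) and the helper names frequency and write_wav are choices made for this example. The frequencies are the standard equal-tempered values with A above middle C at 440 Hz.

```python
import math
import struct
import wave

# Equal-tempered frequencies in Hz for the octave starting at middle C.
BASE_FREQ = {"C": 261.63, "D": 293.66, "E": 329.63, "F": 349.23,
             "G": 392.00, "A": 440.00, "B": 493.88}


def frequency(label):
    """Frequency for a label such as 'C1' (middle C) or 'G2' (one octave up).

    Assumption: octave 1 runs from middle C to B, and octave 2 is one octave higher.
    """
    octave = int(label[1])
    return BASE_FREQ[label[0]] * 2 ** (octave - 1)


def note(freq, beats, seconds_per_beat=0.5, rate=8192):
    """Return sine-wave samples in [-1, 1] for one note."""
    n = int(beats * seconds_per_beat * rate)
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


def write_wav(path, samples, rate=8192):
    """Write samples in [-1, 1] as a 16-bit mono WAV file."""
    frames = b"".join(struct.pack("<h", int(32767 * s)) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)
```

A melody is then just the concatenation of the sample lists, for example note(frequency("C1"), 1) + note(frequency("D1"), 0.5), passed to write_wav. Plain sine notes joined this way can click at the boundaries, so an amplitude envelope would be a natural refinement.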

Listen here:


I give myself a grade of 10/10 for successfully producing the required output. 

I would like to thank Ms. Maria Eloisa Ventura for helpful discussions.

References:
[1] M. Soriano, Activity 9 Applications Morphological Operations 2 of 3 : Playing Musical Notes though Image Processing, AP186 Manual, 2012
[2] www.phy.mtu.edu/~suits/notefreqs.html
