
Activity 8 Applications of Morphological Operations 1 of 3: Pre-processing Text

In this activity, we aim to extract text (handwritten or typed) from an image using the image processing techniques we have learned. The image in Figure 1 is the source of the text we need to extract.
Figure 1. Image of a document from which text will be extracted

The image was tilted, so I rotated it using Gimp 2.8. Using the same software, I selected a portion of the image and cropped it (Figure 2a). The cropped image was then loaded in Scilab 4.1.2 and converted to grayscale for image processing. The grayscale image is shown in Figure 2b.


Figure 2. (a) Cropped portion from the rotated image of the document, (b) grayscale version of the patch in (a)

The first task was to remove the lines. To do this, I took the fftshifted FT of the grayscale image and multiplied it by a mask to filter out the higher-order frequency components that contribute to the lines. I then took the inverse FT to get the image with the lines removed. Finally, I binarized the image and inverted it so that I could clean it using morphological operations. Figure 3 shows (a) the FT of the grayscale image (with the center masked so that the other frequencies are visible), (b) the mask used to remove the lines, and (c) the binarized and inverted version of the filtered image.

Figure 3. (a) FT of the grayscale version of the selected patch (zero order masked for visibility of the other frequencies), (b) mask used to remove the lines, (c) binarized and inverted version of the image that results from applying the mask in (b)
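The line-removal and binarization steps can be sketched in Python with NumPy. This is not the Scilab code used for the figures, only a minimal illustration of the same idea. The synthetic test image, the mask shape, the parameters `keep_radius` and `half_width`, and the threshold of 0.5 are all assumptions made for the example, not values from this activity.

```python
import numpy as np

def remove_horizontal_lines(gray, keep_radius=3, half_width=1):
    """Suppress horizontal lines by masking the vertical frequency axis.

    keep_radius and half_width are hypothetical parameters chosen for this demo.
    """
    rows, cols = gray.shape
    ft = np.fft.fftshift(np.fft.fft2(gray))  # zero frequency moved to the center
    cy, cx = rows // 2, cols // 2

    mask = np.ones((rows, cols))
    # Horizontal lines vary only along y, so their energy sits on the central column.
    mask[:, cx - half_width:cx + half_width + 1] = 0
    # Keep the neighborhood of the zero order so the mean brightness survives.
    mask[cy - keep_radius:cy + keep_radius + 1,
         cx - half_width:cx + half_width + 1] = 1

    filtered = np.fft.ifft2(np.fft.ifftshift(ft * mask))
    return np.real(filtered), mask

def binarize_and_invert(gray, threshold):
    """Dark pixels (ink) become 1, light pixels (paper) become 0."""
    return (gray < threshold).astype(np.uint8)

# Synthetic stand-in for the document patch: light paper, dark lines, darker "text".
gray = np.ones((64, 64))
gray[4::8, :] = 0.3          # one horizontal line every 8 rows
gray[21:28, 20:40] = 0.0     # a block of "text" between two lines

filtered, mask = remove_horizontal_lines(gray)
raw_binary = binarize_and_invert(gray, threshold=0.5)
clean_binary = binarize_and_invert(filtered, threshold=0.5)

print("line pixels in column 55 before filtering:", int(raw_binary[:, 55].sum()))
print("line pixels in column 55 after filtering: ", int(clean_binary[:, 55].sum()))
```

The mask also removes part of the text's own spectrum, so the filtered text comes out slightly distorted near its edges. That side effect is one reason the binarized result needs morphological cleaning afterwards.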

Morphological operations were applied to the binarized and inverted image after line removal to clean it and to reconnect the broken strokes of the text. The images in Figure 4 are the best results I have obtained so far.

Figure 4. Images cleaned using a series of morphological operations: (a) close operation with a rectangular structuring element, (b) dilation of (a) with a diagonal, (c) close operation applied on (b) with a diagonal
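The sequence in Figure 4 (close, dilate, close) can be illustrated with a small NumPy sketch. Again, this is not the Scilab code behind the figures. The structuring element sizes (a 1 x 3 rectangle and a 3 x 3 diagonal) and the test stroke are assumptions for the example, and the helper names `_shifted`, `dilate`, `erode` and `close` are hypothetical.

```python
import numpy as np

def _shifted(img, dy, dx):
    """Translate a 2D array by (dy, dx), filling the exposed border with background."""
    rows, cols = img.shape
    out = np.zeros_like(img)
    src = img[max(0, -dy):rows - max(0, dy), max(0, -dx):cols - max(0, dx)]
    out[max(0, dy):rows - max(0, -dy), max(0, dx):cols - max(0, -dx)] = src
    return out

def dilate(img, se):
    """Binary dilation: union of the image translated by every offset in se."""
    img = np.asarray(img, dtype=bool)
    cy, cx = se.shape[0] // 2, se.shape[1] // 2
    out = np.zeros_like(img)
    for i, j in zip(*np.nonzero(se)):
        out |= _shifted(img, i - cy, j - cx)
    return out

def erode(img, se):
    """Binary erosion: a pixel survives only if se, centered there, fits in the foreground."""
    img = np.asarray(img, dtype=bool)
    cy, cx = se.shape[0] // 2, se.shape[1] // 2
    out = np.ones_like(img)
    for i, j in zip(*np.nonzero(se)):
        out &= _shifted(img, cy - i, cx - j)
    return out

def close(img, se):
    """Closing = dilation followed by erosion; it fills gaps smaller than se."""
    return erode(dilate(img, se), se)

# Assumed structuring elements (odd-sized so the center is well defined).
rect = np.ones((1, 3), dtype=np.uint8)
diag = np.eye(3, dtype=np.uint8)

# A horizontal stroke with a one-pixel break, like a letter cut by a removed line.
stroke = np.zeros((11, 11), dtype=bool)
stroke[5, 2:9] = True
stroke[5, 5] = False

closed = close(stroke, rect)      # (a) reconnects the break
thickened = dilate(closed, diag)  # (b) thickens along the diagonal
final = close(thickened, diag)    # (c) closes remaining diagonal gaps

print("break filled:", bool(closed[5, 5]))
print("foreground pixels after each step:",
      int(closed.sum()), int(thickened.sum()), int(final.sum()))
```

Because `_shifted` pads with background, strokes touching the image border are eroded there. Padding the image before closing avoids that edge effect.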

I give myself a grade of 8 for this activity because I am not satisfied with what I have done. I wasn't able to reduce the thickness of the text to 1 pixel and separate each of the letters.

I would like to thank Ms. Eloisa Ventura for helpful discussions and Dr. Maricor Soriano for the hints given during class.
