Tuesday, July 22, 2008
Activity 10: Preprocessing Handwritten Text
Continuing the challenge of real-life image processing problems, we were tasked to label handwritten text. We were given two scanned documents:



Interpreting the instruction "Crop a portion of the image... with text within the lines," I "selectively selected" the numbers written in the document on the right, since these do not overlap with the lines (making my work easier, hehehe). The results are shown below:



As can be seen, applying the right filter in the Fourier domain reduces the hard horizontal lines to faint smudges. The next step is to binarize the image and label the results. But as we will notice below, binarizing yields black handwritten text (and black is the background for labeling!). The image was therefore inverted, after which the morphological operations of opening and closing were applied. Having completed these procedures, we can now label safely:



As can be seen from the results, the numbers 7, 6, and 8 are recognizable not only to us but also to the computer (in the sense that each was labeled as a single character, hehehe). The other two numbers were butchered in the process, however... it is right that each of the two should be recognized as two characters, it's just that the 10 and the 12 both look like mangled 11s.

The code I implemented was my own:

clear all;
chdir('G:\poy\poy backup\physics\186\paper 10');

im = gray_imread('4piece.jpg');
scf(0);
subplot(1,3,1)
imshow(im);
fim = fftshift(fft2(im));
fim(1:23,55:57) =0;
fim(29:51,55:57) =0; //filter shape
F = real((fim).*conj(fim));
fim1 = log(F+1);
subplot(1,3,2)
imshow(fim1,[]);// shows the fourier transform
xset('colormap',jetcolormap(255));
new = abs(fft2(fim));
subplot(1,3,3)
imshow(new,[]);// shows the new image

Im = im2bw(new, 110/255); // binarize at the chosen threshold
scf(1);
subplot(2,2,1)
imshow(Im);
Im = 1*(Im==0); // inverts the values
subplot(2,2,2) // the inverted image goes in the second panel
imshow(Im);

se = ones(2,1);
Im = erode(Im,se);
Im = dilate(Im,se); //open
Im = dilate(Im,se);
Im = erode(Im,se); //close
subplot(2,2,3)
imshow(Im);

[L,n] = bwlabel(Im);
subplot(2,2,4)
imshow(L+1,rand(n+1,3)); //shows the labeling

And since I felt bad after having "selectively selected" what I wanted to process, I performed the same procedure on a "selectively selected" portion of the first image, the one on the left. The results are shown below:



Unlike the first crop I processed, this one had characters coinciding with the hard horizontal lines (albeit only a few), and as before, the lines were removed and reduced to smudges. Performing the next steps to see whether a good labeling can be done, we can observe that from the phrase "Extension cord", using a 2x1 rectangular structuring element, the word "cord" can be labeled and distinguished. I chose the same structuring element since the letter "s" from the first word is still very evident, albeit connected to the mangled "extension".



I enjoyed this activity! I performed everything on my own recalling my experience from previous activities! I give myself 10/10 neutrinos!

posted by poy @ 11:26 PM   1 comments
Thursday, July 17, 2008
Activity 9: Binary Operations
Given the image below, the task is to find the size of the scanned punched paper pieces.



The important concepts needed to accomplish this task are the morphological operations of opening and closing. These operations are built from the basic operations we used earlier, that is, erosion and dilation. A simple definition of opening and closing can be found at this website:

1. http://www.ph.tn.tudelft.nl/Courses/FIP/noframes/fip-Morpholo.html#Heading98



Opening (in my simple terms) removes noise and small unwanted specks of pixels. Closing, on the other hand, as the name implies, closes small gaps so the object is identified as one.
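In Scilab terms (just a minimal sketch, assuming Im is already a binary image and using the same erode and dilate functions as in the code below), opening is an erosion followed by a dilation, and closing is the reverse:

se = ones(3,3);                     // square structuring element
opened = dilate(erode(Im,se), se);  // opening: erosion then dilation, removes small specks
closed = erode(dilate(Im,se), se);  // closing: dilation then erosion, fills small gaps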

I divided the image above into 7 parts of approximately 256x256 pixels each (because it was hard to splice accurately, but they are all at the same magnification, so it's okay, hehehe). And the code I implemented was my very own!

clear all;
chdir('G:\poy\poy backup\physics\186\paper 9');
getf('imhist.sci');


se1=ones(3,3); //square structuring element

im = imread('Circles002p1.JPG');

pref = 'Circles002p';
area=[];
counter=1

for i=1:7 // loop over all 7 segments of the scanned image
im=imread(strcat([pref,string(i),'.JPG']))
im=im2gray(im);
im = im2bw(im,210/255)
er1=erode(im, se1);
op1=dilate(er1,se1); //open
di1=dilate(op1,se1);
cl1=erode(di1,se1); //close
[L,n] = bwlabel(cl1);
reg = (L==255);
for j=1:n
f = find(L==j);
reg_size = size(f,'*');
if reg_size <200 | reg_size > 600
L(f) = 0;
end
end
for k=1:n
area(counter) = length(find(L==k));
counter=counter+1;
end

scf(i);
subplot(2,2,1);
imshow(im);
subplot(2,2,2)
imshow(op1)
subplot(2,2,3);
imshow(cl1);
subplot(2,2,4)
imshow(L+1,rand(n+1,3));


end


scf(8);
histplot(length(area),area);
x=find(area<600 & area>450);
scf(9)
histplot(length(x), area(x));
a=area(x);
a=sum(a)/length(x) //area
y=stdev(area(x)) //error

In the first inner loop of the code, I already attempt to eliminate unwanted data, that is, the blobs in the image that are too large or too small. The results of my implementation can be seen below:



This was done for all 7 segments. As can be seen from the results, the data that my code passed to the histogram had already been cleared of clumped pieces of punched paper, but to be safe I also required that only areas between 450 and 600 pixels are acceptable. Also, using the strcat function of Scilab I was able to automatically step through the 7 segments of the original image and collect data from them. (So do I get a bonus, Ma'am? :D) The histogram plots, from which the mean area and standard deviation were computed, are shown below.



The area I obtained was 519.48 pixels with a standard deviation of 28.127863. Using the 7th segment, which does not contain any large or clumped pieces of paper, to verify these results, the obtained area was 519.4, giving only about 0.015% error!



Acknowledgements

I absolutely enjoyed this activity! Given the good values I obtained and the fun I had I give myself 10/10 neutrinos! Thank you to Dr. Soriano for her patience even when I was badgering her with a LOT of questions and Jeric for providing the histogram and area calculation part of the code.

posted by poy @ 6:27 PM   2 comments
Tuesday, July 15, 2008
Activity 8: Morphological Operations
The morphology of an image refers to its shape or structure. In image processing, morphological operations are treatments done on binary images, particularly images composed of 1's that form a particular shape (0's represent the background), in order to improve the image or to obtain the desired effect on it. All morphological operations affect the shapes of images; in this activity, we expand and shrink them.

We will use the erode and dilate functions of Scilab for this activity, each of which requires a corresponding structuring element (the command won't make sense without one, much like mkfftfilter without a filter), and we will apply them to simple geometric shapes: a circle, a cross, a square, a hollow square, and a triangle.



Each of these shapes will be dilated and eroded with different structuring elements: a 4x4 square; a 4x2 horizontal rectangle; a 2x4 horizontal triangle; and a cross made of 5x1 and 1x5 lines.
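Since I did not post code for this activity, here is a minimal sketch of how one such dilation/erosion pair can be done in Scilab (the file name 'circle.bmp' and the 4x4 element are just example choices):

im = im2bw(im2gray(imread('circle.bmp')), 0.5);  // binary image of the shape
se = ones(4,4);                                  // 4x4 square structuring element
d = dilate(im, se);                              // the shape expands by the structuring element
e = erode(im, se);                               // the shape shrinks by the structuring element
scf(0);
subplot(1,3,1); imshow(im);
subplot(1,3,2); imshow(d);
subplot(1,3,3); imshow(e);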



I compared the Scilab results against my predictions, which were based on Dr. Soriano's lecture and on other internet resources:
1. http://www.ph.tn.tudelft.nl/Courses/FIP/noframes/fip-Morpholo.html
2. http://homepages.inf.ed.ac.uk/rbf/HIPR2/dilate.htm
3. http://homepages.inf.ed.ac.uk/rbf/HIPR2/erode.htm
Although my drawn predictions are "slightly" exaggerated versions of what I want them to show, I also included numerical predictions, and from those it is safe for me to say that I obtained good agreement! Yey!











Acknowledgements

Thank you to Jeric, JC, and Julie for clearing some confusing parts with me. I think I actually understood this activity! I give myself 10/10 neutrinos!

posted by poy @ 9:58 PM   0 comments
Thursday, July 10, 2008
Activity 7: Enhancement in the Frequency Domain
Anamorphic property of the Fourier transform

The Fourier Transform of a signal gives its (spatial) frequency distribution. Basically, any image or signal can be expressed as a superposition of sinusoids. A property of the two-dimensional Fourier Transform is that rotating a sinusoid rotates its transform by the same angle.

In this activity, we were tasked to observe what the Fourier Transform of a sinusoid looks like and what happens when its frequency is changed, when it is rotated, and when it is superimposed with another sinusoid. Using Scilab we can create a sinusoid (similar to Activity 5), take its FT, and expect that rotating the sinusoid results in the rotation of its Fourier Transform.
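Here is a minimal sketch of how such sinusoids can be generated and transformed (the grid size, frequency, and angle are just example values):

nx = 128; ny = 128;
x = linspace(-1,1,nx); y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4;                                            // spatial frequency
z = sin(2*%pi*f*X);                               // plain sinusoid
theta = 30*%pi/180;                               // rotation angle
zr = sin(2*%pi*f*(Y*sin(theta) + X*cos(theta)));  // rotated sinusoid
zc = sin(2*%pi*f*X) + sin(2*%pi*f*Y);             // superposition of two sinusoids
Fz = fftshift(abs(fft2(z)));                      // two peaks (delta functions)
Fzr = fftshift(abs(fft2(zr)));                    // the two peaks rotated by theta
Fzc = fftshift(abs(fft2(zc)));                    // four peaks, one pair per sinusoid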



As expected from previous activities, the Fourier Transform of a sine function is a pair of delta functions. The two dots in the figure above for each frequency (f = 4, 8, 12) are these delta functions. And as we can observe, changing the frequency changes the spacing of the sinusoid: as the frequency increases, the distance between the lines (the peaks of the sine wave) becomes smaller. In the frequency domain (the Fourier transform), this causes the two delta functions to move farther away from each other, as seen above.

For varying rotation angles of the sinusoid we obtain the following results:



The image of two combined sinusoids looks like a checkerboard. Its Fourier Transform shows four points, two pairs of delta functions, presumably one pair from each sinusoid.



Fingerprints: Ridge Enhancement

Starting from my scanned fingerprint, we process the image using Scilab. First, we convert the image to grayscale. After that, we remove the DC term (the zero-frequency component) by subtracting the mean of the image. To enhance the ridges of the fingerprint, we use the mkfftfilter function of Scilab with a high-pass filter, which increases the magnitude of the high-frequency components relative to the low-frequency ones, producing a sharp contrast between the two. The results show this clearly:
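A minimal sketch of the idea, building the high-pass mask by hand instead of through mkfftfilter (the file name and the cutoff radius are placeholders):

im = im2gray(imread('finger.jpg'));
im = im - mean(im);                         // remove the DC term (the image mean)
[nr,nc] = size(im);
[X,Y] = ndgrid(1:nr, 1:nc);
r = sqrt((X - nr/2).^2 + (Y - nc/2).^2);    // distance from the center of the shifted spectrum
mask = 1*(r > 10);                          // high-pass: keep only frequencies beyond a small radius
F = fftshift(fft2(im));                     // centered Fourier transform
ridges = abs(fft2(F.*mask));                // back to space (forward FFT again, as in my other codes)
imshow(ridges, []);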



Lunar landing scanned pictures: Line removal

The task is to reduce the vertical lines in an image, in this case a lunar landscape stitched together from scanned strips. To do this, we first subtract the mean of the image. Then we analyze its Fourier Transform and apply a filter that eliminates the distinct frequency signature of the vertical lines.
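A minimal sketch of the masking step (the file name, the width of the band, and the small window kept around the center are assumptions):

im = im2gray(imread('lunar.jpg'));
im = im - mean(im);                // subtract the mean first
F = fftshift(fft2(im));
[nr,nc] = size(F);
cy = round(nr/2); cx = round(nc/2);
F(cy-1:cy+1, 1:cx-5) = 0;          // zero the horizontal band where the vertical lines live
F(cy-1:cy+1, cx+5:nc) = 0;         // (sparing a small window around the center)
clean = abs(fft2(F));              // the vertical lines are strongly reduced
imshow(clean, []);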



Acknowledgements

Thanks to Jeric for the absolutely brilliant method of removing the vertical lines from the lunar image. And thanks to Dr. Soriano for the codes on the first part of the activity. This had been a long activity, but worth the effort. I give myself a 9/10 neutrinos.

posted by poy @ 4:54 PM   0 comments
Tuesday, July 08, 2008
Activity 6: Fourier Transform of Image Formation
For this activity we were given four parts, each related to the Fourier transform of images.

Familiarization with the discrete FFT

We were tasked to create a 128x128 image of a circle in MS Paint and to perform the Fast Fourier Transform (FFT) on the image. Below are the results I obtained; the coloring comes from the jetcolormap function of Scilab.



The intensity image shows the transform split among the four corners; only when it is shifted do we observe the correct, centered transform of the circle: an Airy disk.

An Airy disk is a bright spot with concentric rings (like halos) surrounding it. Performing the FFT twice on the circle produced the same image. Or did it?

Performing the same procedure, but this time with an image of the letter A, we see that applying the FFT twice returns the original image inverted (rotated by 180 degrees).
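A minimal sketch of this part, assuming the 128x128 circle is saved as 'circle.bmp':

im = im2gray(imread('circle.bmp'));
F = fft2(im);
scf(0);
subplot(1,3,1); imshow(abs(F), []);              // bright quadrants pushed to the corners
subplot(1,3,2); imshow(fftshift(abs(F)), []);    // shifted: the Airy-like pattern, centered
twice = abs(fft2(fft2(im)));                     // FFT applied twice: the image comes back inverted
subplot(1,3,3); imshow(twice, []);               // (unnoticeable for the symmetric circle, obvious for the letter A)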



Simulation of an imaging device

With the FFT we can actually simulate a camera: with one image acting as the "aperture," we can see the resulting image of the object the camera is aimed at (the other image, in this case one bearing the acronym VIP).
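A minimal sketch of the simulation (the file names, and the use of a forward FFT for the inverse step as in my other codes, are assumptions):

ap = im2gray(imread('circle.bmp'));     // the circular aperture
obj = im2gray(imread('vip.bmp'));       // the object, the VIP image (same 128x128 size)
Fobj = fft2(obj);                       // Fourier transform of the object
Fap = fftshift(ap);                     // the aperture is already drawn in the Fourier plane; shift it to match
img = abs(fft2(Fobj.*Fap));             // the image formed through the aperture
imshow(img, []);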



As with an actual camera, when the aperture is small (hence few light rays enter to expose the film), the image is blurred. The bigger the aperture opening, the clearer the image and the greater its resemblance to the original.

Template matching using correlation

Template matching is a pattern-recognition technique that looks for a chosen pattern within an image. In this activity, we look for the letter "A" (which we already used earlier) in an image of a sentence. As we can see, there are 5 bright dots indicating where the "A"s are; they appear where the (inverted) words "rain", "spain", "stays", "mainly", and "plain" are supposed to be.
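A minimal sketch of correlation through the FFT (the file names are assumptions, and the template must be the same size as the sentence image):

text = im2gray(imread('sentence.bmp'));  // image of the sentence
temp = im2gray(imread('A.bmp'));         // template: the letter A, in a same-sized image
corr = abs(fftshift(fft2(conj(fft2(temp)).*fft2(text))));  // bright peaks mark where A occurs
imshow(corr, []);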



Edge detection using the convolution integral

Edge detection here simply means finding the edges of a particular orientation. Depending on what kind of pattern the image is convolved with, the result highlights the edges sharing that orientation. For example, when the pattern is horizontal, the horizontal edges shine brightly (similar to what we observed earlier in template matching).
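A minimal sketch of the convolution, done through the FFT for consistency with the parts above (the file name is an assumption; the pattern shown is a horizontal one):

pat = [-1 -1 -1; 2 2 2; -1 -1 -1];          // 3x3 horizontal pattern, entries sum to zero
im = im2gray(imread('vip.bmp'));
[nr,nc] = size(im);
patbig = zeros(nr,nc);
patbig(1:3,1:3) = pat;                      // embed the pattern in an image-sized array
edges = abs(fft2(fft2(im).*fft2(patbig)));  // convolution theorem; horizontal edges light up
imshow(edges, []);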



I give myself 9/10 neutrinos for this activity since honestly I think I need more time to fully grasp these concepts all in one sitting.



Oh, and I want to thank Dr. Soriano for providing the codes. Thanks ma'am!
posted by poy @ 10:40 PM   0 comments
Thursday, July 03, 2008
Activity 5: Physical measurements from Discrete Fourier Transforms
Using the code given by Dr. Soriano on generating a 1D sinusoid and obtaining its Fourier Transform and the shifted transformation:

//to generate a 1D sinusoid
T = 2;
N = 256;
dt = T/256;
t = [0:dt:(N-1)*dt];
f = 5;
y = sin(2*%pi*f*t);
f1 = scf(1); plot(t,y);



//obtaining Fourier Transform and Frequency scale
FY = fft(y);
F = 1/(2*dt);
df = 2*F/256;
f = [-(df*(N/2)):df:df*(N/2 -1)];

//displaying the FFT shifted output with the frequency axis
scf(2);
f2 = scf(2); plot(f, fftshift(abs(FY)));



As expected for a sine function, the resulting transform is a pair of delta-function peaks at the signal frequency (here at ±5 Hz).

Applying the Discrete Fourier Transform in image processing involves decomposing an image into sine and cosine components. The process is similar to taking the Fourier Transform of temporal signals, but for images the variable is not "time" but "pixel position" (so instead of temporal signals, we might call them pixel signals). Consequently, instead of a 1D Fourier Transform, a 2D Fourier Transform must be done.

Answering Guide Questions:

a.) Light from a fluorescent lamp is known to flicker at 120 Hz. What should be the threshold sampling interval for accurate Fourier Transform analysis?

Using the Nyquist theorem, the maximum frequency that can be recovered is the reciprocal of twice the sampling interval: Fmax = 1/(2Δt). Setting Fmax = 120 Hz gives a threshold sampling interval of Δt = 1/(2×120 Hz) ≈ 0.004167 seconds.

b.) What is the effect of increasing the number of samples (N) in the Fourier Transform?

Increasing the number of samples (N) gives more points in the Fourier Transform; with the frequency axis defined as in the code above (where df is kept fixed), this shows up as a widening of the plotted frequency domain.

c.) What is the effect of decreasing the sampling interval (Δt) in the Fourier Transform?

If we decrease Δt while keeping the same number of samples (N), the total time T inevitably decreases. The maximum frequency Fmax = 1/(2Δt) increases and, since N stays the same, Δf increases as well.

d.) What is the effect of fixing the time interval T but increasing the number of samples (N)?

If we increase N while keeping the total time T fixed, we inevitably decrease dt. We then find that there is minimal difference in the peak frequency, and that the peak amplitude is about half of N; this is because the FFT sums over more samples of the signal than at lower N.

I give myself 9/10 neutrinos since, even though I know I gave the right answers, I'm not particularly confident of my grasp of the subject matter. Nevertheless, I acknowledge Jeric, whose grasp of the subject helped me substantially in answering the questions.
posted by poy @ 7:11 PM   0 comments
Tuesday, July 01, 2008
Activity 4: Image Enhancement by Histogram Manipulation
The image I used was the controversial Loch Ness Monster picture that was a huge hit among paranormal enthusiasts worldwide when it was released decades ago - a poor-contrast, grayscale photograph.

In this activity, my goal was to enhance the image either by stretching the histogram such that the highest value is 255 and the lowest is zero, or by linearizing the image's cumulative distribution function or CDF. Using the histogram code from the previous activity, normalizing the histogram gives the probability density function or PDF:

Taking the cumulative sum of the PDF gives the cumulative distribution function or CDF:

Backprojection is done by taking the gray value of each pixel and replacing it with the y-axis value of the CDF at that gray value. We thereby obtain a new PDF, and consequently a new CDF, giving us an enhanced version of the original image:
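A minimal sketch of the backprojection, assuming the grayscale image is first stretched to integer gray levels from 0 to 255 (the file name is a placeholder):

im = double(im2gray(imread('loch.jpg')));
im = round(255*(im - min(im))/(max(im) - min(im)));  // stretch to integer gray levels 0..255
h = zeros(1,256);
for v = 0:255
    h(v+1) = length(find(im == v));                  // histogram
end
pdf = h/sum(h);                                      // normalized histogram (PDF)
cdf = cumsum(pdf);                                   // cumulative distribution function (CDF)
eq = zeros(im);
for v = 0:255
    idx = find(im == v);
    if length(idx) > 0 then
        eq(idx) = cdf(v+1);                          // backproject: gray level -> its CDF value
    end
end
imshow(eq, []);                                      // enhanced image with a linearized CDF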


Notice that the new image has a sharper and more pronounced contrast than the original image.

Using other, non-linear desired CDFs besides the linear one gives a smaller image with good contrast, almost similar to the first enhanced image. Its PDF and CDF are given by:



I performed this activity with the help of Jeric and Julie Ting for the codes. Due to the good correspondence that my enhanced images have with the original, I give myself 10/10 neutrinos! Yey!
posted by poy @ 7:00 AM   0 comments