Machine Learning Algorithms for Computer Vision: 2014

Progress update (2-09-14)

Here's what I did today:

Streamlined the code: removed unnecessary steps and modified some others to save time.
Generalized the loading of test images to accept any kind of file name, so it now supports images that belong to a known class as well as images that belong to no class.
Modified the data save procedure so results are easier to evaluate at a glance.
Added tracking of classification results; this data is then displayed and saved in a precision/recall table.
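
As a rough illustration of the precision/recall table, here is a minimal Python sketch. It assumes each test image has a true label and a predicted label, with None standing for "belongs to no class"; the function name and the sample labels are made up for the example and are not taken from the project code.

```python
from collections import Counter

def precision_recall_table(true_labels, predicted_labels):
    """Return {class: (precision, recall)} from paired label lists.

    None means "no class" (an unknown face). It gets no row of its own,
    but a wrong guess on an unknown face still counts as a false positive.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            if truth is not None:
                tp[truth] += 1
            continue
        if guess is not None:
            fp[guess] += 1   # predicted a class the image does not belong to
        if truth is not None:
            fn[truth] += 1   # missed the class the image does belong to

    table = {}
    for cls in sorted(set(tp) | set(fp) | set(fn)):
        predicted = tp[cls] + fp[cls]
        actual = tp[cls] + fn[cls]
        precision = tp[cls] / predicted if predicted else 0.0
        recall = tp[cls] / actual if actual else 0.0
        table[cls] = (precision, recall)
    return table

# Hypothetical labels: the last image belongs to nobody in the database.
truth = ["anna", "anna", "ben", "ben", None]
guess = ["anna", "ben", "ben", "ben", "anna"]
for cls, (precision, recall) in precision_recall_table(truth, guess).items():
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f}")
```

Keeping the counts per class (rather than one overall accuracy) is what makes it possible to see which faces get confused with which.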

Progress (1-09-14)

Today I managed to learn Python and write a program that normalizes, crops and reshapes all the images in a folder. This will be used for both the Java database and the Python testing database.

Progress (30-08-14)

While I've left this blog a bit abandoned since the break, the research has continued during these 6 weeks of the semester. Here's a list of new features the code has gained:

I've changed the classification algorithm from KNN to SVM. Now I train a single SVM for each class, using all the faces in the database that don't belong to the class as negative examples.
All the intermediate results are saved (as CSV), so when data that is needed has been calculated previously, it can be reused. This was also done with the smartphone implementation in mind, where we can transfer just the projection matrix and avoid long calculations.
I've started using the Yale database of faces to get a standardized result that can be compared with other algorithms (150 faces, 15 classes).
The code has been streamlined, removing and changing some operations to make it faster and easier to interpret.
I also wrote testing code to analyse the optimal number of eigenvectors to use. It runs the code with different numbers of eigenvectors and outputs save files with the results for later analysis.

At this point, the SVM has something missing. I think the interpretation of the results is failing, as before I got an accuracy of around 60% while now I always get 6.6%. I'm currently working on solving this.
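
One observation that may help here: 6.6% is almost exactly 1/15, which is the accuracy of always answering with the same class on 150 faces spread evenly over 15 classes. The Python sketch below shows that arithmetic, together with one common way of combining one-SVM-per-class outputs (pick the class with the largest decision value). Whether this matches the actual bug is only a guess; the function names and the scores are hypothetical.

```python
def predict_one_vs_rest(scores_by_class):
    """Pick the class whose SVM returned the largest signed decision value.

    scores_by_class maps class label -> decision value of that class's SVM
    (positive means "looks like this class", negative means "looks like the rest").
    """
    return max(scores_by_class, key=scores_by_class.get)

def accuracy(true_labels, predicted_labels):
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Hypothetical decision values for one test face.
print(predict_one_vs_rest({"anna": -0.4, "ben": 0.7, "carl": -1.2}))  # ben

# Chance level: 15 classes with 10 faces each, always guessing class 0.
labels = [cls for cls in range(15) for _ in range(10)]
constant_guess = [0] * len(labels)
print(round(accuracy(labels, constant_guess), 3))  # 0.067
```

If the predictions collapse to a single class like this, the first place to look would be how the 15 outputs are compared, for example whether class labels or thresholded 0/1 answers are being compared instead of the raw decision values.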

I've also started writing (using LaTeX). With the document structure finished and all the required packages and files created, the content seems to be progressing.

I will use the following week for three things: implementing a Python algorithm to test my algorithm against, fixing the SVM, and continuing the writing.

Progress (2-06-2014)

These past two weeks I’ve been trying different image sizes to solve the eigenvalues problem. I found out that with a 62x96 image it takes around 2-3 hours to run the program. I've decided to leave it at this size for now, but I still have to check the effect this image reduction has on accuracy.

I have also implemented a way to save the eigenvectors and eigenvalues to a .csv file. This serves two purposes. First, once I’ve calculated the values for the first time, I don’t have to recalculate them to debug and streamline the code (unless I change the image database). Second, when this code is moved to the Android platform, the .csv with the values can be imported, so the phone won't be tied up in a 3-hour-long calculation, assuming it even has sufficient processing power to calculate these values.
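
The caching idea can be sketched in a few lines of Python with NumPy. This is not the project code (which uses OpenCV from Java); the file names, the function name and the tiny 2x2 matrix are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

def load_or_compute_eigen(matrix, cache_dir):
    """Return (eigenvalues, eigenvectors, loaded_from_cache) for a symmetric matrix.

    The first call computes the decomposition and writes two CSV files;
    later calls read the files back instead of recomputing.
    """
    values_path = os.path.join(cache_dir, "eigenvalues.csv")    # hypothetical names
    vectors_path = os.path.join(cache_dir, "eigenvectors.csv")
    if os.path.exists(values_path) and os.path.exists(vectors_path):
        values = np.loadtxt(values_path, delimiter=",", ndmin=1)
        vectors = np.loadtxt(vectors_path, delimiter=",", ndmin=2)
        return values, vectors, True
    values, vectors = np.linalg.eigh(matrix)  # eigh: symmetric input, ascending eigenvalues
    np.savetxt(values_path, values, delimiter=",")
    np.savetxt(vectors_path, vectors, delimiter=",")
    return values, vectors, False

with tempfile.TemporaryDirectory() as cache_dir:
    scatter = np.array([[2.0, 1.0], [1.0, 2.0]])
    first = load_or_compute_eigen(scatter, cache_dir)
    second = load_or_compute_eigen(scatter, cache_dir)
    print(first[2], second[2])  # False True
```

Note that this sketch does not detect a changed image database: the cached files would have to be deleted by hand whenever the images change.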

Progress (9-05-2014)

Fabio made me realize today that the matrices I am using are too large. Getting the eigenvalues of a 55000x55000 matrix would take too much time: even with enough memory, it would be in the range of years. This means the approach I was thinking of following, just saving the matrices, won't work; I need to reduce them to about 10000x10000.

The simplest way to reduce the matrices is to reduce the images. A 187x296px image results in a 55352x1 vector, and that eventually leads to a 55352x55352 matrix. I have made a database of reduced photos and I'm testing the code against its data. This reduction will probably decrease the accuracy of the algorithm, so I will study different ways of reducing the size of the matrix.
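
To put numbers on this, here is a small Python sketch of the size arithmetic. It assumes the square matrix is stored as 8-byte doubles, which may not match the element type actually used in the code, and the 100x100 image is just a hypothetical size that hits the 10000x10000 target.

```python
def covariance_footprint(width, height, bytes_per_value=8):
    """Return (pixels, bytes) for the pixels x pixels matrix of a width x height image.

    bytes_per_value=8 assumes double precision; use 4 for single precision.
    """
    pixels = width * height
    return pixels, pixels * pixels * bytes_per_value

for width, height in [(187, 296), (100, 100)]:
    pixels, n_bytes = covariance_footprint(width, height)
    print(f"{width}x{height}: {pixels} pixels, {n_bytes / 2**30:.2f} GiB")
# 187x296: 55352 pixels, 22.83 GiB
# 100x100: 10000 pixels, 0.75 GiB
```

The memory grows with the square of the pixel count, so halving both image dimensions cuts the matrix to a sixteenth of its size.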

An idea I've been thinking about lately is dividing the face into regions and identifying each region separately, then computing a "score" for each face in the database and selecting the face with the biggest score. The score of each feature would be weighted based on experiments, giving a higher score if a difficult part of the face is recognized. For example, eyes would have a higher score than foreheads since, at first glance, foreheads don't seem to offer much information. This method would use a subspace for each feature, so instead of 1 really big matrix there would be a bunch of smaller ones.
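
A minimal Python sketch of the scoring step, assuming each region has already been matched to some identity by its own recognizer. The region names, the weights and the identities are all invented for the example; real weights would have to come from experiments.

```python
# Hypothetical weights: regions assumed to be more informative get a higher weight.
REGION_WEIGHTS = {"eyes": 3.0, "nose": 2.0, "mouth": 2.0, "forehead": 0.5}

def best_match(region_matches, weights=REGION_WEIGHTS):
    """Combine per-region matches into one weighted score per identity.

    region_matches maps region name -> identity recognised for that region.
    Returns (winning identity, {identity: total score}).
    """
    scores = {}
    for region, identity in region_matches.items():
        scores[identity] = scores.get(identity, 0.0) + weights[region]
    winner = max(scores, key=scores.get)
    return winner, scores

winner, scores = best_match(
    {"eyes": "anna", "nose": "ben", "mouth": "anna", "forehead": "ben"}
)
print(winner, scores)  # anna {'anna': 5.0, 'ben': 2.5}
```

Here two regions vote for each person, but the eyes carry enough weight to decide the result.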

Another one is averaging the pixels: take a pixel, compute the average value of all the surrounding pixels and store that average. Most probably this would be calculated as a weighted average, so that no edges are missed and accuracy is compromised the least. This will reduce the matrix sizes, but it is essentially the same as reducing the image: the quality will be lower (fewer px/in), so accuracy will be affected.
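
The simplest version of this can be sketched with NumPy as a plain block average. This is an unweighted mean over non-overlapping blocks, not the weighted average described above, and the function name is made up for the example.

```python
import numpy as np

def block_average(image, factor):
    """Shrink a 2-D grayscale image by averaging factor x factor pixel blocks.

    Rows and columns that do not fill a whole block are trimmed off.
    """
    height, width = image.shape
    h_trim = height - height % factor
    w_trim = width - width % factor
    trimmed = image[:h_trim, :w_trim].astype(np.float64)
    # Split each axis into (number of blocks, block size), then average per block.
    blocks = trimmed.reshape(h_trim // factor, factor, w_trim // factor, factor)
    return blocks.mean(axis=(1, 3))

small = block_average(np.arange(16).reshape(4, 4), 2)
print(small)
# [[ 2.5  4.5]
#  [10.5 12.5]]
```

With a factor of 3, a 296-row by 187-column image comes out as 98 by 62, so the pixel vector shrinks by roughly a factor of 9.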

The problem with these two methods is that I'm still implementing Fisherfaces to use as a control algorithm, so I can't really modify the algorithm in such a big way, at least for the first method, or it won't serve as a control algorithm. However, they might prove useful for the real algorithm.

Progress (7-05-2014)

It's been a month since my last update. With the mid-semester week and all the midSem exams/assignments I haven't been able to work on my thesis.

Today I got back to work. I managed to solve the memory issue I had by increasing the virtual memory of my PC to 8GB; for some reason it was set at 512MB.

With that change, the memory issue I had (the product of [55000x9] * [9x55000]) is solved. Now it runs out of heap memory when calculating the eigenvalues of the resulting matrix.

I'm going to try saving the 55000x55000 matrix to a file. That way I can obtain the eigenvalues in a new process (or a different, faster one) and go "around" the memory problem.
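
A standard alternative from the eigenfaces literature (Turk & Pentland, [15] in the reading list) is to avoid the big matrix altogether. If A is the 55000x9 matrix with one mean-centred image per column, then every eigenvector v of the small 9x9 matrix A^T A gives an eigenvector A v of the big matrix A A^T with the same eigenvalue. The NumPy sketch below illustrates this under that assumption about A; it is not the project code, and the function name and the random test data are made up.

```python
import numpy as np

def top_eigenvectors_via_gram(A, tol=1e-10):
    """Eigenpairs of A @ A.T computed from the much smaller A.T @ A.

    A has shape (n_pixels, n_images), one mean-centred image per column.
    Returns (eigenvalues, eigenvectors) sorted from largest to smallest,
    with numerically zero eigenvalues dropped.
    """
    gram = A.T @ A                                   # (n_images, n_images)
    values, small_vectors = np.linalg.eigh(gram)     # ascending order
    order = np.argsort(values)[::-1]
    values, small_vectors = values[order], small_vectors[:, order]
    keep = values > tol * values[0]
    values, small_vectors = values[keep], small_vectors[:, keep]
    big_vectors = A @ small_vectors                  # (n_pixels, kept)
    big_vectors /= np.linalg.norm(big_vectors, axis=0)  # unit-length columns
    return values, big_vectors

# Random stand-in data: 2000 "pixels", 9 "images".
rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 9))
A -= A.mean(axis=1, keepdims=True)                   # subtract the mean image
values, vectors = top_eigenvectors_via_gram(A)
print(vectors.shape)
```

With 9 images there are at most 9 non-zero eigenvalues anyway (8 after subtracting the mean image), so nothing is lost by working with the small matrix.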

Progress (1-04-2014)

Today I finished migrating my project to a Windows machine. Talking with Fabio led me to discover some errors in the code: it got stuck when I was using the OpenCV function mulTransposed(). When I fixed this, a new error appeared: "OpenCV Error: Insufficient memory (Failed to allocate 3665441028 bytes)". I have now started researching how to solve this error.

About Fisherfaces

I started to implement Fisherfaces about a month ago. At this moment, I have successfully implemented PCA for small matrices. The biggest simple matrix (not a proper image) I've tested is 5x5.

When it comes to proper images, I've found that my computer might not be good enough to obtain the eigenvalues of big matrices. When I run the code with a set of sample faces, my computer freezes after starting the PCA; more specifically, it freezes when starting the eigenvalue decomposition of the matrices. I plan to run the code on a different machine, to see if the error is a code error or if it is just my computer not being able to handle it. If the latter is true, I would consider obtaining the eigenvalues via Matlab.

Literature

Over this summer I've read this list of CV papers:

[1] The SVM-minus Similarity Score for Video Face Recognition - L. Wolf & N. Levy
[2] In Defense of Sparsity Based Face Recognition - W. Deng, J. Hu & J. Guo
[3] Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-based Classification - E. Ortiz, A. Wright & M. Shah
[4] Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild - Z. Cui et al.
[5] Towards Pose Robust Face Recognition - D. Yi, Z. Lei & S. Li
[6] Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer - L. Zhuang et al.
[7] Facial Feature Detection Using Haar Classifiers - P. Wilson & J. Fernandez
[8] Robust and Efficient Parametric Face Alignment - G. Tzimiropoulos et al.
[9] A Practical Transfer Learning Algorithm for Face Verification - X. Cao et al.
[10] Self-taught Learning: Transfer Learning from Unlabeled Data - R. Raina et al.
[11] Face Recognition: A Literature Survey - W. Zhao et al.
[12] PCA vs. LDA - A. Martinez & A. Kak
[13] Local Linear Regression (LLR) for Pose Invariant Face Recognition - X. Chai et al.
[14] Toward Pose-Invariant 2-D Face Recognition Through Point Distribution Models and Facial Symmetry - D. Gonzalez-Jimenez & J. Alba-Castro
[15] Face Recognition Using Eigenfaces - M. Turk & A. Pentland
[16] Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection - P. Belhumeur et al.
[17] Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition - J. Yang et al.
[18] Face Description with Local Binary Patterns: Application to Face Recognition - T. Ahonen et al.
[19] Face Authentication Using Adapted Local Binary Pattern Histograms - Y. Rodriguez
[20] Robust Face Recognition via Sparse Representation - J. Wright et al.
[21] Extended SRC: Undersampled Face Recognition via Intraclass Variant Dictionary - W. Deng et al.
[22] Gabor Feature Based Classification Using the Enhanced Fisher Linear Discriminant Model for Face Recognition - C. Liu et al.
[23] Robust, Accurate and Efficient Face Recognition from a Single Training Image: A Uniform Pursuit Approach - W. Deng et al.

Out of those, we decided to implement the Sparsity Based algorithm discussed in [2]. To do that, first I will implement Fisherfaces, as practice and to use as a comparator with the algorithm I'll be writing.

Work over the summer

I started working on this project in December 2013. First, I had to learn Android programming. I knew iOS and Java programming, so after a few tutorials and guides I was able to make a rudimentary app. After that, I moved on to what is going to be the backbone of the project, OpenCV.

Open Source Computer Vision (OpenCV) is an open source library for computer vision. It's built with a focus on speed and efficiency, and it's written in C++. It has a variety of wrappers, including Java and Android wrappers, which are the ones that I will be using.

OpenCV was a bit more challenging than Android, as I had to learn how to link native code (C++) with the Java code, while understanding the C++ code. But by the end of December I had managed to build an app that detects faces in real time. Then I moved on to choosing what algorithm I would use.

As I mentioned before, I will be using a sparsity-based face recognition algorithm. I will also implement the Fisherfaces algorithm to use as a control algorithm. I spent February implementing Fisherfaces.

Aim of the Thesis

My aim for this thesis is to create an Android app. This app will get a live feed of what the user is seeing. Through that live feed, it will detect faces and compare them against a database of people known by the user. If the detected face is known, the phone will play an audio file tailored to each user's needs. Normally this would be the name of the person in front of the user. After the app is made, we will explore the possibilities of expanding the algorithm to object detection.

And it Begins...

This post marks the start of my thesis at the University of Sydney. I will be researching machine learning algorithms for computer vision and posting the progress I make on this blog.
