CREU Project Blog: 2016

Monday, December 12, 2016

Recovered Drive!

Great news! Last week we had a terrible accident with one of our hard drives that left us unable to access our data sets. Luckily, a classmate we reached out to knew how to repair the drive, and we were back to normal in a few days. After recovering the drive we made another backup on an external hard drive, just in case we have an issue in the future.

I've created a readme doc detailing the commands, steps, and resources for my research partner to follow while we're on break. During the break I also plan on creating a classifier for the curved and triangular tragus.

Sunday, December 4, 2016

Progress Meeting

Today Morgan and I spoke with Dr. Washington about how much progress we have made to date. We also discussed the milestones we want to tackle over the winter break. The game plan is for Morgan to deliver the lobule classifier by December 18th. Lobules are classified as either attached or unattached. Currently we have classifiers for the helix portion of the ear. Also, we recently had issues with the external hard drive used to store our collection of ear samples. We are currently trying to recover the data sets.

Friday, November 18, 2016

Testing Classifier Performance

As I previously mentioned, I wrote a Python script to test how well our classifier detects narrow helixes. I wanted to start with a small sample size, so I took 5 ear samples from a folder within our research gdrive to test against. The results were a bit discouraging, but I realized there are a few things I can do to better train the cascade.

detect.py (source code)

Results:

Trial 1 - 4 Narrow Helixes Detected

Trial 2 - 7 Narrow Helixes Detected

Trial 3 - 4 Narrow Helixes Detected

Trial 4 - 5 Narrow Helixes Detected

Trial 5 - 5 Narrow Helixes Detected

The results from testing were very inaccurate. There should be only one narrow helix detected in each sample image. One thing I believe will help is providing more samples to train with. This classifier was trained with two positive samples per negative sample, whereas many of the cascades I found in other resources were trained with a high ratio of negative samples to positives. Also, in my test script I set minNeighbors to 5, which means a candidate region must be backed by at least 5 overlapping neighboring detections before it is reported as a narrow helix. I believe that increasing minNeighbors will filter out more false positives and make the detection more accurate.
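A minimal sketch of what a test script along these lines might look like (this is not the actual detect.py). It assumes OpenCV's Python bindings are installed; the function names, the cascade path, the image paths, and the scaleFactor value are all placeholders chosen for illustration. The second helper just tallies which images returned the expected single detection.

```python
def count_detections(cascade_path, image_paths, min_neighbors=5, scale_factor=1.1):
    """Return {image_path: number of regions the cascade reports}."""
    import cv2  # imported here so summarize() below works without OpenCV

    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("could not load cascade: %s" % cascade_path)

    counts = {}
    for path in image_paths:
        image = cv2.imread(path)
        if image is None:
            raise IOError("could not read image: %s" % path)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # A higher minNeighbors keeps only regions backed by more overlapping hits.
        hits = cascade.detectMultiScale(
            gray, scaleFactor=scale_factor, minNeighbors=min_neighbors)
        counts[path] = len(hits)
    return counts


def summarize(counts, expected=1):
    """Split images into those with the expected count and those without."""
    correct = [path for path, n in counts.items() if n == expected]
    wrong = {path: n for path, n in counts.items() if n != expected}
    return correct, wrong
```

Running count_detections several times with different min_neighbors values and comparing the summaries would be one way to check whether raising it actually helps.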

Thursday, November 17, 2016

Testing Our Classifier to Detect Narrow Helixes

To test the performance of our newly created classifier, I created a Python script that runs the classifier against a set of ear samples that were not used to train it. None of the positive samples that went into building our positive vector file were included. Our wonderful mentor Dr. Washington stressed, "Don't test on what you train!" Doing so greatly skews the results and does not give an accurate picture of the classifier's quality.
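One simple way to keep the two sets apart is to split the sample list once, up front, and check that nothing lands in both halves. The sketch below is only an illustration: the function name, the 20% test fraction, and the fixed seed are assumptions, not what we actually used.

```python
import random


def split_samples(paths, test_fraction=0.2, seed=0):
    """Shuffle the sample paths and return disjoint (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = sorted(set(paths))          # drop duplicates, fix the order
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Guard against testing on what we train.
    assert not set(train) & set(test), "a sample landed in both sets"
    return train, test
```

Only the train list would be fed into the positive vector file; the test list is held back for the detection script.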

Classifier Format

The opencv_traincascade tool wrote our cascade out as multiple XML files. Our narrow_helix_cascade contains an XML file for each stage run during training (stage0.xml, stage1.xml, stage2.xml, etc.), params.xml contains the arguments supplied to the opencv_traincascade command, and the classifier.xml file has the features and results from all stages of training.

Training Our Classifier

After constructing our vector file, our next task is to use that file as input for training our classifier. This is done with the opencv_traincascade command line tool.

opencv_traincascade -data classifier -vec narrow_positives.vec -bg narrow_negatives.txt \
  -numStages 3 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 20 \
  -numNeg 10 -w 24 -h 24 -mode ALL -precalcValBufSize 256 \
  -precalcIdxBufSize 256

Parameters
In our case, classifier (the -data argument) is the location where we want the classifier files to be stored. The -vec flag takes the vec file we generated in the previous step. The -bg flag takes the file that lists the paths to all the negative sample files we created.

-precalcValBufSize sets the size, in MB, of the buffer for precalculated feature values (256 MB in our case), and -precalcIdxBufSize does the same for precalculated feature indices. With a larger sample set, more memory would make training faster, but since we have a small sample set and this is one of our first trials we don't need much. The numbers of positive and negative samples are given with -numPos and -numNeg, and the number of stages we want the classifier to go through is given with -numStages. -minHitRate is the minimal desired hit rate for each stage of the classifier, and -maxFalseAlarmRate is the maximal desired false alarm rate for each stage.

When trying to train the classifier, we ran into a few issues.

The attempt above failed at the first stage. At first I was leaving out some of the command's parameters or giving them incorrect values. I also believe that the number of stages affected the outcome, as did this trial's shortage of negative samples relative to positives.

Eventually we were able to get a successful run. (see below)

PARAMETERS:
cascadeDirName: classifier
vecFileName: narrow_positives.vec
bgFileName: narrow_negatives.txt
numPos: 20
numNeg: 10
numStages: 3
precalcValBufSize[Mb] : 256
precalcIdxBufSize[Mb] : 256
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 24
sampleHeight: 24
boostType: GAB
minHitRate: 0.999
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: ALL
Number of unique features given windowSize [24,24] : 261600

===== TRAINING 0-stage =====
<BEGIN
POS count : consumed 20 : 20
NEG count : acceptanceRatio 10 : 1
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 1 seconds.

===== TRAINING 1-stage =====
<BEGIN
POS count : consumed 20 : 20
NEG count : acceptanceRatio 10 : 0.217391
Precalculation time: 0
+----+---------+---------+
| N | HR | FA |
+----+---------+---------+
| 1| 1| 0|
+----+---------+---------+
END>
Training until now has taken 0 days 0 hours 0 minutes 2 seconds.

===== TRAINING 2-stage =====
<BEGIN
POS count : consumed 20 : 20
NEG count : acceptanceRatio 4 : 0.1
Required leaf false alarm rate achieved. Branch training terminated.
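One way to interpret the last line of this log, assuming opencv_traincascade treats the per-stage rates as compounding across stages: with 3 stages, the overall targets work out to minHitRate^3 and maxFalseAlarmRate^3. The sketch below does that arithmetic with the values from the PARAMETERS block. The function name is made up for illustration, and comparing the result with the stage-2 acceptanceRatio is an interpretation of the log, not something the tool prints.

```python
def overall_rates(min_hit_rate, max_false_alarm_rate, num_stages):
    """Compound the per-stage targets over every stage of the cascade."""
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit, overall_false_alarm


# Values taken from the PARAMETERS block of the training log.
hit, false_alarm = overall_rates(0.999, 0.5, 3)
print("overall hit rate target:         %.4f" % hit)          # 0.9970
print("overall false alarm rate target: %.4f" % false_alarm)  # 0.1250

# The log reports acceptanceRatio 0.1 at stage 2, already below the target.
stage2_acceptance_ratio = 0.1
stops_early = stage2_acceptance_ratio <= false_alarm
print("training can stop before stage 2:", stops_early)
```

Under that reading, training ended early because the two finished stages already rejected enough negatives, which is plausible with only 10 negative samples.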

Constructing a Vec File Based on Positive Narrow Helix Samples

After creating our description file of positive and negative samples, the next step towards building our classifiers includes packing the positive samples into a vec file.

Building a vector file is done via the opencv_createsamples utility. opencv_createsamples also allows us to generate a large number of samples from a small number of input images by applying distortions and transformations to positive samples.

We wrote shell scripts to automate using a few of the opencv command line tools. The shell script for createsamples is below.
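For readers who prefer Python over shell, here is a rough sketch of the same kind of invocation (it is not our shell script). The flags -info, -vec, -num, -w and -h are standard opencv_createsamples options; the description file name narrow_positives.info and the function names are assumptions made for the example.

```python
import subprocess


def build_createsamples_command(info_file, vec_file, num_samples,
                                width=24, height=24):
    """Build the argument list for packing positives into a vec file."""
    return [
        "opencv_createsamples",
        "-info", info_file,        # description file of positive samples
        "-vec", vec_file,          # output vec file
        "-num", str(num_samples),  # number of samples to pack
        "-w", str(width),          # should match -w given to opencv_traincascade
        "-h", str(height),         # should match -h given to opencv_traincascade
    ]


def run(command):
    """Run the tool and return its exit code (requires OpenCV's CLI tools)."""
    return subprocess.call(command)


command = build_createsamples_command(
    "narrow_positives.info", "narrow_positives.vec", 20)
```

Passing the arguments as a list avoids the quoting and line-continuation mistakes that are easy to make in a long shell command.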

Part of our vec file generated

Creating Our Negative Samples + Negative Description File

When researching different ways to develop negative samples, we found that we obtain the best results by embedding a slight variant of the feature we wish to detect in an image that otherwise contains none of its characteristics.
Negative images can be anything, but the classifier is more accurate if they include a variant of a positive sample. Ideally, negative images would look exactly like the positive samples, except that they wouldn't contain the object we want to recognize.

Using GIMP, an image manipulation program, we placed images of ears in the foreground of a background/backdrop.

Examples of Negative Samples:

Negative Description File:
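The negative description file is just a text file with one image path per line, so it can be generated rather than typed by hand. A small sketch, assuming the negatives sit in a single folder; the function name and the list of extensions are illustrative choices.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def write_background_file(negatives_dir, output_path):
    """Write one negative image path per line; return how many were written."""
    names = sorted(
        name for name in os.listdir(negatives_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS))
    with open(output_path, "w") as handle:
        for name in names:
            handle.write(os.path.join(negatives_dir, name) + "\n")
    return len(names)
```

The resulting file is what gets passed to opencv_traincascade with the -bg flag, and the returned count is a handy upper bound for -numNeg.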

Friday, November 4, 2016

Creating our description file of positive narrow helix samples

After collecting positive training images of narrow helixes, we cropped our sample images of ears to just the portion that contained the helix. This cropping was done using an open source object marker tool written in Python.

The object marker lets us specify the region of interest by drawing a bounding rectangle in each positive image, and then produces a text file listing the coordinates of the helix in each image.

This data will be used to construct our positive vector file to eventually train our classifier.
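For reference, the description file format that opencv_createsamples reads puts one image per line: the path, the number of marked objects, then x, y, width and height for each object. The sketch below writes and reads such lines; the helper names and the sample path are hypothetical, and it assumes paths contain no spaces.

```python
def format_info_line(image_path, boxes):
    """Build one description-file line from (x, y, width, height) boxes."""
    parts = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        parts.extend(str(int(value)) for value in (x, y, w, h))
    return " ".join(parts)


def parse_info_line(line):
    """Return (image_path, [(x, y, width, height), ...]) for one line."""
    fields = line.split()
    path, count = fields[0], int(fields[1])
    numbers = [int(value) for value in fields[2:]]
    if len(numbers) != 4 * count:
        raise ValueError("expected %d boxes in line: %r" % (count, line))
    boxes = [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return path, boxes
```

A quick round trip through both helpers is an easy way to catch a malformed line before it reaches the vec-file step.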

Thursday, November 3, 2016

Next Steps

The Next Steps for our project:

Work through Haar-training tutorials

http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
https://pythonprogramming.net/haar-cascade-object-detection-python-opencv-tutorial/
https://www.cs.auckland.ac.nz/~m.rezaei/Tutorials/Creating_a_Cascade_of_Haar-Like_Classifiers_Step_by_Step.pdf
http://note.sonots.com/SciSoftware/haartraining.html#z97120d9

Generate XML file for Helix haartraining
Verify + Test Helix Classifier by feeding dummy images

Test classifier against sample images like trucks and other vehicles to ensure matches aren't returned.

Introduction to Haar Cascades

Now that we're starting to build our extraction tool, we needed more background information to better understand how Haar cascades work.

Background Info:

A Haar cascade is used to detect objects within images. This feature-based classifier was first introduced with the Viola-Jones algorithm, described in the paper "Rapid Object Detection using a Boosted Cascade of Simple Features" by Paul Viola and Michael Jones. The detection method is based on machine learning: a cascade function is trained from negative and positive images. After the cascade function is trained, it can be used to detect the desired object within sample images.

Viola Jones Algorithm

The Viola-Jones detection algorithm depends on several components. "Haar features" are used to detect the presence of a desired object in a sample. An "integral image" is an intermediate representation of the original image that allows a detector to evaluate features quickly: it is computed with a few operations per pixel, and once it is computed, any Haar feature can be evaluated in constant time regardless of its position or scale in the image. "AdaBoost" is another vital part of the algorithm and is used for feature selection. AdaBoost increases the speed of classification by focusing on a small subset of Haar-like features and excluding irrelevant ones. Cascading, as previously mentioned, is one of the algorithm's major contributions to object detection. Cascading increases the speed of the classifier by focusing on the critical portions of the image. Non-promising regions of a sample are disregarded, and increasingly complex processing is applied only once a feature of interest is found.
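To make the integral image idea concrete, here is a small pure-Python sketch (a toy illustration, not OpenCV's implementation). The function names are my own. Once the table is built, the sum of any rectangle takes four lookups, so a two-rectangle Haar-like feature costs the same no matter how large the window is.

```python
def integral_image(pixels):
    """Return a (rows+1) x (cols+1) table; ii[r][c] sums pixels above and left."""
    rows, cols = len(pixels), len(pixels[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += pixels[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window (an edge-like Haar feature)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


pixels = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
ii = integral_image(pixels)
print(rect_sum(ii, 1, 1, 2, 2))          # 6 + 7 + 10 + 11 = 34
print(two_rect_feature(ii, 0, 0, 4, 4))  # 60 - 76 = -16
```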

Explored Sources:
"Rapid Object Detection using a Boosted Cascade of Simple Features"
Face Detection using Haar Cascades

Wednesday, November 2, 2016

Helix Distinction: Wide vs. Narrow

The helix is located in the upper portion of the ear that consists of cartilage and resembles a y-shaped curve (see diagram of ear below).

For feature extraction on the helix portion of the ear, we created two categories of helix: wide and narrow. We distinguish between the two by looking at the amount of cartilage visible in the sample. Samples where the helix appears to have a lot of cartilage are considered wide, whereas helixes that are small with a sharply defined outer rim are considered narrow in our classifier.

Tuesday, October 18, 2016

Research Group Presentation

Yesterday Morgan and I presented the progress we have made to the rest of the researchers. The discussions about the project gave us more insight into the direction we can take after we complete our initial objectives.

Major points taken:

Utilize our classifier in a demo security application (for validation)
Record and compare the accuracy between multiple races (have African-American, white, Asian, etc. participants)
Improve our classifier to handle participants with higher levels of melanin (if time permits)

Our Presentation is Below:

Friday, October 14, 2016

Weekly Research Recap Meeting

October 13, 2016 Meeting Breakdown

In our weekly research meeting we discussed preparing for our large research group presentation on the 17th. Morgan and I will prepare slides that introduce ourselves and the project, and give the other researchers an overview of the progress we have made.

In addition, we were able to talk to the main lab technician to get a dedicated computer to build our classifiers. We also switched from trying to build a Haar cascade classifier on anti-helixes to helixes. Our next objective is to find 20 positive samples of wide and narrow helixes in our database.

For research purposes we're constraining what we consider the helix to be to just the top portion of the ear. We discussed the possibility of using edge detection to distinguish between wide and narrow helixes, but for the moment we will be using visual approximations.

Friday, October 7, 2016

Face Detection Sample Program

Morgan and I met up this evening to go through an OpenCV face detection sample program. We were able to get things working and learn more about OpenCV and cascades.

The detection program was implemented in Python, and the source code with notes is below:

Program Output:

Thursday, October 6, 2016

Weekly Research Recap Meeting

Today we had our weekly research meeting to go over the progress we made over the last few weeks. Dr. Washington discussed different meeting times and schedules, and we agreed on weekend check-ins to ensure that we're staying on track and completing our milestones.

At the moment our main goal is to find image examples of skinny and fat anti-helixes and create Haar cascade classifiers for skinny anti-helixes. If possible, Morgan and I would like to complete these objectives and present our findings during the large research meeting on the 17th of October.

Wednesday, October 5, 2016

Research Partner Setup

Today I met up with my research partner, Morgan, so that she could get the proper environment set up for our project. The primary task we completed together was installing and configuring a Python and OpenCV environment. After getting this set up, we went through a sample OpenCV Python program to verify everything was working properly. The next step is to go through a more involved sample to get more familiar with the library's methods.

Thursday, September 29, 2016

Visual Biometrics

There are many different methods of gathering biometrics. In the research we're conducting we'll be examining ears and finding ways to detect and extract discriminating features, so the majority of our project will deal with visual biometrics.

To gain some more background on visual biometrics, I looked through parts of "Matching Shape Sequences in Video with an application to Human Movement Analysis", a paper written by Ashok Veeraraghavan, Amit Roy Chowdhury and Rama Chellappa.

Key things I learned from this paper were:
- Parametric vs Non-Parametric Statistics and models.
- Biometrics can be taken from a person's effect on the environment (e.g. a person's shadow and silhouette can be used as a unique feature for recognition).

More info on Parametric vs Non-Parametric Stats:
Parametric versus non-parametric

I also started wondering whether it is possible to use solely visual biometrics to distinguish between twins and came across this paper: A Study of Multibiometric Traits of Identical Twins

Preparation: Getting Set Up With OpenCV

For the past week I've been trying to familiarize myself with all the aspects of our research and the resources that we'll need to complete the project.

The primary tool we'll be using to build our ear classification scheme is OpenCV, a library for performing computer vision related tasks.

Installing OpenCV on a Mac isn't as straightforward as on other platforms like Windows.

My research partner and I have followed the tutorials below to get set up with OpenCV:

http://www.mobileway.net/2015/02/14/install-opencv-for-python-on-mac-os-x/

https://jjyap.wordpress.com/2014/05/24/installing-opencv-2-4-9-on-mac-osx-with-python-support/