Tuesday, January 31, 2017

Retraining the Helix Classifier: Attempts and Hurdles

Over the past two weeks I've been working on retraining the Helix classifier to improve its accuracy. When I first trained the classifier, I used 20 positive samples and 10 negative samples. For the next trials I collected 40 positive samples and 500 negative samples. The positive samples came from the Collection E Notre Dame database. The negative samples were retrieved from the UIUC Image Database for Car Detection.

The cascade training went poorly with 40 positives and 500 negatives. It took about 6 minutes and 30 seconds in total, and the process terminated after only 2 stages.

After a few attempts I still wasn't able to detect parts of the helix. I ran multiple trials in which I changed the ratio of positive to negative samples (40 positive with 250 negative, and 40 positive with 80 negative). The trial with 80 negatives was quicker, but it only went through 1 stage of training, and when tested no helixes were detected.

Some trials trained the classifier within seconds; others were lengthy, taking over 5 minutes. I recently found a post on Stack Overflow with suggestions on the sample sizes and properties that gave the best results. The suggested settings had a positive to negative ratio of 2:1. Many people training Haar classifiers generated thousands of samples from a limited supply of positive images by applying small rotations and distortions to the original samples. These transformations can be performed using the opencv_createsamples utility. For each photo it's best to create 200 samples with this technique.

Another thing I learned that should improve my training is to make sure my samples are monochrome and to scale the negatives to 100 x 100. Negative images should be the same size as or larger than the positives. My negatives are currently 100 x 40, much smaller than my positive samples.
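To make the numbers above concrete, here is a rough sketch in Python of how the sample counts and the opencv_createsamples calls could be planned. It only builds the command lines and does not run them. The file names, the 24 x 24 sample window, and the 0.3 radian rotation limits are my own placeholder assumptions, not values from the Stack Overflow post, so treat them as a starting point.

```python
from pathlib import Path

SAMPLES_PER_PHOTO = 200   # samples to generate per original photo (from the post)
POS_TO_NEG_RATIO = 2      # 2:1 positives to negatives (from the post)


def plan_counts(num_photos, per_photo=SAMPLES_PER_PHOTO, ratio=POS_TO_NEG_RATIO):
    """Return (positives, negatives) needed for a given number of original photos."""
    positives = num_photos * per_photo
    negatives = positives // ratio
    return positives, negatives


def createsamples_cmd(image, bg_file, out_dir, num=SAMPLES_PER_PHOTO,
                      width=24, height=24, max_angle=0.3):
    """Build (but do not run) an opencv_createsamples command for one photo.

    width, height, and max_angle are assumed placeholder values.
    """
    vec_file = Path(out_dir) / (Path(image).stem + ".vec")
    return [
        "opencv_createsamples",
        "-img", str(image),            # single positive photo to distort
        "-bg", str(bg_file),           # text file listing the negative images
        "-vec", str(vec_file),         # output file holding generated samples
        "-num", str(num),              # how many distorted samples to create
        "-maxxangle", str(max_angle),  # rotation limits, in radians
        "-maxyangle", str(max_angle),
        "-maxzangle", str(max_angle),
        "-w", str(width),              # sample window width in pixels
        "-h", str(height),             # sample window height in pixels
    ]


if __name__ == "__main__":
    positives, negatives = plan_counts(40)
    print("positives:", positives, "negatives:", negatives)
    # "helix01.jpg", "negatives.txt", and "vec" are hypothetical names.
    print(" ".join(createsamples_cmd("helix01.jpg", "negatives.txt", "vec")))
```

With my 40 photos, plan_counts(40) works out to 8000 positive samples and 4000 negatives. Each command list could then be handed to subprocess.run once the photos and the negatives list file are in place.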

I will apply these techniques in my next round of training trials.

