Machine learning and computer vision have always been fields of computer science that interest me. I also have some experience writing programs that can classify, for example, paintings or beer labels. Handwriting recognition was also on my list of technical challenges.

And I’m glad to say that it’s something new that I can strike off the list! For this I used machine learning, in particular an SVM classifier, OpenCV and some C++ code.

Support Vector Machines

I’m not going to go into a lot of detail about how an SVM works, as there is a vast collection of good resources available on the internet. O’Reilly has a book called “Programming Collective Intelligence” that I highly recommend.


The only thing we need to know for this article is that SVMs are supervised learning models that analyse data and are well suited to recognizing patterns.

The word “supervised” is important to note, because it means that we need to feed the model examples of numbers together with what they are. After a lot of good examples it should be able to recognize the numbers autonomously.

MNIST database

So, as I pointed out, the SVM needs a lot of examples it can learn from. Creating those examples manually would be far too time-consuming, so I chose to use the MNIST dataset.

This is a large dataset of handwritten digits that is commonly used to train and test this kind of classifier.


It is regarded as best practice to split your dataset, using one part for learning and the other for testing. For this experiment I used 70% to learn from and 30% for testing.

The features that make up a number

An SVM classifier needs features that it uses to recognize different patterns. Features are simply numbers. In this case I will be using the colour values of the individual pixels.


The images in the MNIST dataset are 28 pixels wide and 28 pixels high. This results in 784 features (28x28) that we will provide to the SVM, together with the class (the number) this group of features belongs to.

Converting our images into features

I’ve used the OpenCV library as I have the most experience with it. It also contains numerous machine learning algorithms including SVM.

To train an SVM on images we need to create a training matrix. In this matrix each row corresponds to one image, and each column in that row corresponds to an image feature.

So the first step we take is to convert our 2-dimensional image to 1 dimension.


In OpenCV 3 it is relatively easy to convert a 2D matrix (an OpenCV Mat) to 1D.

// read image file (grayscale)
cv::Mat mat2d = cv::imread("test.jpg", 0);

// convert 2d to 1d (identifiers can't start with a digit in C++)
cv::Mat mat1d = mat2d.clone().reshape(1, 1);

From features to classes

When training we will also need a second matrix in which each row corresponds to the class, in other words the number that the row of features represents.


/**
 * Train
 **/

// build features and labels matrixes
...

// create SVM classifier and set its parameters
cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
svm->setType(cv::ml::SVM::C_SVC);
svm->setKernel(cv::ml::SVM::POLY);
svm->setTermCriteria(cv::TermCriteria(cv::TermCriteria::MAX_ITER, 100, 1e-6));
svm->setGamma(3);
svm->setDegree(3);

// train svm classifier by passing training (features) and labels matrixes
svm->train(featuresMat, cv::ml::ROW_SAMPLE, labelsMat);

// store its knowledge in a yaml file
svm->save("knowledge.yml");

How to determine which SVM parameters are best?

While there is certainly some theory available that you can rely on, in reality I always prefer to follow the same workflow when determining which parameters are best suited for my problem.

Test, tweak the parameters, test again, and repeat the whole process until you reach the point where the classifier makes few (ideally no) mistakes.

Fortunately, training an SVM is extremely fast, certainly if you compare it with training an Artificial Neural Network (on the CPU) for example. In practice the process is rather painless, or you must have done something horribly wrong ;-)

From an image to a class

To classify an image we need to repeat the same step of converting it to a 1-dimensional matrix, pass it to the SVM and ask it to predict which class (number) the image belongs to.


/**
 * Classify
 **/
// load yaml data
cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::load("knowledge.yml");


// Read image into grayscale matrix and convert it to 1D
...

// predict and output which number is seen (predict returns a float)
int predicted = (int)svm->predict(mat1d);
std::cout << std::endl << "Number -> " << predicted << std::endl << std::endl;

An honourable mention: TinyDir

One of the things I needed to do was traverse directories with test and training files. At the time of writing, the C++ standard library doesn’t come with a filesystem library. There is a proposal on the table, though, so this may change in the future.

I know that Boost has a good library for interacting with the filesystem, but I wanted something lightweight, preferably a single header to include. TinyDir (a C library) fit the bill perfectly.

For example, this is the code to list all the directories in a folder:

#include <stdio.h>
#include "tinydir.h"

// open our root directory
tinydir_dir dir;
tinydir_open(&dir, "/root_dir");

// iterate over files/directories
while (dir.has_next)
{
    tinydir_file file;
    tinydir_readfile(&dir, &file);

    // check if the "file" is a directory
    if (file.is_dir)
    {
        // never pass untrusted strings as the printf format string
        printf("%s\n", file.name);
    }

    // move on to the next file
    tinydir_next(&dir);
}

// close the resource
tinydir_close(&dir);

Dead simple and without getting in the way.

Sourcecode

As in most cases, the complete (C++) source code is available on GitHub. The code should compile on Linux, BSD and OSX without much trouble. A Makefile is also included.

Please keep the license (AGPL v3) in mind, and note that I don’t provide any direct support!