The world of machine learning is dominated by Python and some Lua, and sure, there are good reasons for that. But I want to see more C++, just for the sake of my love for C++. So I downloaded a Kaggle dataset and wrote a C++ program that learns from it. There's no real advantage to doing so, but hopefully someone can learn from it. And at least I got something to add to my resume.
I chose to tackle the Sign Language Digits Dataset since it's small and needs no extra preprocessing.
You'll need a C++17-capable compiler, xtensor 0.16, and tiny-dnn to follow this tutorial.
First of all, stock tiny-dnn won't work. The copy of xtensor bundled with it is old and lacks features we need for this project, so we need to update it. Updating the xtensor copy in tiny-dnn's folder to upstream's HEAD should do the job. You'll have to do this part by yourself.
But if you're also using Linux, here's how, assuming you've installed tiny-dnn to /usr/local/include/tiny_dnn.
First, we need to install xtensor's dependency, xtl.
```shell
git clone https://github.com/QuantStack/xtl
cd xtl
cmake .
make
sudo make install
```
Then, install xtensor.
```shell
git clone https://github.com/QuantStack/xtensor
cd xtensor
cmake .
make
sudo make install
```
Finally, copy xtensor over, replacing the copy bundled with tiny-dnn.
```shell
sudo rm -r /usr/local/include/tiny_dnn/xtensor
sudo cp -r /usr/local/include/xtensor /usr/local/include/tiny_dnn/
```
Then let's have a look at what kind of data we are processing by opening the zip file downloaded from the Sign Language Digits Dataset.
.npy files are NumPy's storage format; they store ndarrays. Well, firing up Python…
```
Python 3.6.5 (default, May 11 2018, 04:00:52)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> X = np.load("X.npy")
>>> X.shape
(2062, 64, 64)
>>> Y = np.load("Y.npy")
>>> Y.shape
(2062, 10)
>>> Y
array([1., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
```
So there are 2062 samples. The inputs are 64×64 grayscale images, and the outputs are one-hot vectors of length 10.
That’s it! The preparation is done. We can now start developing the code itself.
Let’s start by including the needed headers.
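A sketch of what the header section might look like — the exact list is my assumption, but a program combining tiny-dnn and xtensor's .npy loader would need roughly these:

```cpp
// Tell tiny-dnn to use AVX instructions; must come before the tiny-dnn include.
#define CNN_USE_AVX
#include <tiny_dnn/tiny_dnn.h>

// xtensor: array type and the built-in .npy loader.
#include <xtensor/xarray.hpp>
#include <xtensor/xnpy.hpp>

#include <algorithm>
#include <iostream>
#include <random>
#include <tuple>
#include <vector>
```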
Nothing special here. The `#define CNN_USE_AVX` line tells tiny-dnn to use AVX instructions when possible, which makes the training process much faster. If you are on a machine without AVX (old x86 processors, ARM, etc.), remove that line.
The dataset is stored in two .npy files, X.npy and Y.npy, which store NumPy arrays inside. That's a problem: I can't call numpy from C++, at least not in a straightforward way. Fortunately, xtensor has .npy loading built in. The following code loads a .npy file into an xtensor array.
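A first attempt might look like this, using `xt::load_npy` from `<xtensor/xnpy.hpp>` and naively treating both files as float arrays:

```cpp
#include <xtensor/xnpy.hpp>

int main() {
    // Naive attempt: load both arrays directly as float.
    auto X = xt::load_npy<float>("X.npy");  // 2062 x 64 x 64 images
    auto Y = xt::load_npy<float>("Y.npy");  // 2062 x 10 one-hot labels
}
```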
Well, not really. It turns out that Y.npy stores its values as doubles instead of floats, so an `xt::cast` is needed to convert the loaded values into floats. (Loading a .npy file containing doubles as floats results in an exception.)
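The fixed version might look like this (I'm assuming X.npy really does hold floats; if it also stores doubles, cast it the same way):

```cpp
#include <xtensor/xarray.hpp>
#include <xtensor/xnpy.hpp>

int main() {
    auto X = xt::load_npy<float>("X.npy");

    // Load as double (the actual on-disk type), then cast element-wise
    // to float. xt::cast is lazy; assigning to an xarray<float> evaluates it.
    xt::xarray<float> Y = xt::cast<float>(xt::load_npy<double>("Y.npy"));
}
```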
tiny-dnn accepts a vector of `vec_t`s as input, where `vec_t` is a vector of floats with a custom allocator. And `label_t` is an integer indicating which category a given input belongs to; tiny-dnn converts labels into the corresponding one-hot vectors automatically while training. (No messing with sklearn to generate the one-hot vectors, yay!)
So with the dataset loaded, the next step is to convert it from xtensor to a format tiny-dnn can use (and shuffle it).
`decltype(auto)` is C++'s way of saying: "Dear compiler, please figure out the return type by yourself." Which in this case is a horribly long and complex tuple. 😛
Now, we construct a neural network and train it on the loaded data. I'm going with a modified LeNet-5 model.
I like how tiny-dnn uses the `<<` operator to stack new layers onto the network, just like how `std::cout` pushes output values to the screen.
Note the `onEnumerateEpoch` function and its minibatch counterpart. They are the callbacks called when tiny-dnn finishes a batch and an epoch, respectively. Technically they don't have to be functions; they can be any callables, e.g. a class/struct with an overloaded call operator, a `std::function`, etc. Since tiny-dnn doesn't provide verbosity options like Keras to display training progress, we need to implement the progress display ourselves. This is why we need the callbacks.
After training the network, we can test how well our model is doing and save it for future use. Both functionalities are also built in. Great!
As you might have noticed, I didn't separate the data into training and testing sets; I'm testing on the training set. Eww…
I know, I know, that's horrible. But I don't think LeNet-5 can over-fit on anything, especially something more complex than MNIST.
Compile the code with `g++ main.cpp -o main -pthread -O3 -std=c++17 -march=native` and run it!
Wow, it works really well… I think I need to set aside a proper test set next time. But 99.1% with LeNet-5 is awesome.
Here you are: a deep learning program in C++! Hopefully this tutorial helps. Please send me an email or leave a comment if you have any questions or need any help.