Building a Rock, Paper, Scissors bot using Hierarchical Temporal Memory that is as good as an LSTM

The story begins when I saw a post on Numenta’s forum: make an AI (or rather, an ML algorithm) play rock, paper, scissors against a human player, learn the human player’s pattern, and by that dominate the human. Then I thought to myself: why not write two AIs, one in HTM and one using LSTM, and have them compete?

The idea is to have two AI players, one implemented in HTM and the other as an RNN (LSTM/GRU). Let them learn and predict their opponent’s next move, then act accordingly. Sounds fun and straightforward enough.

The RNN Agent

In this post, unless specified otherwise, I’ll be using the broad definition of an RNN – a neural network layer that carries a hidden state from previous timesteps.

Agents in this game are expected to do two things: first, learn the opponent’s pattern; second, predict the opponent’s next move based on the opponent’s past move history. To keep things simple, I’ll be using my favorite DL framework – tiny-dnn. The RNN agent will be a simple 2-layer network receiving a one-hot encoded vector as input.
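For concreteness, here is a minimal sketch of what that one-hot encoding can look like; the Move enum, its ordering, and the encodeMove helper are my own illustration, not the original code:

```cpp
#include "tiny_dnn/tiny_dnn.h"

// Hypothetical move encoding: 0 = rock, 1 = paper, 2 = scissors.
enum Move { ROCK = 0, PAPER = 1, SCISSORS = 2 };

// Encode a move as a one-hot vec_t (tiny-dnn's vector-of-floats type).
tiny_dnn::vec_t encodeMove(Move m) {
    tiny_dnn::vec_t v(3, 0.0f);
    v[static_cast<size_t>(m)] = 1.0f;
    return v;
}
```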

So, let’s define the network.
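The original listing isn’t reproduced here, but a comparable 2-layer definition, assuming tiny-dnn’s recurrent_layer/gru API and a hypothetical hidden size of 16, might look like this:

```cpp
using namespace tiny_dnn;

constexpr size_t seq_len = 3;   // backprop-through-time window
constexpr size_t hidden  = 16;  // hidden state size; my own guess

network<sequential> nn;
// A GRU wrapped in a recurrent_layer over the 3-way one-hot input,
// followed by a fully connected readout over the 3 possible moves.
nn << recurrent_layer(gru(3, hidden), seq_len)
   << fully_connected_layer(hidden, 3)
   << softmax_layer();
```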

Then the tricky part… The RNN is required to predict and learn at the same time. Yet, because tiny-dnn is a static-graph library, the network can only be trained every 3 steps (the seq_len parameter). So for each step, we save the input into a std::vector, predict the opponent’s next move with the network using the current input, then train the network every 3 steps.
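Sketching that loop under the same assumptions as above – fit() and predict() are real tiny-dnn calls, but the optimizer choice and the stepRnn structure are mine:

```cpp
#include <algorithm>  // std::max_element

adagrad opt;  // optimizer choice is an assumption, not the original's
std::vector<vec_t> inputs, targets;

// One game step: buffer the observation, predict, train every seq_len.
Move stepRnn(network<sequential>& nn, Move previous, Move current) {
    inputs.push_back(encodeMove(previous));
    targets.push_back(encodeMove(current));

    // Predict the opponent's next move from the current one.
    vec_t out  = nn.predict(encodeMove(current));
    Move guess = static_cast<Move>(
        std::max_element(out.begin(), out.end()) - out.begin());

    // Train on the buffered pairs every 3 steps (the seq_len window).
    if (inputs.size() == seq_len) {
        nn.fit<mse>(opt, inputs, targets, /*batch_size=*/seq_len, /*epoch=*/1);
        inputs.clear();
        targets.clear();
    }
    return guess;
}
```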

The HTM Agent

The HTM agent is a lot more straightforward. To recap from my previous project: all HTM layers receive a sparse binary tensor, or Sparse Distributed Representation (SDR), as input and generate an SDR representing what the algorithm has learned. TemporalMemory is an HTM algorithm that learns to predict the next input based on input sequences observed in the past; exactly what I need.

So now, let’s create a TemporalMemory object and set up the hyperparameters. 3*ENCODE_WIDTH is the length of the SDR the layer will receive, and TP_DEPTH is, in broad terms, how many different sequences can potentially trigger an output.
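The original snippet isn’t shown here; a minimal sketch, assuming NuPIC.core’s C++ TemporalMemory is the implementation behind this (and with placeholder constant values), could be:

```cpp
#include <nupic/algorithms/TemporalMemory.hpp>
using nupic::algorithms::temporal_memory::TemporalMemory;

constexpr nupic::UInt ENCODE_WIDTH = 24;  // placeholder value
constexpr nupic::UInt TP_DEPTH     = 16;  // placeholder value

// columnDimensions sets the SDR length the layer receives;
// cellsPerColumn (TP_DEPTH) sets how many sequence contexts
// each column can distinguish.
TemporalMemory tm({3 * ENCODE_WIDTH}, TP_DEPTH);
```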

To train and make use of the TemporalMemory layer, simply call the compute() function. It performs learning and prediction automatically.
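In use, one game step then reduces to something like the following; compute() and getPredictiveCells() are the real NuPIC.core calls, while encodeToActiveColumns() and decodePrediction() are hypothetical helpers:

```cpp
#include <vector>

// One game step: feed the active columns of the encoded observation;
// compute() performs learning and prediction in a single call.
Move stepHtm(TemporalMemory& tm, Move current) {
    std::vector<nupic::UInt> active = encodeToActiveColumns(current);
    tm.compute(active.size(), active.data(), /*learn=*/true);

    // The predictive cells encode which input the layer expects next;
    // mapping them back to a Move is left to a hypothetical decoder.
    return decodePrediction(tm.getPredictiveCells());
}
```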

Playing the game

Now I have the two agents ready. It’s time for them to play games! Set the two algorithms to play against each other 200K times. Compile and run… Voilà! Here come the results… There are far fewer draws than wins/losses? I have tried multiple times with different parameters, and it seems to be a consistent trend. Interesting.
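As an aside, “acting accordingly” on a prediction just means playing the counter to the predicted move; a trivial sketch using the Move ordering from earlier:

```cpp
// Play whatever beats the predicted move: paper beats rock,
// scissors beats paper, rock beats scissors.
Move counter(Move predicted) {
    return static_cast<Move>((static_cast<int>(predicted) + 1) % 3);
}
```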

The final results

The code that lets the algorithms play against each other is quite boring, so I didn’t show it. If you are interested, the source code is available here.

Conclusion

I don’t know what conclusion I can draw from this experiment… Theoretically, both algorithms should win 33% of the time; in fact, both the LSTM and the HTM win around 38% of the time. I can’t find an explanation for this. Nevertheless, TemporalMemory is definitely a valid algorithm for learning and predicting sequences.
