Teaching an AI to play Flappy Bird

AI has arrived. It’s driving cars, powering chatbots, beating humans at board games and investing in the stock market. It used to be just for experts and big tech companies. But now regular developers like us can use cloud services from Microsoft and Google or machine intelligence libraries like TensorFlow.

Now that AI is going mainstream, I realised it’s time for me to see what it’s all about. What better way to learn than making an AI to play Flappy Bird?

Here’s a video of my AI in action. It’s kind of magical, isn’t it? (Free game art from Kenney.nl and code from a tutorial by Daniel Shiffman of The Coding Train.)

Now that I’ve learned a bit about AI I’m going to give you a short and simple overview of how this AI works. AI can be pretty intimidating so I’m going to break down the core concepts without using any impenetrable technical jargon.

AI is different

AI solves problems in a fundamentally different way to regular programming. Normally you code a program by writing specific instructions for every situation that your program might encounter.

But with AI, first you create a generic program that can learn. Then you don’t tell it what to do. You train it.

A program that can learn

To make a program that can learn I’m going to use a neural network.

Neural networks are a technique for building computer programs that is loosely based on the way we think the human brain works. You start with a collection of software neurons. Each neuron in a brain has a bunch of inputs and one output. When enough electrical signals arrive at the inputs, the neuron will activate and send a signal out of the output.

In software this is really simple to reproduce.
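As a rough illustration, here is what a software neuron might look like in Python. This is a minimal sketch rather than the code behind the video: the function name `neuron` and the threshold of 1.0 are made up for the example.

```python
def neuron(inputs, threshold=1.0):
    """Activate (return 1) when enough signal arrives at the inputs, else stay dormant (0)."""
    total = sum(inputs)  # add up the signal arriving on every input
    return 1 if total >= threshold else 0


# Two strong signals are enough to activate the neuron, two weak ones are not.
strong = neuron([0.6, 0.5])
weak = neuron([0.2, 0.1])
print(strong, weak)
```

Real networks usually swap the hard on/off threshold for a smoother function, but the idea is the same.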

As in a brain, the intelligence in a neural network comes from how these neurons are connected together. This is the neural network we’ll use for our Flappy Bird AI.

Our neural network will have 4 inputs, one hidden layer and 1 output. The inputs are: how high the plane is, how far away the next obstacle is, and the positions of the top and bottom of the gap in that obstacle. The output is simple, just whether to tap or not.

The inputs I’ve selected are called features. If this were a more complex deep learning network, the AI would pick the features itself. I would send the network the complete picture of the game and let it decide for itself which features are important.

So now we have the most complicated bit of this blog post: how the network makes decisions.

It works by sending signals into the inputs and letting the signals flow over the interconnected neurons to the output. The amount of signal that flows between neurons determines which neurons activate and which ones stay dormant. The network will make drastically different decisions based on the strength of these connections. These connection strengths are referred to as the weights.
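To make that a bit more concrete, here is a small Python sketch of signals flowing through a network with 4 inputs, one hidden layer and 1 output. Several details are my own assumptions for the example and not taken from the game code: the hidden layer has 4 neurons, each neuron squashes its signal with a sigmoid function, the inputs are scaled to between 0 and 1, and the plane taps when the output is above 0.5.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0..1 (a smooth version of activate/dormant)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights):
    """Run one layer: `weights` has one row per neuron and one weight per input."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row))) for row in weights]


def should_tap(features, hidden_weights, output_weights):
    """Send the 4 features through the hidden layer to the single output neuron."""
    hidden = layer(features, hidden_weights)
    output = layer(hidden, [output_weights])[0]
    return output > 0.5  # assumed rule: tap when the output is above 0.5


# Same features, same hidden layer, different output weights: a different decision.
features = [0.5, 0.5, 0.5, 0.5]  # height, distance, gap top, gap bottom (assumed scaling)
hidden_weights = [[1.0] * 4 for _ in range(4)]
print(should_tap(features, hidden_weights, [1.0] * 4))
print(should_tap(features, hidden_weights, [-1.0] * 4))
```

Nothing about the game is coded into these functions. All of the behaviour lives in the numbers stored in `hidden_weights` and `output_weights`.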

Training the network

Ok, so now we have a brain. Next we need to teach it how to play. Teaching the network how to play Flappy Bird is really a case of searching for the best configuration of weights for its neural network. There are a lot of ways that you could configure this neural network. Some of those ways would result in a plane that flies really well and others would result in a plane that crashes immediately.

To find the configuration that flies best, we’re going to use a search and optimisation technique called a genetic algorithm. It’s inspired by the natural processes that drive biological evolution and although it sounds complicated, the principles behind it are actually quite simple.

First we create 250 neural networks all with random configurations. This is called initialisation. Then we try them out. Let’s put each of our 250 randomly configured neural networks in a plane and see how they fly.
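In code, initialisation could look something like the sketch below. The 250 comes from above; the rest is assumed for illustration, including the hidden layer size of 4, the weight range of -1 to 1 and the dictionary layout used to store each network.

```python
import random

POPULATION_SIZE = 250
N_INPUTS = 4   # height, distance to obstacle, gap top, gap bottom
N_HIDDEN = 4   # assumed size of the hidden layer


def random_network(rng):
    """Build one network whose weights are all random numbers between -1 and 1."""
    return {
        "hidden": [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)],
        "output": [rng.uniform(-1, 1) for _ in range(N_HIDDEN)],
    }


rng = random.Random(42)  # fixed seed so the example is repeatable
population = [random_network(rng) for _ in range(POPULATION_SIZE)]
print(len(population))
```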

Ok, that didn’t go very well. But if you watch closely, you’ll see that three of them actually flew past a few obstacles before they crashed.

So now we apply some natural selection. It’s survival of the fittest, so we measure which of the planes did best and use them to create the next generation of 250 new neural networks. When we create the 250 new neural networks we’ll also add a little bit of randomness. This is called mutation and it helps us search for even better configurations. Ok, round 2.
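Here is one way the selection and mutation step could be sketched in Python. It is a simplified stand-in, not necessarily how the tutorial code does it: each network is flattened into a plain list of weights, the parents are simply the top 10 scorers, and the mutation rate and size are invented values.

```python
import random


def mutate(weights, rng, rate=0.1, scale=0.5):
    """Copy the weights, nudging each one by a random amount with probability `rate`."""
    return [w + rng.gauss(0, scale) if rng.random() < rate else w for w in weights]


def next_generation(population, scores, rng, size=250, keep=10, rate=0.1):
    """Pick the `keep` best-scoring networks and breed `size` mutated copies of them."""
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    parents = [weights for _, weights in ranked[:keep]]
    return [mutate(rng.choice(parents), rng, rate=rate) for _ in range(size)]


# Toy example: 20 tiny "networks" scored by how far they flew.
rng = random.Random(0)
population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
scores = [rng.randint(0, 100) for _ in population]
children = next_generation(population, scores, rng)
print(len(children))
```

Most children end up as near copies of a good parent, and the occasional mutated weight is what lets a later generation fly further than anything in the current one.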

We’re getting a bit better. So we repeat the selection and mutation for round 3.

Even better. So let’s keep doing this over and over.

After 7 rounds, we finally find this neural network configuration.

Success!

Now I’ll summarise what we’ve done using as much technobabble as possible. Because let’s face it, AI jargon just sounds awesome.

We created an artificial neural network and used a genetic algorithm to optimise the neuron connection weights by applying an iterative selection and mutation process.

To play Flappy Bird.

You’re welcome 🙂

If you want to dive into the weird and wonderful world of AI yourself, here are some great resources I’ve found:

TensorFlow Playground