Saturday, December 27, 2014

Dimensions

Yesterday I had the idea of making a generalized p-dimensional m,n,k-game (tic-tac-toe). The first problem is handling the (spatial) dimensions. How can I check the neighbours of a point in space? How many directions should I check?

Let x ∈ ℝ^p, x = (x_1, x_2, ..., x_p). Then a neighbouring point in direction d ∈ ℝ^p, where d_i ∈ {-1, 0, 1} and d ≠ 0, is simply x + d. Here the maximum norm of d is 1. Then, for any given point, the total number of directions you should check is
(3^p - 1)/2
(we can consider the direction d = e_i a duplicate of d = -e_i, where e_i is the i-th unit vector - e.g. right and left are the same direction - and the same goes for every pair d and -d; also, we don't count the zero vector).

Example: 2D table: horizontal, vertical, right-diagonal and left-diagonal ((3^2 - 1)/2 = 4).
Same for 3D, (3^3 - 1)/2 = 13 (a sketch enumerating them follows this list):
  • (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1) => 7 (forward, up, right and their combinations)
  • (0,-1,1), (1,0,-1), (-1,1,0), (1,-1,1), (1,-1,-1), (1,1,-1) => 6 (negating one index, and combinations)
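Here is a minimal C# sketch for enumerating these directions (my own helper, not part of the game code; it needs System and System.Collections.Generic). It keeps exactly one representative of each pair d and -d, namely the one whose first non-zero component is +1, which yields exactly (3^p - 1)/2 vectors:

//Enumerates one representative of each direction pair (d, -d) in {-1,0,1}^p.
static IEnumerable<int[]> Directions(int p)
{
    int total = (int)Math.Pow(3, p);
    for (int code = 0; code < total; code++)
    {
        var d = new int[p];
        int rest = code, firstNonZero = 0;
        for (int i = 0; i < p; i++)
        {
            d[i] = rest % 3 - 1; //base-3 digits 0, 1, 2 map to -1, 0, 1
            rest /= 3;
            if (firstNonZero == 0) firstNonZero = d[i];
        }
        if (firstNonZero == 1) yield return d; //skips the zero vector and one of each +-pair
    }
}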
So, I know how many directions I have for a game. But how do I store the p-dimensional table? C# has multidimensional arrays, but I don't know the number of dimensions in advance. So I decided to flatten it into a single array.

In 2D, you access an element of a 2D array as arr[x, y], or with a single-dimensional array as arr[x + y*sizex]. 3D case: arr[x + y*sizex + z*sizex*sizey]. Following this pattern, I can calculate the linear offset of a p-dim array with size vector S ∈ ℝ^p at location x in the form
x^T s, where s^T = (1, S_1, S_1*S_2, S_1*S_2*S_3, ..., S_1*S_2*...*S_(p-1))
It can simply be rolled into a loop:

LinearOffset(x, dims)
    s := 1
    r := 0
    for i from 0 to count(dims) - 1
        r := r + s * x_i
        s := s * dims_i
    end for
    return r
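
In C#, a minimal version might look like this (my own sketch; it assumes 0 <= x[i] < dims[i] and does no bounds checking):

static int LinearOffset(int[] x, int[] dims)
{
    int stride = 1, offset = 0;
    for (int i = 0; i < dims.Length; i++)
    {
        offset += stride * x[i]; //x[i] steps by the stride of dimension i
        stride *= dims[i];       //each dimension multiplies the stride by its size
    }
    return offset;
}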

It's all good: now I can create a p-dim array, access the neighbours, and write the tic-tac-toe. But how to visualize it? 2D is easy. 3D is easy too. I saw a video on YouTube with a 4D hypercube represented as 8 3D cubes. So we can split up dimensions. A 3×3×3×3 4D tic-tac-toe is simplified to three 3^3 3D tic-tac-toes. The last array (dimension) index is the index of the 3D tic-tac-toe table. A 3^5 tic-tac-toe is simplified to 3 4D, then to 9 3D tic-tac-toes, and so on.
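
As a sketch of that last-index slicing (my own illustration, not from the game code): with the linear layout above, each 3D sub-board occupies a contiguous block of the flat array.

//Splitting a flat 3^4 board into three 3^3 slices: the last coordinate
//selects the sub-board, whose 27 cells are contiguous in the flat array.
int sliceSize = 3 * 3 * 3;
for (int w = 0; w < 3; w++)
{
    int sliceStart = w * sliceSize; //cells [sliceStart, sliceStart + sliceSize)
    //draw this block as an ordinary 3D tic-tac-toe table
}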

Tuesday, December 16, 2014

Trainers

Well, if you have implemented a single-layer perceptron, then you can make your AI more robust using multiple layers.

The first trainer on my list is the backpropagation method. I suggest reading it up on Wikipedia; it is pretty easy to follow, even the derivation part. I implemented my classes following this link. I won't post the full code here (I can send it if somebody needs it), but here are some of my implementation details:
  • First of all, you need a (feed-forward) neural network class, in which every neuron is connected to every neuron in the layer above. Do not implement the training algorithms here.
  • The network is made of n layers, with n_i neurons in layer i:
/// <summary>
/// neuron[l][i] is the i-th neuron in layer l
/// </summary>
public readonly double[][] NeuronLayers;

/// <summary>
/// weights[l][in][out] is the weight between neuron[l][in] and neuron[l + 1][out]
/// </summary>
public readonly double[][,] WeightsLayers;

  • This network class can also have an Export() and an Import() function, so you can save the network to a file. My format defines the layers by the number of neurons in them, then lists the weights per layer. For example, a 2-layer perceptron (2 inputs with 1 bias, and 2 output neurons) can look like:
[3, 2]{0.2, 0.1}{0.4, 0.5}{0.2, 0.5}
[#neurons in layers, ...]{weights between neuron[0][0] and neurons[1]}{neuron[0][1] to 2nd layer}{neuron[0][2] to 2nd}
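A minimal Export() sketch (my own, assuming the fields above plus System.Linq and System.Text) that writes this format:

public string Export()
{
    var sb = new StringBuilder();
    sb.Append('[').Append(string.Join(", ", NeuronLayers.Select(l => l.Length))).Append(']');
    for (int l = 0; l < WeightsLayers.Length; l++)
        for (int i = 0; i < WeightsLayers[l].GetLength(0); i++) //neuron i in layer l
        {
            sb.Append('{');
            for (int o = 0; o < WeightsLayers[l].GetLength(1); o++) //its outgoing weights
                sb.Append(o == 0 ? "" : ", ").Append(WeightsLayers[l][i, o]);
            sb.Append('}');
        }
    return sb.ToString();
}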
  • You should implement the trainers in separate classes: some trainers need additional information or need to keep track of previous changes, etc. This way each trainer class has only what it really needs. You can then derive training classes and reuse code for more advanced algorithms.
  • In the (base) trainer class I have one abstract method, and the NN that I want to train:
public abstract class NNTrainer
{
    ...
    public abstract void StartTraining();
    protected NeuralNetwork NeuralNetwork;
    protected NNTrainer(NeuralNetwork nn) { NeuralNetwork = nn; }
}

In the derived constructors, I can build the internal state of the trainer.
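For example, a derived trainer might be sketched like this (my own skeleton, assuming the classes above; the training loop itself is elided):

//A batch backprop trainer with a momentum term keeps the previous weight
//changes as internal state, built once in the constructor.
public sealed class MomentumBackpropTrainer : NNTrainer
{
    private readonly double[][,] previousDeltas; //same shape as WeightsLayers

    public MomentumBackpropTrainer(NeuralNetwork nn) : base(nn)
    {
        previousDeltas = new double[nn.WeightsLayers.Length][,];
        for (int l = 0; l < nn.WeightsLayers.Length; l++)
            previousDeltas[l] = new double[nn.WeightsLayers[l].GetLength(0),
                                           nn.WeightsLayers[l].GetLength(1)];
    }

    public override void StartTraining()
    {
        //run backprop epochs here, adding Momentum * previousDeltas[l][i, o]
        //to each weight change, then storing the new changes back
    }
}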

  • Currently I have 3 different trainers: trainrp (resilient backprop), online backprop and batch backprop with momentum terms. Sometimes your trainer just won't train well. It is possible that the error of the network converges to a local minimum, for example to 0.35 (with my implementation, 1 run out of 100 converges to this value). Then we should generate new random weights and start over. It does not necessarily mean that the trainer is buggy, or that it won't converge at all.
  • Training can take a long time. In this case, it is recommended to occasionally save the network and the training data during the long run.

The trainers I mentioned above are supervised learning algorithms. This means they need example pairs, and they calculate a cost as a function of the network's output and the example pairs. This can be the mean squared error between the target output in the examples and the actual output. But there are other learning paradigms:
  • In unsupervised learning there are only the input data, the actual output, and the cost function, but no supervisor (no trainer to check your answers by comparing output to target, as in supervised learning). For example, we can classify objects according to their properties. We can discover patterns in data using this method (clustering). (If we see an apple and a banana, we can see they're different, without anyone (a trainer) telling us.)
  • Reinforcement learning is similar to unsupervised learning. The network trains itself without a supervisor, but it does this continuously, according to the rewards it gets for its actions. It learns by studying its environment or its score in a game... the important part is the interaction with the environment. (A robot is hungry; he eats a basketball, he is still hungry... he eats a pancake, he is not hungry... the next time he is hungry, he will know he has to eat a pancake - that is reinforcement learning.)
The ways these techniques can be combined with each other are endless. It's time to play a Tic-Tac-Toe against your AI.

Thursday, December 11, 2014

AI - Some real code.

My first AI program is based on artificial neural networks. So, how do you train your network? In the previous post I mentioned a function that is our brain. The simplest form is a linear function, a perceptron. In short:
  • There are 2 layers of neurons.
  • Every neuron in the first layer is connected to every neuron in the second layer.
  • Each connection has a weight.
  • The neurons in the first layer can take any values. They are the input neurons.
  • The second (output) layer is computed by taking the weighted sum of the input neurons for each output neuron. The outputs should be true or false (0 or 1), so we apply a so-called activation function to them. For now, it is a step function.
You can see the full story behind this on this website.

//Weight[j, i] is the weight from Input[j] to Output[i] neuron.
Outputs[i] = Step(Sum(Input[j] * Weight[j, i] for j from 0 to Input.Length - 1));
public static double Step(double d)
{
    return d < 0.0D ? 0.0D : 1.0D;
}
The goal is to teach the network to map an input vector to an output vector. We can teach these networks by examples: input-to-output pairs. Then we have to figure out the weights which generate the desired output (also called the target) from its paired input in each example.

The formula to update a weight is:
Weight[i, j] += LearningRate * (Target[j] - Output[j]) * Input[i];
LearningRate is used for fine-tuning the weights, because of the characteristics of the gradient descent method (next part).

Also, there is a hidden bias neuron in the input layer, with a constant value of 1. From the link:
The bias can be thought of as the propensity (a tendency towards a particular way of behaving) of the perceptron to fire irrespective of its inputs. 
This cannot be modified by the user, but its weight is trained just like that of any other input neuron.

So now the coding part...

Inputs and Outputs are arrays of doubles. The weight layer is a matrix, with Inputs.Length in the first and Outputs.Length in the second dimension. In C# it looks like:

double[] Inputs = new double[NumberOfInputs + 1]; //+1 for the bias
Inputs[NumberOfInputs] = 1.0; //The bias is constant 1.
double[] Outputs = new double[NumberOfOutputs]; 
double[,] Weights = new double[Inputs.Length, Outputs.Length];
//Weights[i, j] is the weight between Input[i] and Output[j]

Keep in mind that when you fill your inputs, you should only set Inputs.Length - 1 values (and leave the bias alone). So, to calculate the output in C#:
public override double[] CalculateOutput(params double[] inputs)
{
    if (inputs == null) return Outputs; //Outputs is the last neuron layer, it is an array of doubles
    FillInputs(inputs); //it does range checks and fills a number of input neurons with the values given in the parameters

    Parallel.For(0, Outputs.Length, i => //the neuron loop can go in parallel in a layer. In case of a multilayer perceptron, the values are propagated from the first to the last layer
    {
        Outputs[i] = 0;
        for (int j = 0; j < Inputs.Length; j++)
            Outputs[i] += Inputs[j] * SingleLayerWeights[j, i];
        Outputs[i] = Step(Outputs[i]); //apply step function to clamp the value to 0 or 1.
    });
    return Outputs;
}
And how to train it? Using backpropagation, which in the case of a single-layer system reduces to the delta rule. I'll only share the code; you can also find the formula in the link above. We pass the inputs and the target outputs, and let the code modify the weights to get the outputs right. This way, if we present new input to the system, it can classify it based on the rules learned from the previous examples.
public override void Teach_Backpropagation(double[] inputs, double[] targets)
{
    if (inputs == null || targets == null) return;
    FillInputs(inputs);

    var maxLoop = Math.Min(targets.Length, Outputs.Length); //bound checking
    Parallel.For(0, maxLoop, i =>
    {
        Outputs[i] = 0;
        for (int j = 0; j < Inputs.Length; j++)
        {
            Outputs[i] += Inputs[j] * SingleLayerWeights[j, i];
        }
        Outputs[i] = Step(Outputs[i]); //the block above is just the CalculateOutput function, but I can calculate it during the learning algorithm, and save another loop
        //This is the real part: it trains the weight to reproduce the target values.
        var diff = targets[i] - Outputs[i];
        for (int j = 0; j < Inputs.Length; j++)
            SingleLayerWeights[j, i] += LearningRate * diff * Inputs[j];
    });
}
These networks are used for linearly separable problems. This means: if we represent our input vectors as dots in space (in the case of 2 inputs, in a 2D Cartesian coordinate system), and make a dot black if the output for it is 1 (true) and white otherwise, then whenever you can draw a straight line separating the black and white dots, the perceptron can also do it. Examples are the logical AND and OR operators:

2 input neurons, 1 bias and 1 output neuron.

(The pictures are taken from the second link)
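
As a usage sketch (my own; "perceptron" stands for an instance of a class exposing the two methods above), training the logical AND:

var inputs  = new[] { new double[] { 0, 0 }, new double[] { 0, 1 },
                      new double[] { 1, 0 }, new double[] { 1, 1 } };
var targets = new[] { new double[] { 0 }, new double[] { 0 },
                      new double[] { 0 }, new double[] { 1 } };

for (int epoch = 0; epoch < 100; epoch++) //repeat the examples until the line settles
    for (int k = 0; k < inputs.Length; k++)
        perceptron.Teach_Backpropagation(inputs[k], targets[k]);

//After training, perceptron.CalculateOutput(1, 1) should return { 1 }.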

Another example is to check whether a number is negative. Feed the network negative numbers with false (0) targets, then present some positive numbers and make the targets true.
If you then type another number, it will tell you whether it is positive or negative. It will only be precise if you take example numbers close to 0 from each side, because the system will draw the line according to the example values, and you should teach the system the precise bounds of the true and false parts of the space. This can be scaled to higher dimensions too, with 3 or more inputs and outputs.

AI - How?

My current area of interest is Artificial Intelligence. Since I started programming I've been thinking about it: how can an algorithm evolve? The code writes/modifies itself... I just could not imagine how something artificial, like a program or a machine, can learn concepts. Maybe with
if {...} else {...}
statements?
I made my first ever "Hello World!" in C++, and from then on I coded all kinds of stuff. Now I use C#, C++, C and ASM to write programs. But with none of them did I manage to write even a simple AI. I didn't read papers about this topic, because at that time the math and theory behind this area were not comprehensible to me (although it's still hard to understand). Now that I have more patience (or at least enough), I read articles and blogs about it. Many of them include lots of formulas and math, so it's pretty necessary to have some clue about the following:
  • mathematical analysis
  • probability theory
  • some linear algebra
  • ...and maybe more I have not encountered yet.
So why is it a must? Because an AI - at least an optimal and scalable one - is not made of millions of branches (if-s). It is more abstract than that. Just imagine that you are a function F(x, y, ..., z). There are inputs, and of course outputs from this function. What are the inputs? A pain in your back, a sound, or a picture. And you also have reactions to these inputs: you go to the doctor, start to whistle, or simply realize what's in that picture. They are like outputs. But what's between them? A system of neurons, sending signals (or not) to other neurons. I'm not saying that they alone are responsible for our actions and feelings. But these neurons behave like a function. It is a big method that we shall construct: a system of simple connections. And it will be the brain of our AI.

Initial Commit

This is gonna be my first ever blog, not counting the previous one with 0 posts.

I've been programming for a while now (since 2010). I decided to create a blog, because it looks fun, and I have plenty of time...and it is a good way to track my progress.

I hope this will help someone in the future.