The first trainer on my list is the backpropagation method. I suggest reading up on it on the wiki; it is pretty easy to follow, even the derivation part, and I implemented my classes following this link. I won't post the full code here (I can send it if somebody needs it), but here are some of my implementation details:
- First of all, you need a (feed-forward) neural network class in which every neuron is connected to every neuron in the layer above. Do not implement the training algorithms here.
- The network is made of n layers, with n_i neurons in layer i:
    /// <summary>
    /// neuron[l][i] is the i-th neuron in layer l
    /// </summary>
    public readonly double[][] NeuronLayers;

    /// <summary>
    /// weights[l][in, out] is the weight between neuron[l][in] and neuron[l + 1][out]
    /// </summary>
    public readonly double[][,] WeightsLayers;
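To show how the two arrays work together, here is a minimal sketch of a forward pass. It is not the actual network class: the `FeedForward` helper and the sigmoid activation are assumptions made for illustration, and the bias is treated as an ordinary input neuron whose value the caller fixes at 1.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class FeedForward
{
    // Assumed activation function; the post does not say which one is used.
    static double Sigmoid(double x) { return 1.0 / (1.0 + Math.Exp(-x)); }

    // Fills neurons[1], neurons[2], ... from the values already stored in neurons[0].
    // weights[l][i, o] links neurons[l][i] to neurons[l + 1][o].
    public static void Propagate(double[][] neurons, double[][,] weights)
    {
        for (int l = 0; l < weights.Length; l++)
        {
            for (int o = 0; o < weights[l].GetLength(1); o++)
            {
                // Weighted sum over every neuron of the layer below.
                double sum = 0.0;
                for (int i = 0; i < neurons[l].Length; i++)
                    sum += neurons[l][i] * weights[l][i, o];
                neurons[l + 1][o] = Sigmoid(sum);
            }
        }
    }
}
```

Keeping this in the network class and out of the trainers means every trainer can reuse the same forward pass.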
- This network class can also have an Export() and an Import() function, so that you can save the network to a file. My format lists the number of neurons in each layer, then the weights layer by layer. For example, a 2-layer perceptron (2 inputs plus 1 bias, and 2 output neurons) can look like this:
[3, 2]{0.2, 0.1}{0.4, 0.5}{0.2, 0.5}
[#neurons in layers, ...]{weights between neuron[0][0] and neurons[1]}{neuron[0][1] to 2nd layer}{neuron[0][2] to 2nd}
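A minimal sketch of reading and writing this format could look like the code below. The `NetworkFormat` class and its method signatures are hypothetical, and it assumes the text contains no whitespace between the `{...}` groups and that, for networks with more layers, the groups simply continue layer after layer.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class NetworkFormat
{
    // weights[l][i, o] is the weight from neuron i in layer l to neuron o in layer l + 1.
    public static string Export(int[] sizes, double[][,] weights)
    {
        var sb = new StringBuilder();
        sb.Append('[').Append(string.Join(", ", sizes)).Append(']');
        for (int l = 0; l < weights.Length; l++)
        {
            for (int i = 0; i < sizes[l]; i++)
            {
                // One "{...}" group per neuron: its weights to the next layer.
                var row = new string[sizes[l + 1]];
                for (int o = 0; o < row.Length; o++)
                    row[o] = weights[l][i, o].ToString("R", CultureInfo.InvariantCulture);
                sb.Append('{').Append(string.Join(", ", row)).Append('}');
            }
        }
        return sb.ToString();
    }

    public static double[][,] Import(string text, out int[] sizes)
    {
        // Layer sizes sit between the leading '[' and the first ']'.
        int close = text.IndexOf(']');
        sizes = text.Substring(1, close - 1)
                    .Split(',')
                    .Select(s => int.Parse(s.Trim(), CultureInfo.InvariantCulture))
                    .ToArray();

        string[] groups = text.Substring(close + 1)
                              .Split(new[] { '{', '}' }, StringSplitOptions.RemoveEmptyEntries);

        var weights = new double[sizes.Length - 1][,];
        int g = 0; // index of the next unread group
        for (int l = 0; l < weights.Length; l++)
        {
            weights[l] = new double[sizes[l], sizes[l + 1]];
            for (int i = 0; i < sizes[l]; i++, g++)
            {
                string[] values = groups[g].Split(',');
                for (int o = 0; o < sizes[l + 1]; o++)
                    weights[l][i, o] = double.Parse(values[o].Trim(), CultureInfo.InvariantCulture);
            }
        }
        return weights;
    }
}
```

The invariant culture is used on purpose, so that a file written on a machine with a decimal comma can still be read on a machine with a decimal point.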
- You should implement the trainers in separate classes: some of the trainers need additional information, need to keep track of the previous changes, etc. This way each trainer class only has what it really needs. Then you can derive training classes and reuse code for more advanced algorithms.
- In the (base) trainer class I have one abstract method, and the NN that I want to train:
    public abstract class NNTrainer
    {
        ...
        public abstract void StartTraining();

        protected NeuralNetwork NeuralNetwork;

        // protected, so that derived trainers can call it
        protected NNTrainer(NeuralNetwork nn)
        {
            NeuralNetwork = nn;
        }
    }
In the derived constructors, I can build the internal state of the trainer.
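As an illustration of "building the internal state" in a derived constructor, here is a sketch of a trainer with a momentum term. Everything except the `NNTrainer` shape is an assumption: the `NeuralNetwork` class is a bare stand-in so that the example compiles on its own, and `MomentumTrainer` receives its gradients from a hypothetical delegate instead of computing them by backpropagation.

```csharp
using System;

// Minimal stand-in; the real network class has more members.
public class NeuralNetwork
{
    public readonly double[][,] WeightsLayers;
    public NeuralNetwork(double[][,] weights) { WeightsLayers = weights; }
}

public abstract class NNTrainer
{
    protected readonly NeuralNetwork NeuralNetwork;
    protected NNTrainer(NeuralNetwork nn) { NeuralNetwork = nn; }
    public abstract void StartTraining();
}

// Hypothetical derived trainer: gradient descent with a momentum term.
public class MomentumTrainer : NNTrainer
{
    private readonly double[][,] previousDeltas;                 // state only this trainer needs
    private readonly Func<NeuralNetwork, double[][,]> gradient;  // assumed to return dE/dw for every weight
    private readonly double learningRate;
    private readonly double momentum;
    private readonly int epochs;

    public MomentumTrainer(NeuralNetwork nn, Func<NeuralNetwork, double[][,]> gradient,
                           double learningRate, double momentum, int epochs) : base(nn)
    {
        this.gradient = gradient;
        this.learningRate = learningRate;
        this.momentum = momentum;
        this.epochs = epochs;

        // Internal state: one zeroed "previous change" per weight.
        previousDeltas = new double[nn.WeightsLayers.Length][,];
        for (int l = 0; l < previousDeltas.Length; l++)
            previousDeltas[l] = new double[nn.WeightsLayers[l].GetLength(0),
                                           nn.WeightsLayers[l].GetLength(1)];
    }

    public override void StartTraining()
    {
        for (int e = 0; e < epochs; e++)
        {
            double[][,] g = gradient(NeuralNetwork);
            for (int l = 0; l < g.Length; l++)
                for (int i = 0; i < g[l].GetLength(0); i++)
                    for (int o = 0; o < g[l].GetLength(1); o++)
                    {
                        // New change = gradient step + a fraction of the previous change.
                        double delta = -learningRate * g[l][i, o] + momentum * previousDeltas[l][i, o];
                        NeuralNetwork.WeightsLayers[l][i, o] += delta;
                        previousDeltas[l][i, o] = delta;
                    }
        }
    }
}
```

A trainer without momentum would simply not have the `previousDeltas` array, which is the point of keeping this state out of the network class.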
- Currently I have 3 different trainers: trainrp (resilient backprop), online backprop and batch backprop with momentum terms. Sometimes your trainer just won't train well. It is possible that the error of the network converges to a local minimum, for example to 0.35 (with my implementation, about 1 run out of 100 converges to this value). Then we should generate new random weights and start over. It does not necessarily mean that the trainer is buggy, or that it will never converge.
- Training can take a long time. In this case, it is a good idea to occasionally save the network and the training data during the long run.
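The last two points (starting over when a run gets stuck, and saving from time to time) can be sketched as one small driver loop. This is only an outline under assumptions: `TrainingDriver` and the three delegates are hypothetical hooks, and "stuck" is simply taken to mean "still above the target error after a fixed number of epochs".

```csharp
using System;

// Hypothetical driver, for illustration only.
public static class TrainingDriver
{
    // randomize:  assigns fresh random weights to the network
    // trainEpoch: runs one epoch and returns the current network error
    // save:       persists the network, for example with Export()
    // Returns the error reached by the last attempt.
    public static double TrainWithRestarts(
        Action randomize, Func<double> trainEpoch, Action save,
        int maxRestarts, int epochsPerAttempt, int saveEvery, double targetError)
    {
        double error = double.MaxValue;
        for (int attempt = 0; attempt <= maxRestarts; attempt++)
        {
            randomize();
            for (int epoch = 1; epoch <= epochsPerAttempt; epoch++)
            {
                error = trainEpoch();
                if (error <= targetError)
                {
                    save();          // keep the trained network
                    return error;
                }
                if (epoch % saveEvery == 0)
                    save();          // periodic checkpoint during a long run
            }
            // Still above the target: possibly a local minimum, so try new weights.
        }
        return error;
    }
}
```

The limit on restarts keeps the loop from running forever if the target error is simply unreachable.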
The trainers I mentioned above are supervised learning algorithms. This means that they need example pairs (an input and its target output), and they calculate a cost from the network's actual output and the target output in the examples. The cost can be, for example, the mean-squared error between the two. But there are other learning paradigms:
- In unsupervised learning there is only the input data, the actual output and the cost function, but no supervisor (no trainer checking the answers, the way output is compared to target in supervised learning). For example, we can classify objects according to their properties, or discover patterns in data (clustering). (If we see an apple and a banana, we can tell that they are different without anyone (a trainer) telling us.)
- Reinforcement learning is similar to unsupervised learning. The network trains itself without a supervisor, but it does this continuously, according to the reward it gets for its actions. It learns by studying its environment or its score in a game; the important part is the interaction with the environment. (A robot is hungry and eats a basketball: it is still hungry. It eats a pancake: it is no longer hungry. The next time it is hungry, it will know that it has to eat a pancake. That is reinforcement learning.)
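The hungry robot can be turned into a deliberately tiny sketch. It is not a full reinforcement learning algorithm, only a running value estimate per action; the `HungryRobot` class, the reward of 1 for "no longer hungry" and the round-robin way of trying foods are all assumptions made for illustration.

```csharp
using System;

// Hypothetical toy example, for illustration only.
public static class HungryRobot
{
    // Assumed environment: reward 1 if the robot is no longer hungry, otherwise 0.
    static double Reward(string food) { return food == "pancake" ? 1.0 : 0.0; }

    // Tries the foods in turn and keeps a running value estimate for each one:
    // value += rate * (reward - value). Returns the food with the highest estimate.
    public static string LearnWhatToEat(string[] foods, int rounds, double rate)
    {
        var value = new double[foods.Length];
        for (int r = 0; r < rounds; r++)
        {
            int a = r % foods.Length;   // simple round-robin exploration
            value[a] += rate * (Reward(foods[a]) - value[a]);
        }

        int best = 0;
        for (int a = 1; a < foods.Length; a++)
            if (value[a] > value[best]) best = a;
        return foods[best];
    }
}
```

Nobody tells the robot which food is correct; it only sees the reward after each action, which is the difference from the supervised trainers above.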
There are endless ways to combine these techniques with each other. It's time to play Tic-Tac-Toe against your AI.