Heh:
I've just finished writing my dissertation based on using a Genetic Algorithm to evolve neural networks to control a biorobotic cricket in order to study the evolution of cricket calling songs.
Here is an excerpt from the NN section (note this is an older copy so there are probably mistakes; don't worry, the copy I'm handing in is much improved):
-------------------------------------------------
2.4.2a Artificial Neural Network Morphology and Training
Artificial neural networks (ANNs) are composed of nodes, or units, interconnected by links which have associated weights. The weights are the primary means of memory in a neural network, and it is the function of learning to alter these weights so that the network as a whole learns a particular function. Each unit has an activation level (threshold), analogous to the biological action potential threshold, which determines when the unit can 'fire': if the computation over the inputs is great enough, the unit outputs to its connected units. Each unit performs a local non-linear computation (known as the transfer function, typically a non-linear function such as a sigmoid or Gaussian) on the linearly weighted sum of its inputs, without any global control processing; the network as a whole therefore performs a distributed, non-linear computation.

ANN topology typically consists of three or more layers: an input layer, which acts like the biological sensory input system and performs no computation (it has no transfer function); one or more hidden layers, which can be considered to perform the main computation; and an output layer, which provides the output of the network's function and typically consists of fewer units than the hidden layers. Typically, data flows from one layer to the next and never backwards, i.e. the links are unidirectional only; such a network is known as a feed-forward network. By contrast, recurrent networks have bidirectional links and data processing can be cyclic; this allows for more natural processing and provides the network with more power. Recurrent networks have a notion of short-term memory, whereas feed-forward networks have no internal state. Examples of recurrent network models include Hopfield networks and Boltzmann machines [55 and 56 respectively]. Although recurrent networks may be very useful in modelling some systems, many parts of the human brain are known to resemble feed-forward networks and are somewhat layered, and such networks can implement adaptive versions of simple reflex agents or components of more complex systems. Furthermore, recurrent networks are poorly understood and are much harder both to train and to interpret while they are working, which is undesirable if we are aiming to understand the processes of the real biological system.
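To make the forward pass concrete, here is a minimal Python sketch of the kind of feed-forward network just described, with one hidden layer and a sigmoid transfer function. The 2-3-1 topology, the weight ranges and the helper names are illustrative only, not the networks used in this project:

    import math
    import random

    def sigmoid(x):
        """Non-linear transfer function applied at each hidden and output unit."""
        return 1.0 / (1.0 + math.exp(-x))

    def forward(inputs, w_hidden, w_output):
        """One feed-forward pass: input layer -> hidden layer -> output layer.
        w_hidden[j][i] is the weight on the link from input i to hidden unit j;
        w_output[k][j] is the weight from hidden unit j to output unit k.
        The input layer performs no computation; every other unit applies the
        transfer function to the linearly weighted sum of its inputs."""
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                  for row in w_hidden]
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                for row in w_output]

    # Illustrative 2-3-1 topology with random initial weights.
    random.seed(0)
    w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
    w_output = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(1)]
    print(forward([0.5, -0.2], w_hidden, w_output))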
ANNs are often trained with an iterative method called back-propagation, a form of gradient descent (named after the mathematical procedure used to adjust the weights). Training is performed over a set of labelled training examples where both the inputs and outputs are known. For each example the input is processed by the current network and the output is compared to the known output; if an error is present, the weights are adjusted to reduce it. The difficulty is in knowing which weights are responsible for the error; each connection is assumed responsible in proportion to its strength. The error correction is propagated back through the network from the output units through the hidden layers, adjusting connection weights as necessary. Back-propagation tries to minimise the mean-square error by iteratively searching for the optimal set of weights. If the different weight combinations form a weight space, then an error surface describes the error for each combination. The error surface will contain maxima and minima; the training algorithm must locate an error minimum by gradient descent, the gradient of the surface indicating the direction in which each weight should be altered. This method is the most commonly used and is well understood, but it has limitations: it is computationally expensive, and the error surface can be extremely large, complex, poorly understood, noisy, non-differentiable (preventing gradient descent) and riddled with local minima.

Alternative methods have therefore been developed to cope better with such a complex search space. The most interesting, and also a highly successful, alternative is the use of GAs to search for and evolve suitable weights. This has the advantage of being an adaptive, global approach to training: it is a stochastic search method better able to explore large, complex and poorly understood search spaces, and it copes better with noise. GA training methods are less likely to become trapped in local minima than gradient descent algorithms, and are more likely to find the global optimum (or at least regions of the search space that are globally better). Consequently, much work has been done using GAs and other evolutionary algorithms for neural network training. Furthermore, using a GA as the training method also allows the potential study of evolutionary effects, such as adaptation, convergence, co-evolution etc., especially when the algorithm is also designing the network topology.
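As an illustration of the GA alternative, the sketch below evolves a flat vector of connection weights against the mean-square error over the training examples. It assumes a hypothetical forward(weights, inputs) function that decodes the weight vector into a network and runs it, and the operator choices (truncation selection, uniform crossover, Gaussian mutation) and all parameter values are illustrative, not the operators used in this project:

    import random

    def mse(weights, examples, forward):
        """Mean-square error of the network encoded by a flat weight vector,
        measured over labelled (inputs, target) training examples."""
        total = 0.0
        for inputs, target in examples:
            output = forward(weights, inputs)
            total += sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
        return total / len(examples)

    def evolve_weights(examples, forward, n_weights, pop_size=50,
                       generations=200, p_mutate=0.1, sigma=0.5):
        """Evolve network weights with a simple GA: truncation selection,
        uniform crossover and Gaussian mutation; fitness is low error."""
        pop = [[random.uniform(-1, 1) for _ in range(n_weights)]
               for _ in range(pop_size)]
        for _ in range(generations):
            ranked = sorted(pop, key=lambda w: mse(w, examples, forward))
            parents = ranked[:pop_size // 5]        # keep the best fifth
            pop = list(parents)
            while len(pop) < pop_size:
                a, b = random.sample(parents, 2)
                child = [x if random.random() < 0.5 else y
                         for x, y in zip(a, b)]     # uniform crossover
                pop.append([w + random.gauss(0, sigma)
                            if random.random() < p_mutate else w
                            for w in child])        # Gaussian mutation
        return min(pop, key=lambda w: mse(w, examples, forward))

Because the GA only ever evaluates whole weight vectors, it needs no gradient at all, which is what lets it cope with the noisy, non-differentiable error surfaces described above.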
2.4.2b Biologically Inspired Artificial Neural Networks
What has been described above are the basics of a particular kind of multi-layer, feed-forward neural network model. Other models exist that display different characteristics, which can make them more suitable for certain tasks. Within this project we are mainly concerned with a spiking neural network model; specifically, we are interested in a 'leaky integrate-and-fire' neural model [28], which is the basis of the previous research into cricket phonotaxis neural modelling in the IPAB department. This model is closer to how natural neural networks operate, can show more complex network behaviour, and has more realistic biological characteristics. Spiking neural networks are better suited to biological modelling because they have a more accurate neurological basis for both neuronal excitation potentials and neuron 'firing', and are thus more useful when trying to understand the biological processes that control natural behaviours. Additionally, it has been proposed that the temporal properties of this neural model make it especially good at learning and processing complex temporal patterns, such as male cricket songs [18].
In this model a neuron can be in one of two states, 'fired' or 'not fired', with all neurons initially set to 'not fired' (with this biological model it makes more sense to speak of 'neurons' rather than processing 'units'). All incoming voltages are added to the membrane potential (MP); if the MP is above a 'resting' value but below a certain threshold, a constant 'leak' value is subtracted from the MP each cycle. If the MP exceeds the threshold, the neuron attains the state 'fired': there is a rapid increase in MP and potential is sent down the synaptic connections, passing potential from neuron to neuron. Hence, with only minor excitation the neuron is unlikely to fire and its MP, and so its potential to fire, quickly dwindles back to the resting state; only enough excitation within a certain time period will induce the neuron to fire. After a neuron has fired, its MP rapidly decreases by a specific constant each cycle; when this drops below a 'recovery' level the neuron returns to the 'not fired' state. This prevents a neuron from firing constantly even when its excitation is constant (e.g. from an input), which differs from the basic ANN described above.

Excitatory and inhibitory connections can both exist and are denoted by the sign of the connection weight. Furthermore, the model also exhibits synaptic depression: the weight at each synaptic connection is halved on each cycle in which potential passes through it, until it reaches a set minimum; the connection weight then undergoes a process of gradual recovery, slowly returning to normal. This similarly stops a neuron from receiving constant excitation, or inhibition, since its active connections rapidly reduce in weight. As a whole, the network does not perform continual processing at each neuron, since extended excitation causes the neurons to stop firing; in effect it shows neural habituation, i.e. a neuron stops responding to continual stimulation. As mentioned above, this gives the network certain temporal properties arising from the inherent timing of neuron firing rates, and it is these temporal properties that can successfully exploit the temporal patterns in the male cricket song. Overall, this model is a simplification of the model described in [19]; see [5] for a description of the simplified model.
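A minimal Python sketch of one such neuron and one depressing synapse follows, assuming an excitatory (positive) base weight for clarity; all constants (threshold, leak, post-firing decay, recovery levels and rates) are illustrative placeholders, not the values of the model in [19] or [5]:

    class LIFNeuron:
        """Leaky integrate-and-fire neuron following the description above;
        all constants are illustrative."""
        def __init__(self, threshold=1.0, resting=0.0, leak=0.05,
                     decay=0.3, recovery=0.2):
            self.threshold = threshold   # MP at which the neuron fires
            self.resting = resting       # baseline membrane potential
            self.leak = leak             # subtracted each sub-threshold cycle
            self.decay = decay           # MP drop per cycle after firing
            self.recovery = recovery     # MP below which 'fired' is cleared
            self.mp = resting
            self.fired = False

        def step(self, incoming):
            """Advance one cycle given the summed incoming weighted potential;
            returns True on the cycle the neuron fires."""
            if self.fired:
                self.mp -= self.decay          # rapid post-firing decrease
                if self.mp < self.recovery:    # return to 'not fired' state
                    self.fired = False
                    self.mp = self.resting
                return False
            self.mp += incoming                # add all incoming voltages
            if self.mp >= self.threshold:
                self.fired = True              # spike: pass potential onwards
                return True
            if self.mp > self.resting:
                # sub-threshold excitation leaks away back towards rest
                self.mp = max(self.resting, self.mp - self.leak)
            return False

    class DepressingSynapse:
        """Synaptic depression: the weight halves on each cycle it carries
        potential, down to a set minimum, then gradually recovers."""
        def __init__(self, weight, minimum=0.05, recovery_step=0.02):
            self.base, self.weight = weight, weight
            self.minimum, self.recovery_step = minimum, recovery_step

        def transmit(self, spike):
            """Return the potential passed this cycle and update the weight."""
            out = self.weight if spike else 0.0
            if spike:
                self.weight = max(self.minimum, self.weight / 2.0)   # depress
            else:
                self.weight = min(self.base, self.weight + self.recovery_step)
            return out

Driving such a neuron with a constant input shows the habituation described above: the synapse weight collapses towards its minimum and the neuron's firing rate falls away, whereas a pulsed input timed to the recovery period keeps it firing.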