Artificial Neural Networks & Robotics
The simplest kind of animal response to its environment is the spinal reflex arc. Probably the best-known reflex in people is the patellar reflex, or "knee jerk" reaction. In this case, a sensory neuron just below the knee connects directly to a motor neuron controlling the quadriceps, which causes the lower leg to kick outward. The figure below illustrates the situation:
Reading the figure from top to bottom, we see that physical energy stimulates the input neuron which makes a connection with the output neuron. If the input neuron's activity exceeds the output neuron's threshold, the output neuron fires and a motor response is generated.
This simple circuit has nearly all the ingredients we will need to build more complicated artificial neural networks. In mathematical or engineering terms, we represent the activity of the input neuron by a variable x while the activity of the output neuron is symbolized by y. The synaptic strength or weight between the input neuron and output neuron is represented by w. For a given level of activity of the input neuron, the activity of the output neuron is then given by the equation:
y = w · x - b
where b is the output neuron's bias. The final response of the network is then given by:
r = a(y)
where a is called the activation function. The activation function can take almost any form, but the most commonly used are the step function and the sigmoid function. The step function simply holds the final output at 0 until y exceeds a threshold value, at which point the output is set to 1. This is similar to the way the patellar reflex works: if the mallet doesn't hit the base of the knee just right, there is no reflex. But hit the right spot, and the leg kicks forward. The step function looks like this:
This particular step function has a threshold value of 0 at which point the function transitions from a value of 0 to a value of 1.
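To make this concrete, here is a minimal sketch of the single-neuron circuit in code. The class name, the weight and bias values, and the sample inputs below are made up purely for illustration; the point is just to see how y = w · x - b feeds into a step activation:

using System;

class ReflexNeuron
{
    // Step activation: fire (1) only when the net input exceeds the threshold of 0.
    static double Step(double y) => y > 0 ? 1.0 : 0.0;

    static void Main()
    {
        double w = 1.0;   // synaptic weight between the input and output neuron
        double b = 0.5;   // output neuron's bias

        foreach (double x in new[] { 0.2, 0.4, 0.6, 0.8 })   // input neuron activity
        {
            double y = w * x - b;   // net input to the output neuron
            double r = Step(y);     // final response: 0 (no kick) or 1 (kick)
            Console.WriteLine($"x = {x:F1} -> y = {y:F1}, response = {r}");
        }
    }
}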
The sigmoid function is a less drastic version of the threshold function and is also called a squashing function. It looks like the picture to the left:
As the figure illustrates, the sigmoid function is roughly linear in its middle range. This means that changes in the x value lead to roughly proportional changes in the y value in this region. However, large negative or positive values of the input produce asymptotically smaller changes in the output. If the patellar reflex worked this way, there would be a range of impact values that cause a proportionally smaller or larger kick of the leg. But outside of this range, the kick would not get appreciably smaller or larger. This type of activation function is particularly useful in robotics since it puts an automatic upper and lower bound on control signals, such as the voltage being sent to a motor, which we would not want to exceed a certain value in either the positive or negative direction.
The mathematical formula for a sigmoidal function is as follows:
f(x) = 1 / (1 + exp(-x))
where exp() is the exponential function. As you can see by playing with different values of x, large negative values of x result in a value of f(x) near 0 while large positive values of x yield an f(x) close to 1, which is consistent with the graph above.
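If you would like to try this yourself, the short sketch below evaluates the sigmoid at a few sample points; the class name and the particular x values are arbitrary choices for illustration:

using System;

class SigmoidDemo
{
    // The sigmoid squashing function: f(x) = 1 / (1 + exp(-x)).
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    static void Main()
    {
        foreach (double x in new[] { -10.0, -2.0, 0.0, 2.0, 10.0 })
            Console.WriteLine($"f({x,5:F1}) = {Sigmoid(x):F4}");

        // Large negative inputs give values near 0, large positive inputs give
        // values near 1, and f(0) = 0.5 sits in the roughly linear middle range.
    }
}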
By the way, if you happened to be wondering how a neuron's activity level can be negative, well, it can't, at least not in real neurons. However, when we are talking about artificial neurons, we can use any range of values we like. There is one way that a real neuron's activity can be considered negative: most neurons have a base level of activity; in other words, even if they are receiving no input, they will fire at some frequency. If this base-level activity is suppressed by an input, then the lower value could be considered "negative" relative to the baseline. However, our goal is not to model real neurons exactly but to borrow as many concepts from them as we find useful. For this reason, artificial neurons are sometimes referred to simply as units or nodes.
The non-linear property of both the step and sigmoid functions turns out to be of critical significance in artificial neural networks. The reason is that non-linearity enables the network to make "decisions" in a way that is not possible in purely linear networks. This will be fully explained in a later section on categorization.
Go Into The Light! A Four-Neuron Light Following Robot
Suppose we would like our robot to follow a patch of light. You could use such a method to have your robot come to you from across the room by simply shining a flashlight in front of it and guiding it across the floor. By adding just two more neurons to our simple reflex circuit, we can use it to drive our robot. Our new network looks like the figure below:
We now have two input units and two output units. Consequently, we now have four connections: two straight-through connections and two cross connections. This means that the activity of the left motor will depend on the readings from both the left and right light sensors, as will the activity of the right motor.
Let us introduce a new notation for keeping track of inputs, outputs and connections. As you can see from the figure, the input nodes are labeled x1 and x2, while the output nodes are represented by y1 and y2. For a network with N input nodes and M output nodes, we will represent a typical input node by xi where i can range from 1 to N. Similarly, an output node will be represented by yj where j can range from 1 to M. The connection between input unit xi and output unit yj is then written as wji.
From the figure above, we see that the total input into the left motor unit, y1, is given by the sum:
y1 = w11x1 + w12x2 - b1
while the input to the right motor unit is given by:
y2 = w21x1 + w22x2 - b2
where b1 and b2 are the biases on our two output units. We can write both equations together using a more compact matrix notation:
y = w · x - b
This equation states that the vector of values across the output units is given by the matrix product of the connection strengths times the vector of input values minus the vector of bias values. The activation function then generalizes to:
r = a(y)
where the response vector r can now be a function of all the output units. For example, one very common practice is to let a(y) select only the most active output unit in a process called winner take all. This will become important in later chapters when we discuss how neural networks can be used to make choices between alternative actions.
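As a rough sketch of what such a winner-take-all activation might look like in code (the function name, array layout, and example values here are placeholders for illustration, not part of our robot's program):

using System;

class WinnerTakeAllDemo
{
    // Keep only the most active output unit and zero out all the others.
    static double[] WinnerTakeAll(double[] y)
    {
        int winner = 0;
        for (int j = 1; j < y.Length; j++)
            if (y[j] > y[winner]) winner = j;

        double[] r = new double[y.Length];   // initialized to all zeros
        r[winner] = y[winner];
        return r;
    }

    static void Main()
    {
        double[] y = { -50, 250 };   // example output unit activities
        Console.WriteLine(string.Join(", ", WinnerTakeAll(y)));   // prints 0, 250
    }
}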
One nice thing about these equations is that they generalize to any number of input and output units. So our network can have thousands of nodes all cross connected to one another, yet we still just use matrix multiplication to get the output values from the input values. In our current example, the matrix version of the two output equations above is:
[ y1 ]     [ w11  w12 ] [ x1 ]     [ b1 ]
[ y2 ]  =  [ w21  w22 ] [ x2 ]  -  [ b2 ]
We will return to the matrix formulation of our problem in a little bit. But first, let's just play with some of the numbers to get a better feel for our network.
It is easy to see that if the left light sensor is receiving more light than the right sensor, then we should turn towards the left which means we must turn on the right motor more than the left motor. Referring to our network diagram, we can make this happen if the connection weight w12 is a number greater than 0 and the weight w11 is less than 0 so that it suppresses the left motor. Let's set w12 = 1 and w11 = -0.5. Just the opposite argument holds when the light is stronger on the right so we set w21 = 1 and w22 = -0.5. Let's now re-draw our network diagram substituting these values for the connection weights:
To see if this works, suppose the left light sensor is giving a reading of 300 units while the right sensor registers only 100 units, meaning the light is brighter to the robot's left side. Setting both the bias values to 0, the total input to the left output unit y1 is:
y1 = -0.5 x 300 + 1 x 100 = -50
while the net input to the right output unit y2 is:
y2 = 1 x 300 – 0.5 x 100 = 250
This means our left motor will turn backward at a speed of -50 units while the right motor turns forward at a speed of 250 units. Consequently, the robot pivots to the left, turning toward the brighter light as we hoped.
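The same arithmetic can be written as a short program. Below is a sketch of the forward pass through our network using the weights and sensor readings above; the Forward method, the class name, and the array layout are our own illustrative choices, not part of any particular robot library:

using System;

class LightFollowerForwardPass
{
    // Compute y = W * x - b for an M x N weight matrix, N inputs, and M biases.
    static double[] Forward(double[,] W, double[] x, double[] b)
    {
        int M = W.GetLength(0), N = W.GetLength(1);
        double[] y = new double[M];
        for (int j = 0; j < M; j++)
        {
            y[j] = -b[j];
            for (int i = 0; i < N; i++)
                y[j] += W[j, i] * x[i];
        }
        return y;
    }

    static void Main()
    {
        double[,] W = { { -0.5,  1.0 },    // weights into the left motor unit y1
                        {  1.0, -0.5 } };  // weights into the right motor unit y2
        double[] x = { 300, 100 };         // left and right light sensor readings
        double[] b = { 0, 0 };             // both biases set to 0

        double[] y = Forward(W, x, b);
        Console.WriteLine($"y1 = {y[0]}, y2 = {y[1]}");   // prints y1 = -50, y2 = 250
    }
}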
Let's now return to our more general matrix notation which we show again below:
[ y1 ]     [ w11  w12 ] [ x1 ]     [ b1 ]
[ y2 ]  =  [ w21  w22 ] [ x2 ]  -  [ b2 ]
If both light sensors are reading 0 (i.e., the robot is sitting in the dark), then we want both motors to be off. This means that when x1 = x2 = 0, the above equation must give y1 = y2 = 0. The only way this can happen for non-zero connection weights is for both bias values, b1 and b2, to be 0. So our simplified control equation becomes:
[ y1 ]     [ w11  w12 ] [ x1 ]
[ y2 ]  =  [ w21  w22 ] [ x2 ]
And plugging in our values for the connection weights we have:
[ y1 ]     [ -0.5   1.0 ] [ x1 ]
[ y2 ]  =  [  1.0  -0.5 ] [ x2 ]
Now you may be wondering if we could have chosen other connection weights that would also work. And the answer is yes. In this particular scenario, there are an infinite number of ways we can choose the weights and get similar behavior. For example, the following matrix would also work:
[ y1 ]     [ -1.0   2.0 ] [ x1 ]
[ y2 ]  =  [  2.0  -1.0 ] [ x2 ]
In this case, the robot will turn more quickly toward a difference in light values than in the first case. So in the end, the actual choice of coefficients will come down to the nuances of how you want your robot to behave. The real power of artificial neural networks lies in their ability to learn an optimal set of connection weights from experience. We will explore this potential at great length in the section on neural network learning.
The final step in preparing the neural controller for our light following robot is choosing an activation function to map the values of the output units into actual motor control signals. Let's represent the maximum speed of our motors by the letter S and the maximum value the light sensors can take as L. The maximum differential we can expect between the two sensors occurs when one of them registers L and the other reads 0. Plugging these values into our matrix equation for x1 and x2 yields output values of y1 = -0.5L and y2 = L. Assuming we want the maximum output value L to map into the maximum motor speed S, we need to multiply output values by S/L. In essence we are simply scaling the output values from the units of our light sensors to those of our motor controller. So the first part of our activation function is simply:
r(yi) = S/L · yi
In addition, we only want our robot to follow a light that is brighter than its surroundings. Consequently, we need to set a minimum output needed for the robot to react. Let's call this minimum value T for threshold. Anything less than this and we want to set the motor control signal to 0 so the robot does not move. We can accomplish this with the function:
yi = H(max(y1, y2) – T) · yi
where H(x) is the step function we met earlier and evaluates to 1 if x > 0 and 0 if x ≤ 0. Combining this with our scaling function yields our final activation function for our motor signals:
r(yi) = S/L · H(max(y1, y2) – T) · yi
This is actually much simpler than it looks. We simply find the maximum value given by our two output units, and if this value is smaller than our threshold, we set both outputs to 0, otherwise we scale the outputs appropriately and send them on to the motors.
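Before moving on to the robot itself, here is a sketch of this activation function on its own. The Activate helper and the sample output values are just for illustration, while the numbers for S, L, and T match the ones we will use in the robot code below:

using System;

class MotorActivation
{
    // r(yi) = S/L * H(max(y1, y2) - T) * yi
    static double[] Activate(double[] y, double S, double L, double T)
    {
        // H(max(y1, y2) - T): 1 if the strongest output beats the threshold, otherwise 0.
        double h = (Math.Max(y[0], y[1]) - T) > 0 ? 1.0 : 0.0;

        double scale = S / L;   // map light sensor units into motor speed units
        return new[] { scale * h * y[0], scale * h * y[1] };
    }

    static void Main()
    {
        double S = 50, L = 1024, T = 300;   // max motor speed, max light reading, threshold

        double[] moving  = Activate(new[] { -200.0, 550.0 }, S, L, T);   // 550 > T: scale and pass through
        double[] stopped = Activate(new[] {   10.0,  20.0 }, S, L, T);   // 20 <= T: both motors stay at 0

        Console.WriteLine(string.Join(", ", moving));    // roughly -9.8 and 26.9
        Console.WriteLine(string.Join(", ", stopped));   // 0 and 0
    }
}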
So much for all the theory. How does our neural controller stack up against the real world?
Testing the Robot
Everything is now in place to test our light following neural network on our robot. As shown in the image below, our left and right light sensors are mounted near the front of the robot. Notice how we have mounted them pointing a little left and right to help amplify the difference between their readings. We have also tilted them upward slightly since we will be mostly standing when shining the guide light at the robot. The sonar sensors also visible in the picture are not used in this experiment.
The light sensors are connected to two analog ports on the Serializer. These particular sensors produce a minimum value of 0 and a maximum value of 1024. Since not all light sensors are exactly the same, be sure to check their readings when they are facing the same intensity of light. If the sensors return different values, add or subtract this amount in your code to compensate.
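A quick way to find that compensation value is to log both sensors for a few seconds while they face the same light and average the difference. The sketch below assumes hypothetical ReadLeftSensor() and ReadRightSensor() helpers standing in for whatever calls your own robot uses to read its analog ports; the simulated readings are chosen so the result comes out near the value used in the code later on:

using System;
using System.Threading;

class SensorCalibration
{
    // Hypothetical stand-ins for your robot's own analog sensor reads; here they
    // simply simulate a left sensor that reads about 100 units lower than the right.
    static Random rng = new Random();
    static double ReadLeftSensor()  => 400 + rng.Next(-5, 6);
    static double ReadRightSensor() => 500 + rng.Next(-5, 6);

    static void Main()
    {
        double totalDiff = 0;
        int samples = 20;

        // Point both sensors at the same light, then average the difference
        // between them over a short sampling window.
        for (int i = 0; i < samples; i++)
        {
            totalDiff += ReadRightSensor() - ReadLeftSensor();
            Thread.Sleep(100);
        }

        // Add this offset to the left sensor reading (or subtract it from the right)
        // in your control loop to compensate for the mismatch.
        Console.WriteLine($"leftRightDiff = {totalDiff / samples:F0}");
    }
}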
Before looking at the code, let's look at a live performance. Keep in mind that the goal of the robot is to stay on top of the light patch projected by the flashlight. If the flashlight is turned off, the robot should stop.
For the programmers in the audience, the code for our "Follow Light" behavior is shown below. Comments explain each of the steps in the process.
private double leftInput, rightInput;
private double leftOutput, rightOutput;
private double leftMotor, rightMotor;
private double maxLight = 1024;
private double threshold = 300;
private double maxSpeed = 50;
private int leftRightDiff = 100;

while (true)
{
    // Get the current light sensor readings. Notice how we compensate the left
    // input value by the discrepancy we measured during calibration.
    leftInput = My_Robot.sensorValues[Sensors.SensorID.LightFL] + leftRightDiff;
    rightInput = My_Robot.sensorValues[Sensors.SensorID.LightFR];

    // Compute the output unit values from the inputs and connection weights.
    leftOutput = -0.5 * leftInput + rightInput;
    rightOutput = leftInput - 0.5 * rightInput;

    // Check the output unit values against our minimum intensity threshold.
    if (Math.Max(leftOutput, rightOutput) - threshold <= 0)
    {
        leftOutput = 0;
        rightOutput = 0;
    }

    // Compute the final motor values from the scaling ratio.
    leftMotor = (maxSpeed / maxLight) * leftOutput;
    rightMotor = (maxSpeed / maxLight) * rightOutput;

    // Set the left and right motor speeds accordingly and tell the motors
    // to travel at the new speeds.
    My_Robot.myDriveMotors.pidDrive.Motor1Speed = (int)leftMotor;
    My_Robot.myDriveMotors.pidDrive.Motor2Speed = (int)rightMotor;
    My_Robot.myDriveMotors.pidDrive.TravelAtSpeed();

    // Suspend the loop for 200 msec (1/5 of a second). This means we are
    // sampling the light sensor values and updating our motor controls
    // 5 times per second.
    Thread.Sleep(200);
}
As you can see, our program loop retrieves the two light sensor readings which are represented by the activity values of our two input units. We then multiply this two-element vector by our weight matrix to get our output unit activities. Before sending commands to our drive motors, we pass these values through our activation function which puts a lower bound on the output values we are willing to respond to and scales the values appropriately. The resulting motor signals are then passed to our drive motor PID controller which adjusts the speeds of the left and right wheels accordingly every 200 msec (five times per second).