An Introduction to Robot Coordinate Frames


In the video below, Pi Robot's ability to reach out and grasp a balloon relies on viewing the target (in this case the balloon) in different frames of reference, or coordinate frames.

[Video: Pi Robot reaching out and grasping a balloon]

Locating the target in the camera view gives us the x-y coordinates of the target relative to a coordinate system centered on the camera's current location and orientation. When we add the distance to the balloon as measured by the head-mounted sonar and IR sensors, we have its z-coordinate, also relative to the head-centered coordinate frame. Before Pi can know how to move his arms to reach for the balloon, these coordinates must be transformed into a frame of reference attached to the shoulder joints. From there, we can compute where to move the hands so that they can grasp the balloon.


The transformation from the head-centered coordinate frame to a shoulder-centered frame is straightforward mathematically but takes a little work, as we show below. Before handling the case of moving the arms, let's look at the simpler task of determining the horizontal distance of the target from the robot based on where the balloon appears in Pi's field of view. This would be useful if Pi needed to keep the balloon within a given distance, perhaps to follow it around the house or to get close enough to pick it up. Either way, we must figure out how to measure the horizontal distance to the target.


The head-mounted range sensors will give us the distance to the target along the current line of sight. Since the head may be tilted and rotated to one side, we'll have to do a little trigonometry to convert this distance to the horizontal distance in front of the robot and the vertical distance off the floor. The following diagram illustrates the situation:

[Diagram: side view of the robot showing the camera frame O, the base frame O' at the front of the base, the tilt angle θ, the range R to the target, and the offsets s, h, f and b]


The general problem we face is known as a transformation between frames of reference. In three dimensional space, we can define a frame of reference using three perpendicular axes normally labeled x, y and z. A transformation between two such frames involves specifying the location of the origin of one frame relative to the other and the relative orientation of the three axes. Fortunately for us, this problem was solved centuries ago and we can just write down the transformation equations without having to derive them from scratch.


Referring to the diagram above, we can locate the first frame of reference at the center of the camera, indicated by the letter O. The z-axis points through the camera lens toward the target, the y-axis points straight out the top of the head, and the x-axis points into the plane of the diagram (using a left-handed coordinate system). This is the same kind of viewpoint you have through your own eyes. In this frame of reference, the target has coordinates [0, 0, R + s] where R is the distance returned from our range sensor and s is the distance between the sensor and the center of the camera. If we want to know the horizontal distance H of the target from the front of the base of the robot, and the vertical distance V of the target above the floor, then a good frame of reference to use would be the one centered at the point O' whose y and z axes are aligned with the distances we're interested in. To find the coordinates of the target in this frame, we must transform the reference frame at point O into the frame at point O'.


The transformation between any two such reference frames can be broken down into a combination of a rotation to bring the axes into alignment and a translation to align the origins. In symbols we write:


P' = A · P + B


where P represents the coordinates of a point in the original frame, P' the point's coordinates in the transformed frame, A the rotation, and B the translation. In our current situation, we can see that a rotation about the x-axis through the reverse of the tilt angle, -θ, will align the two reference frames.
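
In code, this transformation is just a matrix multiplication followed by a vector addition. Here is a minimal Python sketch using NumPy, with placeholder values standing in for the real angle and offsets:

    import numpy as np

    theta = np.radians(30.0)  # head tilt angle (placeholder value)

    # Rotation that aligns the camera frame with the base frame. With this
    # matrix, A @ [0, 0, R+s] gives a y-component of -(R+s)sin(theta) and a
    # z-component of (R+s)cos(theta), matching the results derived below.
    A = np.array([[1.0, 0.0,            0.0          ],
                  [0.0, np.cos(theta), -np.sin(theta)],
                  [0.0, np.sin(theta),  np.cos(theta)]])

    B = np.array([0.0, 28.0, -8.0])  # placeholder translation of the origin
    P = np.array([0.0, 0.0, 37.0])   # target in the camera frame: [0, 0, R+s]

    P_prime = A @ P + B              # target in the transformed frame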

 

In addition to the rotation, we have to translate the origin O into O'. Relative to the O-frame, O' has y-coordinate -(hcosθ + b) and z-coordinate hsinθ + f; the x-coordinate is 0. Our translation vector B therefore has components:

B = | 0          |
    | hcosθ + b  |
    | -hsinθ - f |


The matrix A that aligns the two coordinate frames does a rotation about the x-axis through angle -θ. That matrix is given by:

A = | 1      0         0      |
    | 0    cos(-θ)   sin(-θ)  |
    | 0   -sin(-θ)   cos(-θ)  |


Putting together the translation and rotation, we can now find the balloon's coordinates in the reference frame centered at O':

| P'x |   | 1      0         0      |   | 0     |   | 0          |
| P'y | = | 0    cos(-θ)   sin(-θ)  | · | 0     | + | hcosθ + b  |
| P'z |   | 0   -sin(-θ)   cos(-θ)  |   | R + s |   | -hsinθ - f |


Doing the matrix multiplication and addition, we find that:


P'x = 0

P'y = -(R + s)sinθ + hcosθ + b

P'z = (R + s)cosθ - hsinθ - f


where we have used the fact that sin(-θ) = -sin(θ) and cos(-θ) = cos(θ). Remember that P'z is the same as H, the horizontal distance of the balloon from the base of the robot, and P'y is the same as V, the vertical distance of the balloon off the ground. Let's plug in some values to see if they make sense. Suppose θ = 0 so that the balloon is level with the camera. Then cosθ = 1 and sinθ = 0. In this case we have:


P'y = V = h + b

P'z = H = (R + s) - f


The horizontal distance H of the balloon from the front of the robot base is therefore the value returned by our range sensor plus the distance from the sensor to the camera plane minus the backward offset of the camera from the base. The vertical distance V of the balloon off the floor is just the sum of the height of the robot up to the base of the head plus the height of the head from the base of the neck to the center of the camera.


As another example, suppose θ = 30 degrees, which is close to the angle depicted in the diagram. Then cosθ = 0.866 and sinθ = 0.5 so that:


P'y = V = -(R + s)/2 + h · 0.866 + b

P'z = H = (R + s) · 0.866 - h/2 - f


To be even more concrete, assume s = 1 inch, h = 4 inches, f = 6 inches and b = 24 inches. If R = 36 inches, then the two equations above become:


P'y = V = -(36 + 1)/2 + 4 · 0.866 + 24 = 8.96 inches

P'z = H = (36 + 1) · 0.866 - 4/2 - 6 = 24.04 inches


As you can see, factoring in the downward tilt of the head and the dimensions of our robot's body enables the range measurement to be converted into a horizontal distance and a height off the floor for the target.
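
These calculations are simple to put into code. Below is a short Python sketch of the tilt-only case using the dimensions assumed above (the values of s, h, f and b are the hypothetical ones from this example, not measurements of any particular robot):

    import math

    def target_h_v(R, theta_deg, s=1.0, h=4.0, f=6.0, b=24.0):
        """Convert a range reading R (inches) taken at tilt angle theta into
        the horizontal distance H from the base and the vertical height V
        off the floor, using the equations derived above."""
        theta = math.radians(theta_deg)
        V = -(R + s) * math.sin(theta) + h * math.cos(theta) + b
        H = (R + s) * math.cos(theta) - h * math.sin(theta) - f
        return H, V

    H, V = target_h_v(R=36.0, theta_deg=30.0)
    print(f"H = {H:.2f} in, V = {V:.2f} in")   # H = 24.04 in, V = 8.96 in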


Next we need to factor in the rotation of the head to the left or right, which is represented in the diagram by a rotation about the y-axis through angle ф in the camera-centered coordinate frame. The matrix that reverses this rotation is given by:

| cos(-ф)   0   -sin(-ф) |
| 0         1    0       |
| sin(-ф)   0    cos(-ф) |


We can now combine the two rotations by matrix multiplication to get the overall rotation matrix, again using the identities sin(-θ) = -sin(θ) and cos(-θ) = cos(θ) to simplify the entries:

A = | cosф    0   sinф |   | 1     0       0    |
    | 0       1   0    | · | 0    cosθ  -sinθ   |
    | -sinф   0   cosф |   | 0    sinθ   cosθ   |


Doing the matrix multiplication we get:

A = | cosф    sinфsinθ    sinфcosθ |
    | 0       cosθ       -sinθ     |
    | -sinф   cosфsinθ    cosфcosθ |


At the same time, the angle ф scales the horizontal component of the translation between the two frames by a factor of cosф. Our matrix equation for the coordinates of the target in the base reference frame therefore becomes:

| P'x |   | cosф    sinфsinθ    sinфcosθ |   | 0     |   | 0              |
| P'y | = | 0       cosθ       -sinθ     | · | 0     | + | hcosθ + b      |
| P'z |   | -sinф   cosфsinθ    cosфcosθ |   | R + s |   | -hcosфsinθ - f |


Doing the matrix operations yields the expressions for the three coordinate components:


P'x = (R + s)sinфcosθ

P'y = -(R + s)sinθ + hcosθ + b

P'z = (R + s)cosфcosθ - hcosфsinθ - f


Let's now find the coordinates of the balloon in the base frame of reference using the same numbers as above for the dimensions of the robot, as well as θ = 30 degrees and ф = 45 degrees. Plugging the numbers into the equation above gives us:


P'x = (36 + 1)sin(45)cos(30) = 37 · 0.707 · 0.866 = 22.66

P'y = V = -(36 + 1)sin(30) + 4cos(30) + 24 = 8.96

P'z = H = (36 + 1)cos(45)cos(30) - 4cos(45)sin(30) - 6 = 15.24


Since we have assumed a value of ф = 45 degrees for the head's pan angle, the balloon must be to the right of the robot, and indeed the value of P'x = 22.66 tells us that the balloon is 22.66 inches to the right. Consequently, the effective distance of the balloon in front of the robot, P'z = 15.24 inches, has been reduced from 24.04 inches since the balloon is no longer straight ahead. Finally, the vertical distance of the balloon off the floor remains P'y = 8.96 inches since we haven't changed the tilt angle.
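
Here is the pan-and-tilt version as a Python sketch; running it reproduces the three numbers above (again, the robot dimensions are the assumed example values):

    import math

    def target_in_base_frame(R, theta_deg, phi_deg, s=1.0, h=4.0, f=6.0, b=24.0):
        """Map a range reading R taken at tilt theta and pan phi into the
        base-frame coordinates (P'x, P'y, P'z) derived above."""
        th, ph = math.radians(theta_deg), math.radians(phi_deg)
        px = (R + s) * math.sin(ph) * math.cos(th)
        py = -(R + s) * math.sin(th) + h * math.cos(th) + b
        pz = (R + s) * math.cos(ph) * math.cos(th) - h * math.cos(ph) * math.sin(th) - f
        return px, py, pz

    px, py, pz = target_in_base_frame(R=36.0, theta_deg=30.0, phi_deg=45.0)
    print(f"P'x = {px:.2f}, P'y = {py:.2f}, P'z = {pz:.2f}")
    # P'x = 22.66, P'y = 8.96, P'z = 15.24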

 

Reaching for an Object


Although Pi was very happy to get his new arms, programming the motion of such an arm is more difficult than it might appear. Imagine the simple task of reaching out to pick up a pencil. How does your brain compute the motor signals necessary to rotate your shoulder, elbow, wrist and fingers to bring your hand to the target?


If we are given the angles of the various joints in an arm—robotic or biological—together with the lengths of the segments between joints, it is a simple matter of geometry to compute the location of the hand in three dimensional space. This problem is called the forward kinematics of the arm. On the other hand (so to speak), if we are only given a desired position of the hand in space and asked to compute the angles of the joints that will put it there, we face the harder problem of computing the inverse kinematics of the arm. The reason this problem is hard is that the forward transformation defines a non-linear system of equations. While linear systems are generally straightforward to solve, non-linear systems often require figuring out a different solution at each point along the arm's trajectory. Furthermore, depending upon the number of joints in the arm, also known as its degrees of freedom, there may be one solution, no solutions or an infinite number of solutions for positioning the joints to get the end effector to the target location.


To make things even more interesting, there are situations where objects or other constraints might prevent a given joint from moving through its entire range. For example, if there is a juice container on the table near your elbow when reaching for your coffee, you'll have to modify your normal reaching pattern to avoid tipping over the juice.


To get started on this difficult problem, we'll begin with a relatively simple task: let's figure out how to have Pi point to the balloon no matter where it is located.


Eye-Hand Coordination


Imagine wanting to hand off the balloon to the robot or having it play a game of catch. Such activities require that the robot be able to move its arms in such a way that the hands are in position to hold or catch the balloon. In other words, the robot must be able to point or reach toward the target. For this reason, we might call this behavior "arm tracking" by analogy to head tracking.


In the previous section, we derived a set of equations for mapping the coordinates of the target in the camera-centered frame of reference into a set of coordinates relative to the base of the robot. We can use a similar procedure to position the hands at specific locations in space relative to the balloon. This will allow the robot to reach for the balloon based on where it appears in the field of view.


Forward and Inverse Kinematics

The figure below illustrates our situation:

[Diagram: the arm and the shoulder-centered reference frame O', showing the joint angles q1 (up-down rotation of the arm) and q3 (bend at the wrist)]

The joint angles of the arms are traditionally labeled by the variables qk where k runs from 0 to N-1 and N is the number of joints in the arm. Two of the angles are shown in the diagram above: q1 reflects the up-down rotation of the arm and q3 measures the bend angle at the wrist. The two angles not shown are q0 which corresponds to the horizontal motion of the arm and q2 which measures the roll of the arm along its axis.


As a first example, let's assume that q2 and q3 are fixed with a value of 0 so that only q0 and q1 are allowed to vary. In other words, there is no bend at the wrist and we do not allow the arm to roll. How then should we control the servo positions q0 and q1 so that the hand points toward the balloon?


We begin by attaching a coordinate system, labeled O', at the shoulder joint of the arm. The y'-axis is aligned vertically with the robot body, the z'-axis points horizontally parallel to the ground, and the x'-axis points into the plane of the diagram. As in the previous section, we can now get the coordinates of the target in this coordinate system by way of its coordinates in the camera-centered frame. The distance between O' and the base of the head is k, and this time there is no fore-aft offset between O and O'. However, there is an offset along the x'-axis of the shoulder joint from the mid-line of the robot, which we will call m (not shown in the diagram). So the coordinates of the balloon in the shoulder-centered frame are given by:


P'x = (R + s)sinфcosθ + m

P'y = -(R + s)sinθ + hcosθ + k

P'z = (R + s)cosфcosθ - hcosфsinθ


The question now becomes: what is the relation between the joint angles q0 and q1 and the x', y', z' coordinates of the end of the arm? Shoulder joint q0 rotates the arm about the y'-axis, while joint q1 rotates about the x'-axis. Furthermore, q1 is displaced outward by a distance v from q0 along the x'-axis. When q0 is 0, q1 will place the end of the arm at a point given by:


P'x = -v

P'y = (g + u)cosq1

P'z = (g + u)sinq1


If q0 is also allowed to vary, we get:


P'x = (g + u)sinq0 - v

P'y = (g + u)cosq0cosq1

P'z = (g + u)cosq0sinq1


These three equations are known as the forward transformation or forward kinematics of the arm, mapping joint coordinates into the Cartesian coordinates of the end of the arm in space. (Remember though that we are keeping two of the joints fixed for the moment.) The forward transformation is generally straightforward to calculate even for arms with many joints, and even with different types of joints such as linear actuators (prismatic joints). However, what we tend to need more often is the inverse transformation: given a desired position of the hand in space, what joint angles do we need to move it there?
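
Before turning to the inverse, here is the two-joint forward map above as a short Python sketch (the segment lengths g and u and the offset v are hypothetical values, since no numbers are given for them above):

    import math

    def forward_kinematics(q0_deg, q1_deg, g=8.0, u=6.0, v=3.0):
        """Two-joint forward kinematics: map shoulder angles q0, q1 to the
        position of the end of the arm in the shoulder frame O'.
        g, u and v are hypothetical dimensions in inches."""
        q0, q1 = math.radians(q0_deg), math.radians(q1_deg)
        L = g + u                           # total length of the arm
        px = L * math.sin(q0) - v
        py = L * math.cos(q0) * math.cos(q1)
        pz = L * math.cos(q0) * math.sin(q1)
        return px, py, pz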


Because of the way we have constrained our arm for this example, there are many points in space that our arm cannot reach, which means we cannot directly solve the equations above to find the joint angles in terms of P'x, P'y, and P'z. But we can point to the target fairly well using just two joints. In addition, the joint angles q0 and q1 are analogous to the two angles used in spherical coordinates. We can therefore use the well-known transformation between Cartesian and spherical coordinates to get our inverse transformation:


q0 = sin⁻¹[(P'x + v)/√(P'x² + P'y² + P'z²)]

q1 = tan⁻¹[P'z/P'y]


These two equations allow us to move the arm so that the hand points to a specified location in space. This is a good first step toward controlling the arms in preparation for grasping the target based on where it appears in the camera image.
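
As a sketch, the two pointing equations might look like this in Python; note that atan2 is used in place of the plain arctangent so the correct quadrant is returned even when P'y is negative, and v is again a hypothetical offset:

    import math

    def point_at(px, py, pz, v=3.0):
        """Return pointing angles (q0, q1) in degrees for a target at
        (px, py, pz) in the shoulder frame, per the equations above."""
        r = math.sqrt(px**2 + py**2 + pz**2)        # distance to the target
        # The clamp guards against rounding pushing the ratio outside [-1, 1].
        q0 = math.asin(max(-1.0, min(1.0, (px + v) / r)))
        q1 = math.atan2(pz, py)                     # quadrant-safe tan⁻¹(pz/py)
        return math.degrees(q0), math.degrees(q1)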


In summary, we started with the coordinates of the balloon in the camera-centered frame of reference. Then we transformed these coordinates into a reference frame attached to the shoulder. Once we know where the target is relative to the shoulder, we can compute the joint angles necessary to have the arm point to this location in space.
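
Putting the pieces together, the whole chain from camera reading to arm command can be sketched as below, reusing the point_at() function above; the shoulder offsets k and m are hypothetical values since no numbers are given for them:

    import math

    def camera_to_shoulder(R, theta_deg, phi_deg, s=1.0, h=4.0, k=5.0, m=4.0):
        """Map a range reading R taken at tilt theta and pan phi into
        left-shoulder-frame coordinates, per the equations above."""
        th, ph = math.radians(theta_deg), math.radians(phi_deg)
        px = (R + s) * math.sin(ph) * math.cos(th) + m
        py = -(R + s) * math.sin(th) + h * math.cos(th) + k
        pz = (R + s) * math.cos(ph) * math.cos(th) - h * math.cos(ph) * math.sin(th)
        return px, py, pz

    # Range reading -> shoulder frame -> servo angles for the left arm.
    q0, q1 = point_at(*camera_to_shoulder(R=36.0, theta_deg=30.0, phi_deg=45.0))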


Two Hands Are Better Than One

What is good for the left hand can't be too bad for the right. So let us apply the arm tracking algorithm to the right arm as well. However, we will add a slight twist: since the goal is to have the robot actually grasp the balloon in both hands, we don't want both arms to simply point at the center of the target. Instead, we want each arm to point to the outside of the corresponding edge of the balloon. In other words, we want the arms pointing in the direction of the balloon, but opened wide enough to grasp it when it gets close enough.


The coordinate equations for the right arm mirror those of the left arm and are given below. Note that the only difference is that +m becomes -m and +v becomes -v, since the displacements of the shoulder along the x'-axis are now in the opposite direction:


P'x = (R + s)sinфcosθ - m

P'y = -(R + s)sinθ + hcosθ + k

P'z = (R + s)cosфcosθ - hcosфsinθ


q0 = sin⁻¹[(P'x - v)/√(P'x² + P'y² + P'z²)]

q1 = tan⁻¹[P'z/P'y]
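
Because the right arm differs only in the signs of m and v, a sketch can serve both arms with a single side parameter (+1 for the left arm, -1 for the right; k, m and v remain hypothetical values):

    import math

    def point_arm(R, theta_deg, phi_deg, side, s=1.0, h=4.0, k=5.0, m=4.0, v=3.0):
        """Pointing angles (degrees) for either arm: side = +1 selects the
        left arm, side = -1 the right, flipping the m and v offsets as in
        the mirrored equations above."""
        th, ph = math.radians(theta_deg), math.radians(phi_deg)
        px = (R + s) * math.sin(ph) * math.cos(th) + side * m
        py = -(R + s) * math.sin(th) + h * math.cos(th) + k
        pz = (R + s) * math.cos(ph) * math.cos(th) - h * math.cos(ph) * math.sin(th)
        r = math.sqrt(px**2 + py**2 + pz**2)
        q0 = math.asin(max(-1.0, min(1.0, (px + side * v) / r)))
        q1 = math.atan2(pz, py)
        return math.degrees(q0), math.degrees(q1)

    left = point_arm(36.0, 30.0, 45.0, side=+1)
    right = point_arm(36.0, 30.0, 45.0, side=-1)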


Summary


The simple act of reaching for an object turns out to be fairly challenging to implement on a robot. Fortunately, the only tools required are high school algebra and a little patience. What's more, once we have the transformation equations figured out, they can be used for different robots simply by changing the parameters representing the dimensions of the robot involved. For example, I have changed the lengths of Pi's arm segments a few times, but the same equations described above can be used to control the arms. All that needed to be changed were the constants describing the distance between the shoulder and the hand. In a future article, we will delve more deeply into the inverse kinematics of a fully functional arm, i.e., with all the servos activated. But for now, Pi seems to be doing just fine with his four shoulder joints online.