In Part 1 of this series I laid out the prerequisites. Now we’ll get into how to turn the Kinect into a tracking system for the cannon.
Manual Targeting
As I mentioned in Part 1, one of the pieces to this puzzle was already written for us – a .NET layer around the launcher. This layer was provided by Chris Smith in his Being an Evil Genius with F# and .NET post. He links to the source code at the very end of the post; the download includes several projects, and we ended up using the RocketLib\RocketLauncher_v0.5.csproj project.
So, now we had a class through which we could send the launcher commands such as
Me._Launcher.MoveLeft(5)
Me._Launcher.MoveDown(10)
Me._Launcher.Fire()
Where “Me._Launcher” was an object of type RocketLib.RocketLauncher. The numbers being passed to the “Move” commands are the number of times to move the launcher turret. Each “time” or “step” (as we came to refer to it) seemed to translate into a little less than half a degree of rotation (either left/right or up/down).
Armed with this knowledge (see what I did there?), we were able to whip together a little WPF interface that had five buttons on it – Up, Down, Left, Right, and Fire – that controlled the launcher manually. That became the “Manual” mode. The “Auto-track” mode, where the Kinect would control the launcher, would come next.
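For illustration, here's a minimal sketch of what those handlers might have looked like. The button names are hypothetical; MoveUp is inferred by symmetry with the MoveDown call shown above, and MoveRight appears in the auto-tracking code later in this post.
Private _Launcher As New RocketLib.RocketLauncher()
' Each click nudges the turret a fixed number of steps
Private Sub UpButton_Click(ByVal sender As Object, ByVal e As RoutedEventArgs)
    Me._Launcher.MoveUp(5)
End Sub
Private Sub LeftButton_Click(ByVal sender As Object, ByVal e As RoutedEventArgs)
    Me._Launcher.MoveLeft(5)
End Sub
' Powering up and firing takes the launcher a couple of seconds
Private Sub FireButton_Click(ByVal sender As Object, ByVal e As RoutedEventArgs)
    Me._Launcher.Fire()
End Sub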
Auto-Targeting
Now we started going through the Kinect SDK Quickstart video tutorials, produced by Microsoft and hosted by Dan Fernandez. To begin, we wanted to get to the raw position data (X, Y, and Z) from the camera. We ended up compressing the first four tutorials (“Installing and Using the Kinect Sensor”, “Setting up the Development Environment”, “Skeletal Tracking Fundamentals”, and “Camera Fundamentals”) into a single Friday to get ramped up as quickly as possible.
In “Skeletal Tracking Fundamentals”, Dan explains that the Kinect tracks skeletons, not entire bodies. Each skeleton has 20 different joints – the palms, elbows, head, shoulders, and so on. We decided to select the “ShoulderCenter” joint as our target.
Next, we added labels for the X, Y, and Z positions of the ShoulderCenter joint to the app, and then started moving around the room in front of the Kinect, seeing how the values changed. The values are given in meters, with X and Y being 0 when you’re directly in front of the depth camera. These values are updated in the SkeletonFrameReady event.
Now, the fun could really begin. We decided to focus on left/right movement of our target, so the Y value is not used in the app at all.
We also decided that since the launcher had a real physical limitation as to how fast it could move, we couldn’t give it too many commands at a time. The Kinect sends data 30 times a second, so we decided to sample the data twice a second (every 15 frames).
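The exact gating didn't make it into the code snippet below, but a minimal version of that throttle, assuming a _FrameCount field incremented once per frame, might look like this:
' Runs at the top of the skeleton-frame handler (30 frames per second)
Me._FrameCount += 1
If (Me._FrameCount Mod 15) <> 0 Then
    Return   ' only command the launcher twice a second
End If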
Our first attempt at this was very complicated and clunky, and didn’t work well unless you were at a magical distance from the Kinect (basically we threw enough magic numbers into the equation until it worked for that one distance). We really ran into problems when we tried to extend that to work for any depth.
It was Doug who hit upon the idea of calculating the angle to turn the launcher as the arctangent of X/Z, as opposed to computing a number of steps directly, which is what we had been doing. That did two things for us – first, the angle approach correctly took the depth information (the Z measurement) into account, and second, it meant we only had to store the last known position of the launcher (measured as a number of steps, either positive or negative, with 0 being straight ahead). If we knew the last position, and we knew where we had to move to, we could swivel the launcher accordingly.
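To put some numbers to that: a target at X = 0.5 meters and Z = 2 meters gives Atan2(0.5, 2) ≈ 0.245 radians, or about 14 degrees. The multiplier in the code below (100 * 1.6 = 160 steps per radian) turns that into roughly 39 steps – about 2.8 steps per degree, which lines up with the “little less than half a degree per step” figure from earlier.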
Private Sub nui_SkeletonFrameReady(ByVal sender As Object, ByVal e As SkeletonFrameReadyEventArgs)
    Dim allSkeletons As SkeletonFrame = e.SkeletonFrame
    Dim NewCannonX, DeltaX As Integer

    Me._FrameCount += 1

    ' Get the first tracked skeleton
    Dim skeleton As SkeletonData = ( _
        From s In allSkeletons.Skeletons _
        Where s.TrackingState = SkeletonTrackingState.Tracked _
        Select s).FirstOrDefault()
    If (skeleton Is Nothing) Then Return

    Dim ShoulderCenter = skeleton.Joints(JointID.ShoulderCenter)
    Dim scaledJoint = ShoulderCenter.ScaleTo(320, 240)
    Me.UpdateCrossHairs(scaledJoint.Position.X, scaledJoint.Position.Y, scaledJoint.Position.Z)

    Me.HorizontalPosition.Content = ShoulderCenter.Position.X
    Me.VerticalPosition.Content = ShoulderCenter.Position.Y
    Me.DepthPosition.Content = ShoulderCenter.Position.Z

    Dim NormalizedX As Integer = CType(ShoulderCenter.Position.X * 10, Integer)
    Dim AbsoluteX As Integer = Math.Abs(NormalizedX)

    If (Me._IsAutoTrackingEngaged) Then
        If (ShoulderCenter.Position.Z > 0) Then
            ' Atan2 returns radians; the multiplier of 100 * 1.6 (160 steps
            ' per radian) converts the angle to move into steps for the cannon
            NewCannonX = CType(Math.Atan2(ShoulderCenter.Position.X, ShoulderCenter.Position.Z) * 100 * 1.6, Integer)
            DeltaX = Math.Abs(NewCannonX - Me._LastCannonX)

            If (NewCannonX < Me._LastCannonX) Then
                Me._Launcher.MoveRight(DeltaX)
            Else
                Me._Launcher.MoveLeft(DeltaX)
            End If

            Me._LastCannonX = NewCannonX
            Me._NetCannonX = NewCannonX
        End If
    End If
End Sub
With this logic in place, the tracking became fairly good, regardless of the distance between the target and the Kinect.
Assumptions Uncovered
Since there really wasn’t any feedback that the launcher could give us about its current position, this logic makes a couple of major assumptions about the world. First, the Kinect and the launcher have to be pointed straight ahead to begin with, and second, the Kinect needs to remain pointing straight ahead.
We uncovered the first assumption when the launcher stopped responding to commands to move right. We could move it to the left, but not to the right. We fired up the application that comes with the launcher and discovered a “Reset” button that caused the launcher to swivel all the way to one side, then back to a “center” point. This center point was actually denoted by a raised arrow on the launcher’s base – something I hadn’t noticed up to that point. After we reset it, it would move left and right just fine. As it turns out, the launcher can’t swivel indefinitely – it has definite bounds, and the reset function moves it back to center to maximize the available left/right motion.
After we discovered that, I would jump out to that app to reset the launcher, and then shut it down again before I could use ours (two apps couldn’t send commands to the launcher at once – in fact, we got runtime errors if we tried to run both at the same time). After a while that got old, so we included a reset of our own: since we knew the launcher’s current position, we could just move it in the opposite direction by that amount. We added a Reset button to our own app, and also called the same method when the app was switched back to Manual mode and when it was shut down.
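A minimal sketch of that reset, assuming _LastCannonX holds the launcher’s position in steps and that positive values mean it has swiveled left of center (which is how we read the tracking handler above):
Private Sub ResetLauncher()
    ' Swivel back by however far we've moved from center
    If (Me._LastCannonX > 0) Then
        Me._Launcher.MoveRight(Me._LastCannonX)
    ElseIf (Me._LastCannonX < 0) Then
        Me._Launcher.MoveLeft(Math.Abs(Me._LastCannonX))
    End If
    Me._LastCannonX = 0
End Sub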
We uncovered the second assumption in a rather amusing way. During one of our tests we noticed the cannon was constantly aiming off to Doug’s (our target at the time) right. He could move left or right, but the launcher was always off. He happened to look up and noticed that the Kinect had been bumped, so it wasn’t pointing directly ahead any more. As a result, the camera was looking off to one side and all of its commands were off. After that, we were much more careful about checking the Kinect’s alignment, and not bumping it.
Some Fun to Be Had
Early on we had thought up a “fun” piece of icing on this electronic cake. What if we took the video image from the camera, and superimposed crosshairs on it? We could literally float an image with a transparent background over the image control on the form. If we could get the scaling right, it could track on top of the user’s ShoulderCenter joint.
And we did. This is turned on using the “Just for Mike” button at the bottom of the app. During the agency meeting demo, I had walked through the basic tracking, using Mike (our President) as the target, and explained about the video and depth images. Then – very dramatically – I “noticed” the screen and turned to Doug (who was running the computer) – “uh, Doug? I think we’re missing something.” At which point he hit the button to add the cross hairs to the video image. “There we go! That’s better.” Mike got a good laugh out of it, as did most of the rest of the audience. Fun? Check!
Beyond the fun, though, I thought it was cool that we could merge the video and depth information to such great effect. Between having the launcher track you, and seeing the cross hairs on your chest – it’s downright eerie.
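As a rough sketch of how the overlay worked: the ScaleTo(320, 240) call in the handler above maps the joint into the coordinate space of the 320x240 video image, and the crosshair image just gets repositioned to those coordinates. The control name here is hypothetical, and this assumes the crosshair sits on a Canvas layered over the video control:
Private Sub UpdateCrossHairs(ByVal x As Single, ByVal y As Single, ByVal z As Single)
    ' Center the floating crosshair image on the scaled joint position
    ' (z isn't needed for the 2D overlay)
    Canvas.SetLeft(Me.CrossHairImage, x - (Me.CrossHairImage.Width / 2))
    Canvas.SetTop(Me.CrossHairImage, y - (Me.CrossHairImage.Height / 2))
End Sub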
Performance Issues
So, by this point, we had launcher tracking, both video and depth images refreshing 30 times a second, and crosshairs.
And everything was running on the same thread.
Yeah. We now had some performance issues to solve.
When the launcher moved at all, and especially when it fired (which took 2-3 seconds to power up and release), the images would completely freeze, waiting for the launcher to complete. The easy solution? Duh! Just put the launcher and the image updates on their own threads. Um, yeah. That turned out to be easier said than done. We’ll cover the multi-threading solution, as well as the speech recognition features in Part 3. Those two topics turn out to be intertwined.
Update: Full source code for this solution can be found in the “KinectMissileLauncher.zip” archive here: http://tinyurl.com/MarkGilbertSource.