Mark Gilbert's Blog

Science and technology, served light and fluffy.

Target-Tracking with the Kinect, Part 3 – Target Tracking Improved, and Speech Recognition

In Part 1 of this series, I went through the prerequisites for getting the Kinect/Foam-Missile Launcher mashup running.  In Part 2, I walked through the core logic for turning the Kinect into a target-tracking system, but I ended it talking about some major performance issues.  In particular, commands to the launcher would block updates to the UI, which meant the video and depth feeds were very jerky. 

In this third and final part of the series, I’ll show you the multi-threading scheme that solved this problem.  I’ll also show you the speech recognition components that let the target launch a missile at themselves just by saying the word "Fire".

What did you say?

We had tried to implement the speech recognition feature by following the "Audio Fundamentals" tutorial.  That code looked like it SHOULD work, but there were a couple of differences between the tutorial app and ours: the tutorial’s was a C# console application, while ours was a VB WPF application.  As it turns out, those two differences made ALL the difference.

For the demo, Dan (the host) mentions the need for the MTAThread() attribute on the Main() routine in his console app.  Since our solution up to this point was VB, it looked like we would need this.  I tried adding that to every place that didn’t generate a compile error, but nothing worked – the application kept throwing this exception when it fired up:

Unable to cast COM object of type ‘System.__ComObject’ to interface type ‘Microsoft.Research.Kinect.Audio.IMediaObject’. This operation failed because the QueryInterface call on the COM component for the interface with IID ‘{D8AD0F58-5494-4102-97C5-EC798E59BCF4}’ failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).

Stack Trace:
       at System.StubHelpers.StubHelpers.GetCOMIPFromRCW(Object objSrc, IntPtr pCPCMD, Boolean& pfNeedsRelease)
       at Microsoft.Research.Kinect.Audio.IMediaObject.ProcessOutput(Int32 dwFlags, Int32 cOutputBufferCount, DMO_OUTPUT_DATA_BUFFER[] pOutputBuffers, Int32& pdwStatus)
       at Microsoft.Research.Kinect.Audio.KinectAudioStream.RunCapture(Object notused)
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart(Object obj)

I decided to try a different tack.  I wrote a C# console app, and copied all of Dan’s code into it (removing the Using statements and initializing the variables manually to avoid scoping issues).  That worked right out of the gate.  Since we were very short on time (this was two days from the demo at this point) I decided to port our application to C#, and then incorporate the speech recognition pieces.

First, the "setup" logic was wrapped into a method called "ConfigureAudioRecognition" (I pretty much copied this right from the tutorial).  That method was invoked in the Main window’s Loaded event, on its own thread.  In addition to initializing the objects and defining the one-word grammar ("Fire"), this adds an event handler for the recognizer engine’s SpeechRecognized event:

private void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    if (this._Launcher != null && 
        this._IsAutoTrackingEngaged &&
        e.Result.Confidence > 0.95) { this.FireCannon(); }
}

The command to launch a missile is only given if the Launcher object is defined, the app is in "auto-track" mode, and the confidence level of the recognition engine is greater than 95%.  This last check is an amusing one.  Before I included this check, I would read a sentence that happened to contain some word with the letter "f", like "if", and the missile would launch.  Inspecting the Confidence property, I found that it only had a value in the 20-30% range for those words.  When I said "Fire", this value was 96-98%.  The confidence check helps tremendously, but it’s still not perfect.  Words like "fine" can fool it.  It’s much better than having it fire with every "f", though.
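If you want to experiment with the threshold, one option is to pull the three checks out of the event handler into a small pure function that can be exercised without a Kinect attached.  This is only a sketch: FireGate, ShouldFire, and MinimumConfidence are names I made up for illustration and are not part of the original project.

```csharp
using System;

public static class FireGate
{
    // Same threshold as the handler above; tune it for your room and microphone.
    public const float MinimumConfidence = 0.95f;

    // Returns true only when all three conditions from the handler hold:
    // a launcher exists, auto-tracking is on, and the recognizer's
    // confidence is strictly above the threshold.
    public static bool ShouldFire(bool launcherAvailable, bool autoTrackingEngaged, float confidence)
    {
        return launcherAvailable
            && autoTrackingEngaged
            && confidence > MinimumConfidence;
    }
}
```

With the numbers from above, a stray "if" at roughly 0.25 confidence is rejected, while a clear "Fire" at roughly 0.97 gets through.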


Take a number

Doug, Joshua, and I discussed some solutions to the UI updates earlier in the week, and the most promising one looked like using BackgroundWorker (BW) to send a command to the launcher asynchronously.  That was relatively easy to drop into the solution, but I almost immediately hit another problem.  Commands were being sent to the launcher much more frequently than my single BW could handle, and I started getting runtime exceptions to the effect of "process is busy, go away".  I found an IsBusy property on the BW that I could check to see if it had returned yet, but that meant that I would have to wait for it to come back before I could send it another command: basically the original blocking issue, but one step removed.
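The busy error is easy to reproduce outside the launcher code.  Here is a minimal console sketch, assuming a 300 ms sleep as a stand-in for a slow launcher command; BusyWorkerDemo and SecondCommandIsRejected are hypothetical names used only for this illustration.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class BusyWorkerDemo
{
    // Returns true if a second RunWorkerAsync call was rejected because
    // the worker was still running the first one.
    public static bool SecondCommandIsRejected()
    {
        var done = new ManualResetEventSlim(false);
        var worker = new BackgroundWorker();

        // Stand-in for a slow launcher command (the delay is an assumption).
        worker.DoWork += (sender, e) => Thread.Sleep(300);
        worker.RunWorkerCompleted += (sender, e) => done.Set();

        worker.RunWorkerAsync();       // first command starts; IsBusy is now true

        bool rejected = false;
        try
        {
            worker.RunWorkerAsync();   // second command arrives too soon
        }
        catch (InvalidOperationException)
        {
            rejected = true;           // the worker refuses to run two tasks at once
        }

        done.Wait();                   // let the first command finish before returning
        return rejected;
    }
}
```

Checking IsBusy before the second call would avoid the exception, but then the caller has to either drop the command or wait, which is the blocking problem all over again.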

I briefly toyed with the idea of spawning a new thread with every command, but because they were all asynchronous there was no way to guarantee that they would be completed in the order I generated them in.  Left-left-fire-right looks a lot different than fire-right-left-left.  What I really needed was a way to line up the requests, and force them to be executed one at a time, in order.  What I found was an unbelievably perfect solution from Matt Valerio with his post titled "A Queued BackgroundWorker Using Generic Delegates".  As the title suggests, he wrote a class called "QueuedBackgroundWorker" that would add another BW to a queue, and then pop them off and process them in order.  This was EXACTLY what I needed.  This was also the most mind-blowing use of lambda expressions I’ve ever seen: you pass entire functions as the elements on the queue, and each one gets executed when it is popped off the queue.
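To make the idea concrete, here is a minimal sketch of "queue up lambdas and run them in order".  To be clear, this is NOT Matt Valerio's class: it swaps in a different technique (a BlockingCollection drained by one dedicated background thread instead of chained BackgroundWorkers), and SerialCommandQueue is a name I invented for the illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Runs queued actions one at a time, in the order they were added,
// on a single background thread.
public sealed class SerialCommandQueue : IDisposable
{
    // BlockingCollection wraps a ConcurrentQueue by default, so it is FIFO.
    private readonly BlockingCollection<Action> _commands = new BlockingCollection<Action>();
    private readonly Thread _consumer;

    public SerialCommandQueue()
    {
        _consumer = new Thread(ProcessCommands) { IsBackground = true };
        _consumer.Start();
    }

    // Returns immediately; the caller (e.g. the UI thread) never waits on the work.
    public void Enqueue(Action command)
    {
        _commands.Add(command);
    }

    private void ProcessCommands()
    {
        // Blocks until an item is available, and ends once CompleteAdding
        // has been called and the queue is empty.
        foreach (Action command in _commands.GetConsumingEnumerable())
        {
            command();
        }
    }

    // Stops accepting commands, then waits for the queued ones to finish.
    public void Dispose()
    {
        _commands.CompleteAdding();
        _consumer.Join();
        _commands.Dispose();
    }
}
```

Enqueuing four lambdas for left, left, fire, and right would run them in exactly that order, no matter how long each one takes.  The sketch has no error handling: an exception thrown by a command would end the consumer thread, so real code would want a try/catch around the call.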

I added a small class called "CannonVector" that would roll up a direction (up, down, left, or right) and a number of steps.  Then, I created two methods, FireCannon() and MoveCannon(), that would now wrap my calls to the launcher methods that Matt Ellis wrote (see Part 2 of this series):

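private void FireCannon()
{
    QueuedBackgroundWorker.QueueWorkItem(
        this._Queue,
        // Firing doesn't use a direction or step count; these are placeholders.
        new CannonVector
        {
            DirectionRequested = CannonDirection.Down,
            StepsRequested = 0
        },
        // Work item: runs when this entry reaches the front of the queue.
        args =>
        {
            this._Launcher.Fire();
            return (CannonVector)args.Argument;
        },
        // Completion callback: nothing to do once the missile is away.
        args => { }
    );
}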


private void MoveCannon(CannonDirection NewDirection, int Steps)
{
    QueuedBackgroundWorker.QueueWorkItem(
        this._Queue,
        // The requested move travels with the work item as its argument.
        new CannonVector
        {
            DirectionRequested = NewDirection,
            StepsRequested = Steps
        },
        // Work item: runs when this entry reaches the front of the queue.
        args =>
        {
            CannonVector MyCannonVector;
            MyCannonVector = (CannonVector)args.Argument;

            // Dispatch to the matching launcher method.
            switch (MyCannonVector.DirectionRequested)
            {
                case CannonDirection.Left:
                    this._Launcher.MoveLeft(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Right:
                    this._Launcher.MoveRight(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Up:
                    this._Launcher.MoveUp(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Down:
                    this._Launcher.MoveDown(MyCannonVector.StepsRequested);
                    break;
            }
            // Hand back a copy of the move that was just carried out.
            return new CannonVector
            {
                DirectionRequested = MyCannonVector.DirectionRequested,
                StepsRequested = MyCannonVector.StepsRequested
            };
        },
        // Completion callback: nothing to do after the move.
        args => { }
    );
}

Cool, huh?

With this in place, everything was smooth again – launcher movement and UI updates, alike.

And there was much rejoicing.

So there you have it.  Full source code for this solution can be found in the "KinectMissileLauncher.zip" archive here: http://tinyurl.com/MarkGilbertSource.  Happy hunting!


September 10, 2011 - Posted by | Microsoft Kinect, Visual Studio/.NET


