Mark Gilbert's Blog

Science and technology, served light and fluffy.

How I got my groove back – Music Files, Playlists, and the Sansa Clip

Until a couple of months ago, I had only really been using my MP3 player, a Sansa Clip, to listen to music while I was at work, but then I started finding other uses for it.  For example, I can connect it as an input to my guitar amp, and then play along with whatever song I cue up.  I also found myself plugging it in at home, finding it far easier to use than Windows Media Player (WMP).

WMP works fine for playing music, but managing my collection is another matter.  I’d drop a new MP3 into a folder, and then fight for 15 minutes with WMP to get it to actually recognize it.  Sometimes it would appear under "Songs" but not "Albums".  Sometimes I’d drag it into a playlist, only to have it get duplicated.  Sometimes the file wouldn’t sync to my player at all: no errors, but no transferring bits either.  These are probably just cases of me not doing it the "WMP-Way", but whatever that way is, it is not intuitive.

The more I thought about it, the more I realized that the three most common things I was still using WMP for were:

  1. Ripping CDs.
  2. Syncing music to the Sansa Clip.
  3. Burning podcasts onto CD so I can listen to them in my car.

I haven’t ripped a CD in months because I’ve been buying all my recent music online.  Burning podcasts onto CD is actually very painless in WMP, so I will probably continue using it for that.

But syncing?  Could I manage the music on the player directly?  Plugging the player into a USB port registers it as another storage device, available in Windows Explorer.  Could I just drag music onto it?  The short answer is "yes", but to really make this useful, I’d need to do a few more things:

  1. Reorganize the media files to clean up where Windows Media Player originally dropped them.
  2. Edit the media tags on the files so that Artist and Song Titles are accurate and simple.
  3. Maximize the number of songs I could fit onto the player by converting everything to MP3 format.
  4. Organize them into separate playlists to suit whatever mood I’m in.
  5. Sit back and enjoy the sweet sounds of victory.

Reorganize the media files

Most of my digital collection was actually ripped from my CD collection using Windows Media Player, which organizes it into a folder structure that looks like this:

[Image: Folder Structure]

I’m really only interested in Artist and Song Title.  If I’m in the mood for John Williams, for example, I want to hear all of his work – I don’t care if it came from the "The Spielberg/Williams Collaboration", "Harry Potter and The Sorcerer’s Stone: Soundtrack", or one of the Star Wars albums I own.  I just want to hear the music of John Williams.  So, I decided to flatten the music by removing the Album level:

[Image: Folder Structure Flattened]

Next, the track numbers that prefaced the song titles were making me twitch, so I removed them:

[Image: Folder Structure No Track Numbers]
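For anyone who would rather script the track-number cleanup than do it by hand, here is a minimal Python sketch.  It is not what I used; it assumes the ripped files are named with a leading one-to-three-digit track number followed by a space, dot, dash, or underscore (for example "01 Imperial March.mp3"), and the names strip_track_number and plan_renames are made up for this illustration.  Note that a song whose real title starts with a number, like "99 Red Balloons", would be trimmed too, so review the plan before renaming anything.

```python
import re
from pathlib import Path

# Assumed naming pattern: 1-3 digits, then one or more separators.
TRACK_PREFIX = re.compile(r"^\d{1,3}[\s._-]+")


def strip_track_number(file_name):
    """Return file_name without a leading track-number prefix."""
    path = Path(file_name)
    stem = TRACK_PREFIX.sub("", path.stem, count=1)
    # If nothing would be left of the title, keep the original name.
    return (stem + path.suffix) if stem else file_name


def plan_renames(artist_folder):
    """List (old_path, new_path) pairs for MP3s whose names would change.

    Nothing is renamed here; the caller decides what to do with the plan.
    """
    plan = []
    for path in sorted(Path(artist_folder).glob("*.mp3")):
        new_name = strip_track_number(path.name)
        if new_name != path.name:
            plan.append((path, path.with_name(new_name)))
    return plan
```

Calling old_path.rename(new_path) for each pair in the plan would apply the change once the list looks right.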

The next step was to resolve all of the "Unknown Artist", "Various Artists", and other folders that had been created over time, and move those music files into folders with a real artist name.  Some of these became obvious just from the name of the song – "Takin’ Care of Business" by Bachman-Turner Overdrive, for example.  Some of these, especially the classical pieces like "Violin Concerto No. 1", took a little more work to track down.  A lot of these required me to look at the media tags attached to the file, which we’ll address next.

Edit the media tags

Each audio file has a series of tags such as Artist, Album, Song Title, Track #, etc.  I originally used these to help reorganize the music into their proper artist folders, but many of these needed to be cleaned up themselves.  Why?  Because my Sansa Clip would organize the music by these tags.  Putting the files in a folder in Windows Explorer called "Hans Zimmer" wouldn’t be enough – the song’s Artist media tag would need to reflect that name.

Originally I thought I needed an application to allow me to modify these, but I discovered that Windows Explorer can do it.  When you select a music file in Windows Explorer, the window shows a series of controls at the bottom:

[Image: Media Tag Controls]

All you have to do to change these is click the tag you want to edit, type over it, and hit Enter:

[Image: Media Tags - Editing]

So, my first task was going through and cleaning up the "Contributing Artists", "Album artist", and "Title" for each of my music files.  After updating a few, I realized how tedious this was going to be.  I don’t have an enormous digital music collection, but it’s large enough that I figured I could write something to automate the process faster than just doing it manually.

So I did.

I had already organized each music file into a folder named after the artist responsible, and had renamed the files themselves to clean up the song title (several songs were named things like "Satisfied* [bonus tracks].mp3", so I cleaned them up to just be "Satisfied.mp3").  What if I could write a PowerShell script (the shiny new tool in my development toolbox) to rework the media tags for each file based on this information?

After consulting my good friend, Google, I found that other people were already managing media tags from PowerShell.  Using TagLib# (available from GitHub: https://github.com/mono/taglib-sharp ), it was very easy to walk through my entire music collection, updating media tags as I went:

[Reflection.Assembly]::LoadFrom( (Resolve-Path ".\taglib-sharp.dll") )

$BaseMusicPath = "C:\Users\Mark\Desktop\Music"

Get-ChildItem -Path $BaseMusicPath -Filter "*.mp3" -Recurse | ForEach-Object {
    Write-Host "Processing:" $_.FullName
    $CurrentMediaFile = [TagLib.File]::Create($_.FullName)
   
    # Set the song title to the file name
    $CurrentMediaFile.Tag.Title = [IO.FileInfo]$_.Name
   
    # Make the AlbumArtists match the Artists (contributing artists)
    $CurrentMediaFile.Tag.AlbumArtists = $CurrentMediaFile.Tag.Artists
   
    # Save the updated tags back to the file
    $CurrentMediaFile.Save()
}

The script looks through my music folders recursively for every MP3, opens it, sets the "Title" media tag to the file name and the "AlbumArtists" media tag to the "Artists" tag.  The latter corresponds to the "Contributing Artists" tag that appears in Windows Explorer.

The script worked like a charm.  It ran through my entire collection in a matter of seconds, and it only took me about half an hour to piece together.  Overall, I estimate it saved me at least an hour of drudgery, and gave me a great excuse to do something in PowerShell.

 

Maximize the number of songs

I still had a mix of WMA and MP3 files at this point.  In the course of updating the media tags, I noticed there was a pretty large gap between the average file size of a WMA file and the average file size of an MP3 – WMAs were much larger than the MP3s.  I found a free converter from KoyoteSoft that could process my entire music collection in batch – converting all WMA files to MP3 in place.  I didn’t think to capture before and after totals, but the size savings was tremendous: 30% smaller files were very common.

I actually put the media tag editing on pause to convert everything over to MP3s.  That is why the PowerShell script above only handles MP3s.  By the time I got around to writing it, EVERYTHING was an MP3.

 

Organize them into Playlists

The next challenge, and what ended up being the biggest one, was figuring out how I could create my own playlists.  To be fair, I had not tried this with the Sansa Clip before.  What got me thinking about it was that there was a "Playlists" option on the Clip, hinting that it was supported and that I only had to figure out how to do it.

My good friend, Google, turned out to be a good start down this path.  I found this post on the Sansa Clip forums that pointed to a couple of possible paths:

  1. If I browsed to the folder on the Clip in Windows Explorer, and right clicked on a folder or music file, I had an option for "Create Playlist".  I tried selecting multiple folders and created a playlist from them.  That dropped a .PLA file in the folder, and the player seemed to like it.  The weird thing was that this file was 0 bytes long.  Examining the file properties (again through Windows Explorer) revealed a tab called "References" that listed out all of the songs I just dropped in.  That tab would allow me to remove songs, or reorder them, but there did not appear to be any way to add new ones to an existing playlist.  If I added a new song, I’d have to reselect all of the other songs AND the new one to effectively update the playlist.  That would become unwieldy fast.
  2. The other option I found in this forum post talked about the M3U playlist file format.  This was billed as a simple text file format, which seemed much more likely to be manageable going forward.

I ended up consulting several other internet destinations to figure out what this file needed to look like, and how to get it to work on the Clip.  In addition to those posts, I did a fair amount of my own experimentation to arrive at the following procedure:

  1. Create a Windows-1252 (ANSI) text file and name it with a ".m3u" file extension.
  2. Add this as the very first line of the file: #EXTM3U
  3. Add one or more relative paths to the music files to be included in the playlist.  These would be relative to the "Music" folder on the Clip where the Artist folders would be housed:

        #EXTM3U
        Antonio Vivaldi\12 Violin Concerto, for violin, strings & continuo in E major (‘La Primave.mp3
        Antonio Vivaldi\Concerto For 2 Violins In A Minor, Op. 3 No. 8 – Allegro (Mouvement 1).mp3
        Antonio Vivaldi\Four Seasons- Spring Allegro.mp3
        Émile Waldteufel\Skaters Waltz.mp3
        Franz Liszt\Hungarian Rhapsody No 2.mp3
        Franz Schubert\Moment Musical.mp3
        Frédéric Chopin\Minute Waltz.mp3
        Georges Bizet\Carmen Suite 1 Les Toreadors.mp3

    This seemed to be the minimum contents needed to get the playlist to be recognized.

    For the most part, if I kept the files in a subfolder below the Artist name, the player would not recognize them.  My decision to flatten the music files to just one level down proved to be beneficial here.  I say "for the most part" because I did have one instance where a file was 2 levels down, in an "album" folder below the Artist folder, and the player found it.  I couldn’t explain why this worked, or why moving the other files up to the Artist folder caused them to suddenly be recognized by the player.  I thought it might have something to do with the length of the overall path, but as you can see from the above samples, some of the paths I have are quite long, and the player found those just fine.

  4. Switch the player to "MTP" mode.  For the Clip, this is found under Settings\USB Mode.  My player had been set to "Auto Detect".  At least two of the posts I found mentioned the other mode, "MSC", as being completely unusable for transferring playlist files to the player.  I have not tried changing this back to "Auto Detect" or trying "MSC", and then copying the playlist files over and seeing if they still worked.  I also didn’t dig into what these two modes are.  I had been working on the playlist issue for the better part of the week, and honestly, was just interested to see it resolved rather than exploring every nook and cranny.  Perhaps another day.
  5. Place this file in the root of the "Music" folder.  I tried a few other locations for the playlist files on the player, including the "Playlists" folder, but this was the only one where it worked.
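Steps 1 through 3 of the procedure above can be sketched in a few lines of Python.  This is only an illustration of the file format as I understand it from my experiments, not a tool I actually used: the function names build_m3u and write_m3u are invented for the example, and the CRLF line endings are an assumption (they are what Notepad would produce on Windows), since I never tested other line endings on the Clip.

```python
def build_m3u(entries):
    """Return the playlist text for a list of (artist_folder, file_name) pairs.

    The first line is the #EXTM3U header; each following line is a path
    relative to the player's "Music" folder, written as Artist\\File.
    """
    lines = ["#EXTM3U"]
    lines.extend(f"{artist}\\{file_name}" for artist, file_name in entries)
    # Assumption: Windows-style CRLF line endings.
    return "\r\n".join(lines) + "\r\n"


def write_m3u(playlist_path, entries):
    """Write the playlist as a Windows-1252 (ANSI) text file."""
    # "cp1252" is Python's codec name for Windows-1252.
    # newline="" stops Python from translating the CRLFs a second time.
    with open(playlist_path, "w", encoding="cp1252", newline="") as handle:
        handle.write(build_m3u(entries))
```

For example, write_m3u("Classical.m3u", [("Franz Liszt", "Hungarian Rhapsody No 2.mp3")]) would produce a two-line playlist.  A file name containing a character that Windows-1252 cannot represent will raise a UnicodeEncodeError rather than silently writing a different encoding, which seems like the safer failure given how picky the player is.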

At this point, assuming that the music files are already on the player, the "Playlists" option on the player will show the new playlist, and let you play from it.  I decided to go one step further with the playlists: not wanting to manage the playlist files by hand, I created a small WinForms application called "Playlist Forge" that lets me drag and drop individual music files, or entire folders, and constructs the playlist file for me.

Playlist Forge

If you drag an M3U file onto Playlist Forge, it opens it.

Dragging a single music file (MP3 or WMA) onto it adds it to the playlist, including the name of the file and the parent folder.  (Playlist Forge assumes the folder structure I mentioned previously, where the actual music files are in a folder named after the Artist.)

Dragging a folder onto Playlist Forge recursively finds all MP3 or WMA files, and includes them in the playlist, regardless of their depth.  Each entry still only includes the file name and the folder the file is actually in, but the utility digs down as deeply as needed in the folder structure to pull out all of the music files.
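The folder-drop behavior boils down to a recursive search that keeps only the last folder name and the file name.  Here is a minimal Python sketch of that idea; it is not the Playlist Forge source (which is a WinForms application), and the name collect_entries and the sorted output order are choices made just for this example.

```python
from pathlib import Path

# The two formats handled, compared case-insensitively.
AUDIO_EXTENSIONS = {".mp3", ".wma"}


def collect_entries(folder):
    """Return sorted (parent_folder_name, file_name) pairs for audio files.

    Files are found at any depth below folder, but each entry records only
    the folder the file sits in directly, not the full path.
    """
    entries = []
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            entries.append((path.parent.name, path.name))
    return sorted(entries)
```

Because only the immediate parent folder is kept, a file that still lives in an album subfolder ends up listed under the album name rather than the artist, which is one more reason the flattened Artist\Song layout matters.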

Once you have the right files in there, you hit "CTRL-S" to save it.  If you opened an M3U file originally, saving overwrites it.  If you just started dragging music files onto it, saving creates a new file called "NewPlaylist.m3u" on your desktop.

Finally, you can hit “CTRL-N” to clear the utility out and start a new playlist from scratch.

While this is definitely rough, it proved to be much faster to write this utility and use it than trying to pull all of the paths and files out manually.  It will also allow me to easily edit the files later, as I add music to my collection.

The utility – both the source and the compiled application – is available in the PlaylistForge.zip archive at http://tinyurl.com/MarkGilbertSource if you are interested.  (And yes, I did see that other people had built apps like this already, but this seemed like a fun little app to write.)

 

Sit back and enjoy the sweet sounds of victory

A lot of research and work for this, but after all of it I am much happier about the state of my music collection and the prospects for managing it going forward.

Posted January 8, 2013 | Powershell, Visual Studio/.NET

Meta Insanity 2 – Strings and Meta Tags revisited

In October of 2011, I posted about some funkiness with trying to embed an expression hole into the “content” attribute of a <meta> tag.  I dubbed it “Meta Insanity”.

This past week, at my functional language users group meeting (FLUNK),  we took a break from doing problems or talks on functional programming, and instead did lightning talks about something tech-related.  I built my talk around my October blog post.  The talk went great – everyone got into trying other combinations of markup to either 1) figure out why it was failing, or 2) figure out better ways to get around it than just appending empty strings.  What follows are some of the other things we discovered that worked and didn’t work.

For this list, “worked” means that a tag such as the following rendered the expression hole correctly:

<meta name="description1" content="<%=Me.SomeMetaValue%>" />

While “didn’t work” means that the leading angle bracket (<) was HTML encoded before the expression hole was evaluated, which led to rendered markup like this:

<meta name="description1" content="&lt;%=Me.SomeMetaValue%>" />

  1. Moving the meta tag into the <body> tag worked.
  2. Putting a different tag with an expression hole, such as <script>, or even fake ones like <blah>, worked.
  3. Removing the runat="server" attribute from the <head> tag worked for everything, including <meta> tags.
  4. Removing the double quotes from the markup, and adding them to the string that would be returned in the expression hole, worked.
  5. Adding attributes to the <head> tag didn’t work.
  6. Adding attributes to the <head> tag, and removing the runat="server" attribute, worked.

And probably the best find of the evening (and by “best” I mean “face-palm-funny”) was HTML encoding the double quotes in the source markup, like this:

<meta name="description6" content=&quot;<%=Me.SomeMetaValue%>&quot; />

The rendered result?  The quotes stayed encoded, but the expression hole was properly evaluated:

<meta name="description6" content=&quot;Blah blah&quot; />

So, what did we accomplish?  It looks like the combination of runat="server" and <meta> tags is the crux of this issue.  Meta tags will always be needed on a professional site, but I can’t remember the last time I needed to access the <head> tag from the server, so it seems like simply removing that attribute is now the cleanest way to get around this issue.

Posted May 12, 2012 | Visual Studio/.NET

Kinect in the Abstract: Working with the Sealed SkeletonData and JointsCollection classes

My latest side project involving the Kinect started to get a bit hairy.  The logic for what we were trying to do was at least an order of magnitude more complex than that of the Target Tracking system my colleagues and I built last year.  It functioned, but it was getting exponentially more difficult to add features to it, let alone debug it.

So, suffering from a lull in my regular project work over the holiday break, I decided to start building some unit tests for it.  If nothing else, having a solid test suite would allow me to regression-test the application whenever I monkeyed with the code, and THAT would enable some good-sized refactorings that were long overdue.  My first task, then, was to figure out how to mock out the data coming off the Kinect.  My first task quickly hit a wall.

The application uses the SkeletonData object available in the SkeletonFrameReady event.  My original event handler looked something like this:

void nui_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{

    List<SkeletonData> ActiveSkeletonsRaw;
    SkeletonFrame allSkeletons = e.SkeletonFrame;

    ActiveSkeletonsRaw = (from s in allSkeletons.Skeletons
                          where s.TrackingState == SkeletonTrackingState.Tracked
                          select s).ToList();

    this._MyManager.UpdatePositions(ActiveSkeletonsRaw);
}

The UpdatePositions() method would handle moving the objects around based on the new positions of the skeletons/joints, and that was the primary method I wanted to test.  I figured if I could create my own SkeletonData object, and pass that into UpdatePositions, I could test any scenario I wanted.  Unfortunately, the SkeletonData class is sealed, and there aren’t any public constructors on it.  So, I went the route of writing my own version of SkeletonData – one that I could create objects from, and would effectively function the same as SkeletonData:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.Research.Kinect.Nui;


public class SkeletonDataAbstraction : ISkeletonData
{
    public JointsCollectionAbstraction Joints { get; private set; }
    public Vector Position { get; set; }
    public SkeletonQuality Quality { get; set; }
    public int TrackingID { get; set; }
    public SkeletonTrackingState TrackingState { get; set; }
    public int UserIndex { get; set; }


    public SkeletonDataAbstraction() 
    {
        this.InitializeJoints();
    }
    public SkeletonDataAbstraction(Microsoft.Research.Kinect.Nui.SkeletonData RawData) : this()
    {
        foreach (Joint CurrentJoint in RawData.Joints)
        {
            this.UpdateJoint(CurrentJoint);
        }
        
        this.Position = RawData.Position;
        this.Quality = RawData.Quality;
        this.TrackingID = RawData.TrackingID;
        this.TrackingState = RawData.TrackingState;
        this.UserIndex = RawData.UserIndex;
    }

    private void InitializeJoints()
    {
        this.Joints = new JointsCollectionAbstraction();
        foreach (JointID CurrentJointID in Enum.GetValues(typeof(JointID)))
        {
            this.Joints.Add(new Joint()
                                        {
                                            ID = CurrentJointID,
                                            Position = new Vector() { X = 0.0f, Y = 0.0f, Z = 0.0f, W = 0.0f },
                                            TrackingState = JointTrackingState.NotTracked
                                        });
        }
    }

    public void UpdateJoint(Joint NewJoint)
    {
        this.Joints[NewJoint.ID] = new Joint() 
                                                { ID = NewJoint.ID,
                                                  Position = new Vector()
                                                                            { X = NewJoint.Position.X,
                                                                              Y = NewJoint.Position.Y,
                                                                              Z = NewJoint.Position.Z,
                                                                              W = NewJoint.Position.W
                                                                            },
                                                  TrackingState = NewJoint.TrackingState
                                                };
    }
}

When the class is instantiated, the Joints collection is also instantiated with a "blank" Joint object for every joint defined by the Kinect (the complete list is defined by the Microsoft.Research.Kinect.Nui.JointID enumeration).  Then, the UpdateJoint method is called to overwrite those blank joints with the real values.  I also use this method in the unit tests to precisely place the joints I am interested in, just before running a given test.

I thought I would end up needing to mock out portions of the class, so I created an interface for it as well:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.Research.Kinect.Nui;

public interface ISkeletonData
{
}

As it turns out, I didn’t need to mock anything out – I can just create SkeletonDataAbstraction objects, and pass them directly into UpdatePositions.  I decided to keep the interface around, just in case I later found something that required a mock.

I also needed to be able to construct a JointsCollection object (what the SkeletonData.Joints property is defined as), but that was also marked sealed with no public constructors.  So, I created a JointsCollectionAbstraction object for it:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
using Microsoft.Research.Kinect.Nui;
using System.Collections.ObjectModel;
using System.ComponentModel;

public class JointsCollectionAbstraction : List<Joint>, IEnumerable
{

    public Joint this[JointID i]
    {
        get
        {
            return this[(int)i];
        }
        set
        {
            this[(int)i] = value;
        }
    }


}

After putting these together, I rewrote my original application code using the new abstraction layer, to make sure I had captured everything I needed to:

void nui_SkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e) 
{

    List<SkeletonData> ActiveSkeletonsRaw;
    SkeletonFrame allSkeletons = e.SkeletonFrame;

    ActiveSkeletonsRaw = (from s in allSkeletons.Skeletons
                          where s.TrackingState == SkeletonTrackingState.Tracked
                          select s).ToList();

    List<SkeletonDataAbstraction> ActiveSkeletons;
    ActiveSkeletons = new List<SkeletonDataAbstraction>();
    foreach (SkeletonData CurrentSkeleton in ActiveSkeletonsRaw)
    {
        ActiveSkeletons.Add(new SkeletonDataAbstraction(CurrentSkeleton));
    }

    this._MyManager.UpdatePositions(ActiveSkeletons);
}

That worked like a charm.  With each SkeletonFrameReady event-raise, I copy the key pieces of information from the Kinect over to my own structures, and use those from that point on.  Now the task of writing tests around this could begin in earnest.  I wrote a "CreateSkeleton" method for my unit tests that would encapsulate setting one of these up:

private SkeletonDataAbstraction CreateSkeleton(SkeletonTrackingState NewTrackingState, int NewUserIndex)
{
    SkeletonDataAbstraction NewSkeleton;

    NewSkeleton = new SkeletonDataAbstraction();
    NewSkeleton.Position = new Vector();
    NewSkeleton.Quality = SkeletonQuality.ClippedBottom;
    NewSkeleton.TrackingID = NewUserIndex + 1;
    NewSkeleton.TrackingState = NewTrackingState;
    NewSkeleton.UserIndex = NewUserIndex;

    NewSkeleton.UpdateJoint(new Joint()
                                        {
                                            ID = JointID.HandLeft,
                                            Position = new Vector() { X = X_WHEN_HAND_MOVES_AWAY, Y = this._OriginalY, Z = this._OriginalZ, W = this._OriginalW },
                                            TrackingState = JointTrackingState.Tracked
                                        });
    NewSkeleton.UpdateJoint(new Joint()
                                        {
                                            ID = JointID.HandRight,
                                            Position = new Vector() { X = X_WHEN_HAND_MOVES_AWAY, Y = this._OriginalY, Z = this._OriginalZ, W = this._OriginalW },
                                            TrackingState = JointTrackingState.Tracked
                                        });

    // Other joints overwritten here...

    return NewSkeleton;
}

(Note, the values for X_WHEN_HAND_MOVES_AWAY, _OriginalY, _OriginalZ, and _OriginalW are merely floats, defined specific to the application.)

Now I could easily create a list of Skeletons to track, with joints positioned just so, and pass that structure into UpdatePositions.


After I had most of this built out, I found a couple of other posts from people doing essentially the same thing:

The first one is an interesting forum post where one of the Microsoft guys admits that declaring the SkeletonData and other classes Sealed was probably not the brightest idea.

Thankfully, the wall I hit ended up coming only up to my knees, so after a few bumps and bruises I was over it.

Posted January 10, 2012 | Microsoft Kinect, Visual Studio/.NET

With a little help from my Friends – TDD, Mocking, and InternalsVisibleTo

My current project is building a .NET library that will interface with multiple different web services.  Some of those services were not ready when I started the library, and I wanted to push myself further into mocking, so I wrote a .NET interface, and a wrapper class for each web service.  That allowed me to mock the services out, and simulate the response for a given request.  Once the web service became available, I’d implement that interface, and pass the requests through to the real services.

The goal here was to completely abstract the actual web service calls and responses from the user of the library.  However, in order for NUnit to be able to test those interfaces and other classes, they had to be declared Public.  That meant that someone actually using the library would see all of that structure in Intellisense – even when they would never use it, and would probably be confused by it.  This was the unfortunate tradeoff of TDD – or so I thought.

One of my colleagues, Doug, found a little assembly attribute called InternalsVisibleTo.  Applying this to your assembly (in the AssemblyInfo.vb/AssemblyInfo.cs file) makes its Friend types and members visible to the specified external assembly.  That allowed me to take the Public classes that I didn’t really want exposed to a consumer of the library and change their declaration to Friend (VB’s counterpart to C#’s "internal" declaration).  That meant that I could effectively expose those classes and other items to my test assembly, but hide them from every other assembly.  For more information on this attribute, check these links out:

http://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.internalsvisibletoattribute.aspx#Y1557
http://devlicio.us/blogs/derik_whittaker/archive/2007/04/09/internalsvisibleto-testing-internal-methods-in-net-2-0.aspx

(Please note, the second is an older post by Derik Whittaker, and at the time he wrote it this was only available to C# assemblies; that has since been remedied – you can use it in VB now as well.)

So I added a line to expose this to my test assembly.  Then I started systematically changing Publics to Friends, recompiling, and re-running the unit tests.  I ran into a few bumps along the way.

First, I was doing a lot of constructor injection in association with the mocking where the parameter-less constructors would set up the lower-level objects, but the other variants would allow me to pass those objects in (the passed-in objects would be my mock objects).  In the course of this rework, I ended up hiding a lot of those classes.  Initially, the constructor variants that used them were marked Public, which the compiler had a fit about – I couldn’t expose those classes via the constructor parameters because the classes were now marked as Friend, but the constructors were marked as Public.  Changing the constructor designations to Friend solved this.

Second, when I started changing the classes used by my mocking framework, Moq, I found that the InternalsVisibleTo line allowing my test assembly wasn’t enough.  I figured the Moq assembly needed to be explicitly allowed, too.  I tried the code-roulette approach first, without success.  Then I consulted the internets, which of course had the answer.  Andrey Shchekin had the solution – DynamicProxyGenAssembly2 (http://blog.ashmind.com/2008/05/09/mocking-internal-interfaces-with-moq/).  Yeah, that was totally going to be my next guess.  Uh-huh.

So, I have my mocking/TDD cake and get to eat it too.  The library footprint is nicely trimmed back, without much of the original clutter, but I can still unit-test to my heart’s content.  Many thanks to Doug for finding this little gem!

The only thing I wasn’t able to accomplish with this was hiding the SOAP web service structure.  I tried changing the Visual Studio-generated classes to Friend, but that started failing when I tried to call the service.  Perhaps there is another attribute that at least hides these from Intellisense.  A search for another day.

Posted December 7, 2011 | Visual Studio/.NET

Meta Insanity – Strings and Meta Tags

It was another one of those days.  And this time, I DID go home after I saw this in action.

On many of my past Web Forms sites, I’ve had to include a dynamic meta/description tag in the header.  The “dynamic” parts come in when the page being rendered is a product page, a recipe page, or something else where the data is drawn out of a database.

When I first started doing this, I tried something like this*:

<meta name="description1" content="<%=Me.SomeMetaValue%>" />

 

Where “SomeMetaValue” is a property of the page, or a reference to a shared value somewhere else in the solution.  This version fails because the inner left angle bracket gets encoded, thus ruining the server-side expression hole:

<meta name="description1" content="&lt;%=Me.SomeMetaValue%>" />

To get around this, I replaced the entire meta/description tag with an ASP.NET literal, and inserted the value I wanted on the server side.  That worked.

Recently, this issue came up again.  Doug had taken over one of the sites where I had done this, and he refused to believe that the data wouldn’t render correctly in the expression hole.  He tried it, and sure enough the < got encoded.  Doug kept at it, though, and found that if you removed the double quotes around the expression hole, it rendered the value correctly:

<meta name="description2" content=<%=Me.SomeMetaValue%> />
 

Unfortunately, this was no longer valid HTML:

<meta name="description2" content=Blah blah />

Doug remembered an old issue he and I troubleshot, documented at System.String has me in knots.  He modified the original attempt to prepend an empty string:

<meta name="description3" content="<%="" & Me.SomeMetaValue%>" />

 

That renders perfectly:

<meta name="description3" content="Blah blah" />

Putting the empty string after SomeMetaValue also works just fine:

<meta name="description3" content="<%=Me.SomeMetaValue & ""%>" />

What.

The.

Heck?!?

I had written off the encoded angle bracket to an overly eager web server (Cassini and IIS behaved the same in this case), but why in the world would tacking on an empty string force it to NOT be encoded and allow it to work?  Since we’re dealing with silly strings here, I followed the lesson learned in System.String has me in knots, and added .ToString onto the end of Me.SomeMetaValue:

<meta name="description4" content="<%=Me.SomeMetaValue.ToString%>" />

 

 

Aaaaaand, we’re back to encoding:

<meta name="description4" content="&lt;%=Me.SomeMetaValue.ToString%>" />

Sigh.  Oh Visual Studio.  Why must you taunt me so?

For a fully working – er, FAILING – sample of the above, check out the MetaInsanity.zip archive at http://TinyURL.com/MarkGilbertSource.

 

* I used “description1”, “description2”, etc. just for illustrative purposes for this blog post.  The production sites have this value as simply “description”.

October 19, 2011 | Visual Studio/.NET

One web.config to rule them all – eh, not so fast

Well, I thought I had the hard parts worked out.

In my previous post I threw out this little bit of hand-waving:

The hard part of this was not figuring out how to put 5 environments’ worth of values into a single web.config – that’s already a solved problem in my shop (and I’m sure you could come up with your own approach).  The hard part here was figuring out how to programmatically override the values that I would normally put into a web.config.

Well, as it turns out, changing out the values at runtime was the hard part here.  In fact, when it came to configuring ELMAH to generate email notifications, it was nearly insurmountable.

The mechanisms that we’ve developed to handle a "dynamic" web.config are all centered on the assumption that we can do the following (another gem I wrote in the last post):

At runtime, figure out the URL that the site is being executed under, and look up the values for that URL in the web.config.

And here’s where things fell apart.  You see,

  • In order to figure out the current URL, I needed a Request object to inspect.
  • The soonest I could get at the Request was the Application’s BeginRequest event.
  • The ELMAH module, however, was getting initialized before Application_Start, let alone BeginRequest.
  • Once initialized, ELMAH cached the now-incorrect settings, and provided no mechanism to force an update after the fact.

If I can’t override the settings when the application first starts up, how about forcing an override when the first error occurs?  There was an ELMAH event for that.  Unfortunately, being a module, the ELMAH email module ends up processing the error message AFTER the Request has already been processed, so it’s no longer available.

Rather than trying to come up with an even more elaborate (aka "convoluted") mechanism for saving off the current environment and then overriding ELMAH’s cache when the time came, I considered modifying the ELMAH source to initialize later, have a way to update the cached settings, etc.  Unfortunately, I had already sunk too much time trying to get this to work, so I opted for a much simpler solution – a simple email method of my own devising, invoked in the Application_Error handler.  It works, and it runs while the Request is still available, so I had a much easier time wiring it into the one web.config structure I had in place.  It’s not as robust as ELMAH, and certainly isn’t the way I wanted to go.  After using ELMAH for dozens of sites over the last few years, I had come to rely on it completely.  It felt quite odd to have to go without it.

The good news is that the ELMAH mail module was the only one to give me trouble.  The logging module CAN be updated in Application_BeginRequest, although even there I went a different route.  Instead of logging the errors to the file system, I opted to use the In-Memory provider.  Here is my revised web.config (only the ELMAH-specific bits are shown):

<?xml version="1.0"?>
<configuration>
  <configSections>

    <!-- 
            Error Logging Modules and Handlers (ELMAH) 
            Copyright (c) 2004-7, Atif Aziz.
            All rights reserved.
        -->
      <sectionGroup name="elmah">
        <section name="errorLog" requirePermission="false" type="Elmah.ErrorLogSectionHandler, Elmah"/>
        <section name="security" requirePermission="false"  type="Elmah.SecuritySectionHandler, Elmah"/>
        <section name="errorFilter" requirePermission="false" type="Elmah.ErrorFilterSectionHandler, Elmah"/>
      </sectionGroup>
  </configSections>

  
  <system.web>
    <httpHandlers>
      <add verb="POST,GET,HEAD" path="errors/report.axd" type="Elmah.ErrorLogPageFactory, Elmah"/>
    </httpHandlers>

    <httpModules>
      <add name="ErrorLog" type="Elmah.ErrorLogModule, Elmah"/>
      <add name="ErrorFilter" type="Elmah.ErrorFilterModule, Elmah"/>
    </httpModules>

  </system.web>


  <elmah>
    <errorLog type="Elmah.MemoryErrorLog, Elmah" />
    <security allowRemoteAccess="yes"/>
    <errorFilter>
      <test>
        <equal binding="HttpStatusCode" value="404" type="Int32"/>
      </test>
    </errorFilter>
  </elmah>

  
  <system.webServer>

    <modules runAllManagedModulesForAllRequests="true">
      <add name="ErrorLog" type="Elmah.ErrorLogModule, Elmah"/>
      <add name="ErrorFilter" type="Elmah.ErrorFilterModule, Elmah"/>
    </modules>

    <handlers>
      <add name="ElmahErrorReportingPage" verb="POST,GET,HEAD" path="errors/report.axd" type="Elmah.ErrorLogPageFactory, Elmah"/>
    </handlers>
  </system.webServer>

</configuration>

Since there was no longer any difference between the environments, I didn’t have to override anything.

The ActiveRecord initialization also seemed to go in as I expected.  Both ELMAH logging and ActiveRecord have been in place and working in our environments for a solid week now.

So lesson learned – get the code actually working before I blog on it.

October 17, 2011 | Castle ActiveRecord, Visual Studio/.NET

One web.config to rule them all

Most of the web-centric work I’ve done in my career and especially in the last four years has involved developing sites that are designed to be deployed to multiple environments: my local workstation, our internal Dev and Staging servers, and up to three environments at the client.  One of my strategies for managing the differences in file paths, email recipients, database connection strings, etc. among all of those environments is to push anything that changes from one environment to another into the web.config.  Then, I have a separate web.config for each environment, and I rename (or have the client’s technical staff rename) the appropriate file for a given environment.

The Setup

That system has worked well for years.  That is, until a few weeks ago when I was assigned to a new client who insists that there only be a single web.config that covers all environments.  Doing this allows them to simply copy the files from one environment to the next wholesale, and eliminates the need to rename anything.  The other developers who have worked with this client for a while have created a couple of different frameworks for implementing this requirement, but they boil down to the same basic approach:

1) Put all values for all environments into the web.config, but tie them to the corresponding URL for each environment.

2) At runtime, figure out the URL that the site is being executed under, and look up the values for that URL in the web.config.

My colleagues affectionately refer to this scheme as “one web.config to rule them all”.
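The two steps above can be sketched as a lookup keyed on the request's host name.  This is only an illustration in C#: the host names, the `EnvironmentSettings` type, and the hard-coded values are all hypothetical, and in the scheme described here the values would be read from the single web.config rather than from a dictionary literal.

```csharp
using System;
using System.Collections.Generic;

public sealed class EnvironmentSettings
{
    public string ConnectionString { get; set; }
    public string ErrorMailTo { get; set; }
}

public static class EnvironmentLookup
{
    // Step 1: every environment's values, tied to the URL (host) it runs under.
    private static readonly Dictionary<string, EnvironmentSettings> ByHost =
        new Dictionary<string, EnvironmentSettings>(StringComparer.OrdinalIgnoreCase)
        {
            { "localhost", new EnvironmentSettings {
                ConnectionString = "Database=MyTest;Server=(local);",
                ErrorMailTo = "me@blah.com" } },
            { "staging.blah.com", new EnvironmentSettings {
                ConnectionString = "Database=MyTest;Server=stagingsql,1433;",
                ErrorMailTo = "staging-errors@blah.com" } }
        };

    // Step 2: at runtime, use the URL of the current request to pick a set of values.
    public static EnvironmentSettings ForUrl(Uri requestUrl)
    {
        EnvironmentSettings settings;
        if (ByHost.TryGetValue(requestUrl.Host, out settings))
        {
            return settings;
        }
        throw new InvalidOperationException("No settings configured for host " + requestUrl.Host);
    }
}

public static class Program
{
    public static void Main()
    {
        EnvironmentSettings settings =
            EnvironmentLookup.ForUrl(new Uri("http://staging.blah.com/products/1"));
        Console.WriteLine(settings.ErrorMailTo);
    }
}
```

Note that the lookup needs a request URL, so it cannot run before the first request arrives.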

The Challenge

A couple of the key components that I incorporate into the sites I work on are ELMAH and Castle ActiveRecord.  Naturally, since my current task involves building three brand new sites, I wanted to drop these in from the beginning.  The challenge was how to use them given this client’s requirement.  The hard part of this was not figuring out how to put 5 environments’ worth of values into a single web.config – that’s already a solved problem in my shop (and I’m sure you could come up with your own approach).  The hard part here was figuring out how to programmatically override the values that I would normally put into a web.config.

The Solution – ELMAH

Let’s start with ELMAH.  Normally, I’d have these sections in my web.config (only the ELMAH-specific portions are shown here):

<configuration>

  <configSections>
    <!-- 
            Error Logging Modules and Handlers (ELMAH) 
            Copyright (c) 2004-7, Atif Aziz.
            All rights reserved.
        -->
    <sectionGroup name="elmah">
      <section name="errorLog" requirePermission="false" type="Elmah.ErrorLogSectionHandler, Elmah"/>
      <section name="errorMail" requirePermission="false" type="Elmah.ErrorMailSectionHandler, Elmah"/>
      <section name="security" type="Elmah.SecuritySectionHandler, Elmah"/>
      <section name="errorFilter" type="Elmah.ErrorFilterSectionHandler, Elmah"/>
    </sectionGroup>
  </configSections>

  
  <system.web>
    <httpModules>
      <add name="ErrorLog" type="Elmah.ErrorLogModule, Elmah"/>
      <add name="ErrorMail" type="Elmah.ErrorMailModule, Elmah"/>
      <add name="ErrorFilter" type="Elmah.ErrorFilterModule, Elmah"/>
     </httpModules>
  </system.web>

  <elmah>
    <errorLog type="Elmah.XmlFileErrorLog, Elmah" logPath="~/bin/Logs"/>
    <errorMail from="me@blah.com"
               to="blah@blah.com"
               subject="Test Error"
               smtpServer="blah.com"/>
    <security allowRemoteAccess="yes"/>
    <errorFilter>
      <test>
        <equal binding="HttpStatusCode" value="404" type="Int32"/>
      </test>
    </errorFilter>
  </elmah>

</configuration>

The things that differ per environment are:

*) The “logPath” attribute of the elmah/errorLog tag

*) The “to”, “subject”, and “smtpServer” properties of the “elmah/errorMail” tag.

My colleague, Joel, found that you can write a class that inherits from Elmah.ErrorMailModule, override the settings there, and use that in the httpModules block.  First, the class:

Public Class ElmahMailExtension
    Inherits Elmah.ErrorMailModule

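    Protected Overrides Function GetConfig() As Object
        ' Start from the <errorMail> settings in web.config, then overwrite the
        ' values that differ per environment (the indexer relies on late binding).
        Dim o As Object = MyBase.GetConfig()
        o("smtpServer") = "mail.blah.com"
        o("subject") = String.Format("Blah message at {0}", Now.ToLongTimeString())
        o("to") = "me@blah.com"
        Return o
    End Function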
End Class

And the web.config modification:

<configuration>
 
  <system.web>

    <httpModules>
      <add name="ErrorLog" type="Elmah.ErrorLogModule, Elmah"/>
      <add name="ErrorMail" type="MvcApplication1.ElmahMailExtension"/>
      <add name="ErrorFilter" type="Elmah.ErrorFilterModule, Elmah"/>
     </httpModules>

  </system.web>

</configuration>

Simple.

Overriding the logging settings requires a slightly different tack.  Instead of inheriting from Elmah.ErrorLogModule, I created a class that inherits from Elmah.XmlFileErrorLog:

Imports System.Web.Hosting

Public Class ElmahLogExtension
    Inherits Elmah.XmlFileErrorLog

    Public Sub New(ByVal config As IDictionary)
        MyBase.New(HostingEnvironment.MapPath("~/bin/Logs2"))
    End Sub

    Public Sub New(ByVal logPath As String)
        MyBase.New(logPath)
    End Sub

End Class

I couldn’t find a convenient collection to change values in, so I cheated.  Using JustDecompile, I looked at what the two constructors were doing.  They basically just manipulate the log path passed in.  So, I leave the New(string) variant alone, and modify the New(IDictionary) variant to ignore the incoming “config” parameter, and substitute the path that I want to use.  One of the things that I noticed the XmlFileErrorLog constructor doing was replacing paths with leading “~/” with the full path on the file system.  Full log paths won’t require this.

The Solution – ActiveRecord

Here is a common ActiveRecord configuration for me (just the ActiveRecord-relevant parts are shown here):

<configuration>

  <configSections>
    
    <section name="activerecord" type="Castle.ActiveRecord.Framework.Config.ActiveRecordSectionHandler, Castle.ActiveRecord" />
    <section name="nhibernate" type="System.Configuration.NameValueSectionHandler, System, Version=1.0.5000.0,Culture=neutral, PublicKeyToken=b77a5c561934e089" />
    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />

  </configSections>

  <activerecord isDebug="false" threadinfotype="Castle.ActiveRecord.Framework.Scopes.HybridWebThreadScopeInfo, Castle.ActiveRecord">
    <config database="MsSqlServer2005" connectionStringName="MyTestDB">
    </config>
  </activerecord>

  <log4net>
  ...

  </log4net>

  <nhibernate>
  ... 
  </nhibernate>

  <connectionStrings>
    <add name="MyTestDB" connectionString="Database=MyDBName;Server=MyServer,1433;User ID=MyUser;Password=MyPassword;" />
  </connectionStrings>

</configuration>

My Application_Start method in Global.asax initializes ActiveRecord:

Sub Application_Start()
    AreaRegistration.RegisterAllAreas()

    Dim MyConfig As IConfigurationSource = Castle.ActiveRecord.Framework.Config.ActiveRecordSectionHandler.Instance
    Dim MyAssemblies As System.Reflection.Assembly() = New System.Reflection.Assembly() {System.Reflection.Assembly.Load("MvcApplication1")}
    ActiveRecordStarter.Initialize(MyAssemblies, MyConfig)
    AddHandler Me.EndRequest, AddressOf Application_EndRequest

    RegisterRoutes(RouteTable.Routes)
End Sub

The primary piece that I need to override is the connection string.  My first attempts were in a similar vein as with ELMAH – in this case create a class that inherits from Castle.ActiveRecord.Framework.Config.ActiveRecordSectionHandler, and use that in the web.config.  However, I found an even easier way – simply use a different IConfigurationSource object in the call to ActiveRecordStarter.Initialize – one that is constructed programmatically.  As it turns out, there is even a built-in class to do this – InPlaceConfigurationSource:

Sub Application_Start()
    AreaRegistration.RegisterAllAreas()

    Dim MyConfig As InPlaceConfigurationSource = InPlaceConfigurationSource.Build(DatabaseType.MsSqlServer2005, "Database=MyTest;Server=blahsqlsrvr,1433;User ID=blah;Password=blah;")
    MyConfig.ThreadScopeInfoImplementation = GetType(Framework.Scopes.HybridWebThreadScopeInfo)

    Dim MyAssemblies As System.Reflection.Assembly() = New System.Reflection.Assembly() {System.Reflection.Assembly.Load("MvcApplication1")}
    ActiveRecordStarter.Initialize(MyAssemblies, MyConfig)
    AddHandler Me.EndRequest, AddressOf Application_EndRequest

    RegisterRoutes(RouteTable.Routes)
End Sub

Setting the ThreadScopeInfoImplementation property allows me to reproduce the “threadinfotype” property of the <activerecord> block in the web.config.

Using this allows me to completely dump the <activerecord> block and the configSection/section that references it:

<configuration>

  <configSections>
    
    <!--<section name="activerecord" type="Castle.ActiveRecord.Framework.Config.ActiveRecordSectionHandler, Castle.ActiveRecord" />-->
    <section name="nhibernate" type="System.Configuration.NameValueSectionHandler, System, Version=1.0.5000.0,Culture=neutral, PublicKeyToken=b77a5c561934e089" />
    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />

  </configSections>

  <!--<activerecord isWeb="true" isDebug="false" threadinfotype="Castle.ActiveRecord.Framework.Scopes.HybridWebThreadScopeInfo, Castle.ActiveRecord">
    <config database="MsSqlServer2005" connectionStringName="MyTestDB">
    </config>
  </activerecord>-->

  ...  
</configuration> 

The Conclusion

I’m not sold on the “one web.config to rule them all” approach to maintaining environment settings, but at least I don’t have to give up my favorite frameworks as a result.

September 28, 2011 | Castle ActiveRecord, Visual Studio/.NET

Target-Tracking with the Kinect, Part 3 – Target Tracking Improved, and Speech Recognition

In Part 1 of this series, I went through the prerequisites for getting the Kinect/Foam-Missile Launcher mashup running.  In Part 2, I walked through the core logic for turning the Kinect into a target-tracking system, but I ended it talking about some major performance issues.  In particular, commands to the launcher would block updates to the UI, which meant the video and depth feeds were very jerky. 

In this third and final part of the series, I’ll show you the multi-threading scheme that solved this problem.  I’ll also show you the speech recognition components that allowed the target to say the word "Fire" to actually get a missile to launch. 

What did you say?

We had tried to implement the speech recognition feature by following the "Audio Fundamentals" tutorial.  That code looked like it SHOULD work, but there were a couple of differences between the tutorial app and ours: the tutorial’s was a C# console application, while ours was a VB WPF application.  As it turns out, those two differences made ALL the difference.

For the demo, Dan (the host) mentions the need for the MTAThread() attribute on the Main() routine in his console app.  Since our solution up to this point was VB, it looked like we would need this.  I tried adding that to every place that didn’t generate a compile error, but nothing worked – the application kept throwing this exception when it fired up:

Unable to cast COM object of type ‘System.__ComObject’ to interface type ‘Microsoft.Research.Kinect.Audio.IMediaObject’. This operation failed because the QueryInterface call on the COM component for the interface with IID ‘{D8AD0F58-5494-4102-97C5-EC798E59BCF4}’ failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).

Stack Trace:
       at System.StubHelpers.StubHelpers.GetCOMIPFromRCW(Object objSrc, IntPtr pCPCMD, Boolean& pfNeedsRelease)
       at Microsoft.Research.Kinect.Audio.IMediaObject.ProcessOutput(Int32 dwFlags, Int32 cOutputBufferCount, DMO_OUTPUT_DATA_BUFFER[] pOutputBuffers, Int32& pdwStatus)
       at Microsoft.Research.Kinect.Audio.KinectAudioStream.RunCapture(Object notused)
       at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
       at System.Threading.ThreadHelper.ThreadStart(Object obj)

I decided to try a different tack.  I wrote a C# console app, and copied all of Dan’s code into it (removing the Using statements and initializing the variables manually to avoid scoping issues).  That worked right out of the gate.  Since we were very short on time (this was two days from the demo at this point) I decided to port our application to C#, then incorporated the speech recognition pieces.

First, the "setup" logic was wrapped into a method called "ConfigureAudioRecognition" (I pretty much copied this right from the tutorial).  That method was invoked in the Main window’s Loaded event, on its own thread.  In addition to initializing the objects and defining the one-word grammar ("Fire"), this adds an event handler for the recognizer engine’s SpeechRecognized event:

private void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    if (this._Launcher != null && 
        this._IsAutoTrackingEngaged &&
        e.Result.Confidence > 0.95) { this.FireCannon(); }
}

The command to launch a missile is only given if the Launcher object is defined, the app is in "auto-track" mode, and the confidence level of the recognition engine is greater than 95%.  This last check is an amusing one.  Before I included this check, I would read a sentence that happened to contain some word with the letter "f", like "if", and the missile would launch.  Inspecting the Confidence property, I found that this only had a value in the 20-30% range.  When I said "Fire", this value was 96-98%.  The confidence check helps tremendously, but it’s still not perfect.  Words like "fine" can fool it.  It’s much better than having it fire with every "f", though.


Take a number

Doug, Joshua, and I discussed some solutions to the UI updates earlier in the week, and the most promising one looked like using BackgroundWorker (BW) to send a command to the launcher asynchronously.  That was relatively easy to drop into the solution, but I almost immediately hit another problem.  The launcher was getting commands sent to it much more frequently than my single BW could handle, and I started getting runtime exceptions to the effect of "process is busy, go away".  I found an IsBusy property on the BW that I could check to see if it had returned yet, but that meant that I would have to wait for it to come back before I could send it another command – basically the original blocking issue, but one step removed.

I briefly toyed with the idea of spawning a new thread with every command, but because they were all asynchronous there was no way to guarantee that they would be completed in the order I generated them in.  Left-left-fire-right looks a lot different than fire-right-left-left.  What I really needed was a way to stack up the requests, and force them to be executed synchronously.  What I found was an unbelievably perfect solution from Matt Valerio with his post titled "A Queued BackgroundWorker Using Generic Delegates".  As the title suggests, he wrote a class called “QueuedBackgroundWorker” that would add another BW to a queue, and then pop them off and process them in order.  This was EXACTLY what I needed.  This was also the most mind-blowing use of lambda expressions I’ve ever seen: you pass entire functions to run as the elements on the queue which get executed when that element is popped off the queue.
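The queueing idea can be sketched independently of Matt Valerio’s class.  The following is not his QueuedBackgroundWorker; it is a minimal stand-in built on BlockingCollection (an assumption of this sketch, as is the `CommandQueue` name) that shows the same two properties: the caller never blocks, and commands run one at a time in the order they were queued.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CommandQueue : IDisposable
{
    // BlockingCollection is backed by a FIFO queue by default.
    private readonly BlockingCollection<Action> _commands = new BlockingCollection<Action>();
    private readonly Task _worker;

    public CommandQueue()
    {
        // A single long-running consumer drains the queue, so commands
        // execute sequentially, in the order they were added.
        _worker = Task.Factory.StartNew(
            () =>
            {
                foreach (Action command in _commands.GetConsumingEnumerable())
                {
                    command();
                }
            },
            TaskCreationOptions.LongRunning);
    }

    // Returns immediately; the calling (UI) thread is not held up by a slow command.
    public void Enqueue(Action command)
    {
        _commands.Add(command);
    }

    // Stops accepting commands and waits for the queued ones to finish.
    public void Dispose()
    {
        _commands.CompleteAdding();
        _worker.Wait();
        _commands.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        var executed = new List<string>();
        using (var queue = new CommandQueue())
        {
            queue.Enqueue(() => executed.Add("left"));
            queue.Enqueue(() => executed.Add("left"));
            queue.Enqueue(() => executed.Add("fire"));
            queue.Enqueue(() => executed.Add("right"));
        }
        Console.WriteLine(string.Join("-", executed)); // prints "left-left-fire-right"
    }
}
```

In this sketch a command that throws would fault the worker and surface from Dispose(); a fuller version would catch and report exceptions per command.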

I added a small class called "CannonVector" that would roll up a direction (up, down, left, or right) and a number of steps.  Then, I created two methods – FireCannon() and MoveCannon() that would now wrap my calls to the launcher methods that Matt Ellis wrote (see Part 2 of this series):

private void FireCannon()
{
    QueuedBackgroundWorker.QueueWorkItem(
        this._Queue,
        new CannonVector
        {
            DirectionRequested = CannonDirection.Down,
            StepsRequested = 0
        },
        args =>
        {
            this._Launcher.Fire();
            return (CannonVector)args.Argument;
        },
        args => { }
    );
}


private void MoveCannon(CannonDirection NewDirection, int Steps)
{
    QueuedBackgroundWorker.QueueWorkItem(
        this._Queue,
        new CannonVector
        {
            DirectionRequested = NewDirection,
            StepsRequested = Steps
        },
        args =>
        {
            CannonVector MyCannonVector;
            MyCannonVector = (CannonVector)args.Argument;

            switch (MyCannonVector.DirectionRequested)
            {
                case CannonDirection.Left:
                    this._Launcher.MoveLeft(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Right:
                    this._Launcher.MoveRight(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Up:
                    this._Launcher.MoveUp(MyCannonVector.StepsRequested);
                    break;
                case CannonDirection.Down:
                    this._Launcher.MoveDown(MyCannonVector.StepsRequested);
                    break;
            }
            return new CannonVector
            {
                DirectionRequested = MyCannonVector.DirectionRequested,
                StepsRequested = MyCannonVector.StepsRequested
            };
        },
        args => { }
    );
}

Cool, huh?

With this in place, everything was smooth again – launcher movement and UI updates, alike.

And there was much rejoicing.

So there you have it.  Full source code for this solution can be found in the "KinectMissileLauncher.zip" archive here: http://tinyurl.com/MarkGilbertSource.  Happy hunting!

September 10, 2011 | Microsoft Kinect, Visual Studio/.NET

Target-Tracking with the Kinect, Part 2 – Target Tracking

In Part 1 of this series I laid out the prerequisites.  Now we’ll get into how to turn the Kinect into a tracking system for the cannon.

Manual Targeting

As I mentioned in Part 1, one of the pieces to this puzzle was already written for us – a .NET layer around the launcher.  This layer was provided by Chris Smith in his Being an Evil Genius with F# and .NET post.  He links to the source code at the very end of the post, and it includes several projects.  We ended up using the RocketLib\RocketLauncher_v0.5.csproj project.

So, now we had a class through which we could give commands to the launcher, such as

Me._Launcher.MoveLeft(5)
Me._Launcher.MoveDown(10)
Me._Launcher.Fire()

Where “Me._Launcher” was an object of type RocketLib.RocketLauncher.  The numbers being passed to the “Move” commands are the number of times to move the launcher turret.  The unit of “time” or “step” (as we came to refer to it) seemed to translate into a little less than half a degree of rotation (either left/right or up/down).

Armed with this knowledge (see what I did there?), we were able to whip together a little WPF interface that had five buttons on it – Up, Down, Left, Right, and Fire – that controlled the launcher manually.  That became the “Manual” mode.  The “Auto-track” mode, where the Kinect would control the launcher, would come next.

Auto-Targeting

Now we started going through the Kinect SDK Quickstart video tutorials, produced by Microsoft and hosted by Dan Fernandez.  To begin, we wanted to get to the raw position data (X, Y, and Z) from the camera.  We ended up compressing the first four tutorials (“Installing and Using the Kinect Sensor”, “Setting up the Development Environment”, “Skeletal Tracking Fundamentals”, and “Camera Fundamentals”) into a Friday to get ramped up as quickly as possible.

In “Skeletal Tracking Fundamentals”, Dan explains that the Kinect tracks skeletons, not entire bodies.  Each skeleton has 20 different joints, such as palms, elbows, head, shoulders, etc.  We decided to select the “ShoulderCenter” joint as our target.

Next, we added labels for the X, Y, and Z positions of the ShoulderCenter joint to the app, and then started moving around the room in front of the Kinect, seeing how the values changed.  The values are given in meters, with X and Y being 0 when you’re directly in front of the depth camera.  These values are updated in the SkeletonFrameReady event.

Now, the fun could really begin.  We decided to focus on left/right movement of our target, so the Y value is not used in the app at all.

We also decided that since the launcher had a real physical limitation as to how fast it could move, we couldn’t give it too many commands at a time.  The Kinect sends data 30 times a second, so we decided to sample the data twice a second (every 15 frames).

Our first attempt at this was very complicated and clunky, and didn’t work well unless you were at a magical distance from the Kinect (basically we threw enough magic numbers into the equation until it worked for that one distance).  We really ran into problems when we tried to extend that to work for any depth.

It was Doug that hit upon the idea of calculating the angle to turn the launcher as the arc tangent of X/Z as opposed to what we had been doing (the number of steps).  That did two things for us – first, the angle approach was correctly taking the depth information (Z measurement) into account, and second, it meant we only had to store the last known position of the launcher (measured as a number of steps, either positive or negative, with 0 being straight ahead).  If we knew the last position, and we knew where we had to move to, we could swivel the launcher accordingly.

Private Sub nui_SkeletonFrameReady(ByVal sender As Object, ByVal e As SkeletonFrameReadyEventArgs)
    Dim allSkeletons As SkeletonFrame = e.SkeletonFrame
    Dim NewCannonX, DeltaX As Integer
    Me._FrameCount += 1

    'get the first tracked skeleton
    Dim skeleton As SkeletonData = ( _
        From s In allSkeletons.Skeletons _
        Where s.TrackingState = SkeletonTrackingState.Tracked _
        Select s).FirstOrDefault()

    ' FirstOrDefault returns Nothing when no skeleton is being tracked in this frame
    If skeleton Is Nothing Then Return

    Dim ShoulderCenter = skeleton.Joints(JointID.ShoulderCenter)

    Dim scaledJoint = ShoulderCenter.ScaleTo(320, 240)
    Me.UpdateCrossHairs(scaledJoint.Position.X, scaledJoint.Position.Y, scaledJoint.Position.Z)

    Me.HorizontalPosition.Content = ShoulderCenter.Position.X
    Me.VerticalPosition.Content = ShoulderCenter.Position.Y
    Me.DepthPosition.Content = ShoulderCenter.Position.Z

    Dim NormalizedX As Integer = CType(ShoulderCenter.Position.X * 10, Integer)
    Dim AbsoluteX As Integer = Math.Abs(NormalizedX)

    If (Me._IsAutoTrackingEngaged) Then
        If (ShoulderCenter.Position.Z > 0) Then
            ' Math.Atan2 returns the angle in radians; the multipliers of 100 * 1.6 convert it into steps for the cannon
            NewCannonX = Math.Atan2(ShoulderCenter.Position.X, ShoulderCenter.Position.Z) * 100 * 1.6
            DeltaX = Math.Abs(NewCannonX - Me._LastCannonX)
            If (NewCannonX < Me._LastCannonX) Then
                Me._Launcher.MoveRight(DeltaX)
            Else
                Me._Launcher.MoveLeft(DeltaX)
            End If
            Me._LastCannonX = NewCannonX
            Me._NetCannonX = NewCannonX
        End If
    End If
End Sub

With this logic in place, the tracking became fairly good, regardless of the distance between the target and the Kinect.
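The angle calculation can be looked at in isolation.  The sketch below is a C# restatement of the arctangent step, not the project's code: the `CannonMath` name is hypothetical, and the steps-per-radian factor is simply the 100 * 1.6 multiplier from the handler above.

```csharp
using System;

public static class CannonMath
{
    // Steps per radian, taken from the 100 * 1.6 multiplier in the event handler.
    public const double StepsPerRadian = 100 * 1.6;

    // Absolute launcher position (in steps, 0 = straight ahead) for a target
    // that is x meters to the side and z meters away from the Kinect.
    public static int TargetSteps(double x, double z)
    {
        return (int)Math.Round(Math.Atan2(x, z) * StepsPerRadian);
    }

    // Signed number of steps to move from the last known position to the new one.
    public static int StepsToMove(int lastSteps, int targetSteps)
    {
        return targetSteps - lastSteps;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same sideways offset (0.5 m) at two depths: the farther target
        // needs a smaller swivel, which is what the Z term buys.
        int near = CannonMath.TargetSteps(0.5, 2.0); // 39 steps
        int far = CannonMath.TargetSteps(0.5, 4.0);  // 20 steps
        Console.WriteLine(near + " " + far + " " + CannonMath.StepsToMove(near, far));
    }
}
```

Because only the last absolute position is stored, each new reading turns into one relative move rather than an accumulated correction.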

Assumptions Uncovered

Since there really wasn’t any feedback that the launcher could give us about its current position, this logic makes a couple of major assumptions about the world.  First, the Kinect and the launcher have to be pointed straight ahead to begin with, and second, the Kinect needs to remain pointing ahead.

We uncovered the first assumption when the launcher stopped responding to commands to move right.  We could move it to the left, but not to the right.  We fired up the application that comes with it, and discovered a “Reset” button that caused the launcher to swivel all the way to one side, then to a “center” point.  This center point was actually denoted by a raised arrow on the launcher’s base – something I had not seen up to this point.  After we reset it, it would move left and right just fine.  As it turns out, the launcher can’t move 360 degrees indefinitely – it has definite bounds.  The reset function moved it back to center to maximize the left/right motion.

After we discovered that, I would jump out to that app to reset the launcher, and then I had to shut it down again before I could use ours (two apps couldn’t send commands to the launcher – in fact we got runtime errors if we tried to run both apps at the same time).  After a while that got old, so we included a reset of our own.  Since we knew the launcher’s current position, we’d just move in the opposite direction that amount.  We added a Reset button to our own app, and also called the same method when the app was put back to Manual tracking and when it was shut down.
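A rough C# sketch of that reset, assuming the app keeps a running net position for every move it sends.  The `TrackedCannon` wrapper and its sign convention (positive means left of center) are hypothetical; the two delegates stand in for the launcher's MoveLeft and MoveRight methods.

```csharp
using System;
using System.Collections.Generic;

public sealed class TrackedCannon
{
    private readonly Action<int> _moveLeft;
    private readonly Action<int> _moveRight;

    // Net position in steps; 0 is straight ahead, positive is left of center.
    public int Position { get; private set; }

    public TrackedCannon(Action<int> moveLeft, Action<int> moveRight)
    {
        _moveLeft = moveLeft;
        _moveRight = moveRight;
    }

    public void MoveLeft(int steps)
    {
        _moveLeft(steps);
        Position += steps;
    }

    public void MoveRight(int steps)
    {
        _moveRight(steps);
        Position -= steps;
    }

    // Undo the net movement: travel the recorded amount in the opposite direction.
    public void Reset()
    {
        if (Position > 0)
        {
            MoveRight(Position);
        }
        else if (Position < 0)
        {
            MoveLeft(-Position);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var log = new List<string>();
        var cannon = new TrackedCannon(
            steps => log.Add("left " + steps),
            steps => log.Add("right " + steps));

        cannon.MoveLeft(30);
        cannon.MoveRight(10);
        cannon.Reset();

        Console.WriteLine(string.Join(", ", log)); // prints "left 30, right 10, right 20"
    }
}
```

Like the tracking logic itself, this only works if the launcher really started at center, since nothing here reads the hardware's actual position.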

We uncovered the second assumption in a rather amusing way.  During one of our tests we noticed the cannon was constantly aiming off to Doug’s (our target at the time) right.  He could move left or right, but the launcher was always off.  He happened to look up and noticed that the Kinect had been bumped, so it wasn’t pointing directly ahead any more.  As a result, the camera was looking off to one side and all of its commands were off.  After that, we were much more careful about checking the Kinect’s alignment, and not bumping it.

Some fun to be had

Early on we had thought up a “fun” piece of icing on this electronic cake.  What if we took the video image from the camera, and superimposed crosshairs on it?  We could literally float an image with a transparent background over the image control on the form.  If we could get the scaling right, it could track on top of the user’s ShoulderCenter joint.

And we did.  This is turned on using the “Just for Mike” button at the bottom of the app.  During the agency meeting demo, I had walked through the basic tracking, using Mike (our President) as the target, and explained about the video and depth images.  Then – very dramatically – I “noticed” the screen and turned to Doug (who was running the computer) – “uh, Doug?  I think we’re missing something.”  At which point he hit the button to add the crosshairs to the video image. “There we go!  That’s better.”  Mike got a good laugh out of it, as did most of the rest of the audience.  Fun?  Check!

Beyond the fun, though, I thought it was cool that we could merge the video and depth information to such great effect.  Between having the launcher track you, and seeing the crosshairs on your chest – it’s downright eerie.
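
The scaling step for the crosshairs can be sketched roughly as follows. This is a Python illustration rather than the app's code, and it assumes the joint position has already been expressed in the pixel coordinates of the source video frame; `crosshair_top_left` and its parameters are hypothetical names.

```python
def crosshair_top_left(joint_x, joint_y, source_size, control_size, crosshair_size):
    """Return where to place the top-left corner of the crosshair image so that
    its center sits on the joint.

    joint_x, joint_y -- joint position in source-frame pixels (assumed)
    source_size      -- (width, height) of the video frame
    control_size     -- (width, height) of the image control on the form
    crosshair_size   -- (width, height) of the crosshair image
    """
    # Scale from video-frame pixels to on-screen control pixels.
    scale_x = control_size[0] / source_size[0]
    scale_y = control_size[1] / source_size[1]
    center_x = joint_x * scale_x
    center_y = joint_y * scale_y
    # Shift by half the crosshair size so the image is centered on the joint.
    return (center_x - crosshair_size[0] / 2, center_y - crosshair_size[1] / 2)
```

For example, a joint in the middle of a 640x480 frame, shown in a 320x240 control with a 50x50 crosshair image, puts the image's top-left corner at (135, 95).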

Performance Issues

So, by this point, we had launcher tracking, both video and depth images refreshing 30 times a second, and crosshairs.

And everything was running on the same thread.

Yeah.  We now had some performance issues to solve.

When the launcher moved at all, and especially when it fired (which took 2-3 seconds to power up and release), the images would completely freeze, waiting for the launcher to complete.  The easy solution?  Duh!  Just put the launcher and the image updates on their own threads.  Um, yeah.  That turned out to be easier said than done.  We’ll cover the multi-threading solution, as well as the speech recognition features, in Part 3.  Those two topics turn out to be intertwined.
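
For readers who want the general shape of the idea before Part 3, here is a generic sketch of moving slow device commands off the thread that refreshes the images. It is written in Python and is not the solution the app ended up with; `LauncherWorker` and the `execute` callback are hypothetical names.

```python
import queue
import threading


class LauncherWorker:
    """Runs launcher commands one at a time on a background thread so the
    caller (for example, the image-refresh code) never blocks on the hardware."""

    def __init__(self, execute):
        # execute(command) is a hypothetical callback that talks to the launcher
        # and may take seconds to return (firing, for instance).
        self._execute = execute
        self._commands = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, command):
        """Queue a command and return immediately."""
        self._commands.put(command)

    def _run(self):
        while True:
            command = self._commands.get()
            if command is None:  # sentinel placed on the queue by stop()
                break
            self._execute(command)

    def stop(self):
        """Finish any queued commands, then shut the worker down."""
        self._commands.put(None)
        self._thread.join()
```

Commands are still executed in the order they were submitted; the only change is that the caller no longer waits for each one to finish.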

Update: Full source code for this solution can be found in the “KinectMissileLauncher.zip” archive here: http://tinyurl.com/MarkGilbertSource.

September 10, 2011 Posted by | Microsoft Kinect, Visual Studio/.NET | 2 Comments

Target-Tracking with the Kinect, Part 1 – Intro and Prerequisites

Doug started it.

On some Friday back in late June, he mentioned that Microsoft had released a Beta SDK for the Kinect just the week before.  He asserted “We need to do something with it.  I have a Kinect I can bring in.”  By “do something with it” he meant in the Friday lunch sessions we’d been holding for a year called “Sandbox”.  Sandbox was where a small group of us got together to work with something we didn’t normally get to use during our day jobs.  We tried to keep it light and fluffy, and the Kinect fit both to a T.

Over that next weekend, I mulled over what we could do with the Kinect.  What would be a good enough demonstration of this electronic marvel?  And then it hit me – Joel (another Sandbox participant) had a USB powered foam-missile launcher (like the one pictured here: http://www.amazon.com/Computer-Controlled-Foam-Missile-Launcher/dp/B00100K5RM/ref=pd_sim_t_3).  Now we’re talking!

Microsoft had released a series of tutorials for the SDK, and a .NET interface to the launcher that Chris Smith used for his “evil genius” post.  Chris attributes the original code for this wrapper to Matt Ellis.

We decided that the Kinect would feed commands to the launcher telling it where to aim, and we’d use the speech recognition abilities of the Kinect to let the person in the crosshairs say the word “fire” to send off a missile.  And so began a series of Fridays where we hacked together an app that turned the Kinect into a target-tracking system for the launcher.  We thought we had the hard stuff already done for us – we simply needed to write something that would connect A to B.  But, as happens with any good project, we found the easy parts weren’t so easy, and were pushed and prodded into learning something new.  This is the first of a three-part series describing our solution.

Before we dive into code, I want to call out the software, frameworks, and SDKs that were ultimately needed for this project.  Some of these were called out by the quickstart tutorials, and the rest were discovered along the way.  These were installed in this order:

  • First, our laptop started with Visual Studio 2010 Professional, but in the quickstart tutorials, Dan (Fernandez) mentions that he’s working with the Express version.
  • The sample application that comes with the launcher.  This includes the drivers for the launcher itself, USBHID.dll.  The .NET wrapper provided by Matt Ellis will poke into the OpenHID method to send commands to the launcher.
  • The DirectX End-User Runtime.  This is required by the DirectX SDK, and is available from here.  This installer will need to be run as an Administrator (on Windows 7, anyway), and I had to do a manual restart of the machine after it finished.  The installer will not prompt you to do this, but the DirectX SDK (the next step) wouldn’t install correctly until I did.
  • The latest DirectX SDK.  This is a very large install – 570 MB – available from http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=6812.  I had to make sure nothing else was running when I installed this, and had to install it as an Administrator.
  • The Kinect SDK itself, available from http://research.microsoft.com/en-us/um/redmond/projects/kinectsdk/.
  • The Coding4Fun Kinect Library available from http://c4fkinect.codeplex.com.  This is not strictly required, but contains a couple of extension methods that simplify translating the Kinect camera data into images.

The above was enough to get going with tracking, but not for the speech recognition we wanted to incorporate.  In particular, we wanted to be able to say the word “Fire”, and then have the launcher fire the missile.  For this, we needed a few additional pieces.  (I found these out from Patrick Godwin’s excellent post here: http://www.ximplosionx.com/2011/06/22/intro-to-the-kinect-sdkadding-speech-recognition/):

In Part 2, I’ll walk through controlling the launcher manually using Ellis’ class, and then our first pass at controlling the launcher using the data coming off of the Kinect.

In Part 3, I’ll go through the threading that we discovered we needed in order to make the application perform better, and why we ended up converting the entire application to C#.

Update: Full source code for this solution can be found in the “KinectMissileLauncher.zip” archive here: http://tinyurl.com/MarkGilbertSource.

September 7, 2011 Posted by | Microsoft Kinect, Visual Studio/.NET | 4 Comments
