Fun times with Kinect and WPF

I recently bought a Kinect controller for my Xbox and am really loving it. Of course I had to see if I could use it for something else, and after some other people were nice enough to create and release free Windows drivers and libraries, I had to try it!

I started out with Code Laboratories' “CL NUI Platform”, which is a simple installer that installs the drivers and some libraries. It also comes with some samples showing how to connect to the sensor.

I quickly got that going, with a small app that displays the RGB and depth images from the sensor.


“Unfortunately” this is also all you get out of the box. If you expect to automatically get an event when you wave your hand or make some other gesture, this is not for you. However, if you have a little love for some simple image analysis algorithms, you should be all over this too!

I wanted to make a simple gesture API that could tell me when a user drags, pinches/stretches or flicks something, very similar to what you do with a touch screen. So basically an API that mimics an invisible touch screen hanging suspended in the air. Luckily the algorithms for this turned out to be pretty simple.

The first thing you do is go through the depth image pixel by pixel. If you find a pixel that is too close or too far away, replace it with Color.Empty so we can ignore it. That way we will only “see” things in the picture that are within a certain distance from the sensor, for instance two hands stretched out. This leaves us with a set of pixels that could potentially be some stretched-out hands.
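As a rough illustration of that filtering step, here is a minimal sketch in Python (not the code from my app). It assumes the depth frame is a list of rows of distance values, uses `None` as a stand-in for Color.Empty, and the function name and the `near`/`far` limits are made up for the example.

```python
def threshold_depth(depth, near, far):
    """Keep depth values within [near, far]; replace the rest with None.

    `depth` is assumed to be a list of rows of distance values.
    None plays the role of Color.Empty: a pixel we want to ignore.
    """
    return [
        [d if near <= d <= far else None for d in row]
        for row in depth
    ]


# A tiny made-up 2x3 depth frame (units are arbitrary here).
frame = [
    [2000, 800, 820],
    [2100, 790, 3000],
]

# Only pixels between 600 and 1000 survive; the others become None.
mask = threshold_depth(frame, near=600, far=1000)
```

Everything outside the chosen band is dropped in a single pass, so whatever comes next only has to look at the pixels that might be hands.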


We need to associate these pixels with each other. Let’s say all pixels that are next to each other and are not transparent belong to the same hand. We call these sets of grouped pixels “blobs”, and there’s a nice simple algorithm for detecting all the blobs in an image, known as “Connected Component Labelling”. Instead of explaining how it works here, all you need to know is on Wikipedia. If you register each blob's center and size, you can easily use these as “touch input” and discard blobs that are too large or too small. Drawing a circle for the center and size of each blob shows what was detected.
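For readers who want something concrete, here is a small Python sketch of blob detection. It is an assumption-laden illustration, not my original code: it uses a breadth-first flood fill with 4-connectivity (a simple relative of the two-pass labelling algorithm described on Wikipedia), treats `None` as an ignored pixel, and measures blob size as a pixel count. The name `find_blobs` and the `min_size`/`max_size` defaults are hypothetical.

```python
from collections import deque


def find_blobs(mask, min_size=2, max_size=1000):
    """Group neighbouring non-None pixels into blobs (4-connectivity).

    Returns a list of dicts with the blob's center and size (pixel count),
    skipping blobs smaller than min_size or larger than max_size.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    blobs = []

    for y in range(height):
        for x in range(width):
            if mask[y][x] is None or seen[y][x]:
                continue
            # Flood fill outwards from this unvisited pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            pixels = []
            while queue:
                px, py = queue.popleft()
                pixels.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and mask[ny][nx] is not None):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            size = len(pixels)
            if min_size <= size <= max_size:
                cx = sum(p[0] for p in pixels) / size
                cy = sum(p[1] for p in pixels) / size
                blobs.append({"center": (cx, cy), "size": size})
    return blobs


# Made-up mask: two real blobs and one single stray pixel.
example_mask = [
    [1, 1, None, None, None],
    [1, 1, None, None, 1],
    [None, None, None, 1, 1],
    [1, None, None, None, None],
]
blobs = find_blobs(example_mask)
```

The stray pixel in the bottom-left corner is thrown away by the size filter, which is the same trick that keeps sensor noise from turning into phantom touches.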

Thirty times a second you will be calculating a new set of blobs. Compare these to the previous blobs, and use the center and size to see which blobs are the same and have moved, which are new, and which have disappeared. If I only have one blob and it moves, I consider this a “drag”. If I have more than one, I calculate the bounding box of all the blobs, measure its diagonal and compare it to the previous diagonal. The ratio between these two diagonals is the scale in a pinch or stretch.
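The frame-to-frame logic can be sketched along these lines in Python. Again this is only an illustration under stated assumptions: blobs are reduced to their `(x, y)` centers, matching is a greedy nearest-center search (the size comparison is left out for brevity), and `max_distance` and all function names are invented for the example.

```python
import math


def match_blobs(previous, current, max_distance=50.0):
    """Greedy nearest-center matching between two frames.

    Returns (moved, new, disappeared), where moved is a list of
    (previous_center, current_center) pairs.
    """
    unmatched = list(range(len(previous)))
    moved, new = [], []
    for cur in current:
        best, best_dist = None, max_distance
        for i in unmatched:
            d = math.dist(previous[i], cur)
            if d <= best_dist:
                best, best_dist = i, d
        if best is None:
            new.append(cur)
        else:
            unmatched.remove(best)
            moved.append((previous[best], cur))
    disappeared = [previous[i] for i in unmatched]
    return moved, new, disappeared


def bounding_diagonal(centers):
    """Diagonal length of the bounding box around all centers."""
    xs = [c[0] for c in centers]
    ys = [c[1] for c in centers]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def detect_gesture(previous, current):
    """Return ("drag", (dx, dy)), ("scale", ratio) or None."""
    if len(previous) == 1 and len(current) == 1:
        moved, _, _ = match_blobs(previous, current)
        if moved:
            (px, py), (cx, cy) = moved[0]
            delta = (cx - px, cy - py)
            if delta != (0, 0):
                return ("drag", delta)
    elif len(previous) > 1 and len(current) > 1:
        prev_diag = bounding_diagonal(previous)
        if prev_diag > 0:
            return ("scale", bounding_diagonal(current) / prev_diag)
    return None


drag = detect_gesture([(10, 10)], [(15, 12)])
stretch = detect_gesture([(0, 0), (30, 40)], [(0, 0), (60, 80)])
```

A scale above 1 means the hands moved apart (stretch) and a scale below 1 means they moved together (pinch). Flicks are not covered by this sketch.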

Now all we need to do is take these “gesture events” and hook them up to something, for instance a map. So here’s the end result (note: it looks a little jerky because of the screen capture software, but it runs a lot smoother when not recording, and flicks work consistently):

Minority MapReport

It worked surprisingly well for a first go.

Download source: Sorry no source available at this time, due to some licensing restrictions.

Comments (6) -

  • Nik
    I still think Tom Cruise looks more handsome doing the Minority Report impersonation :)
  • Cool man! Any drivers released for Windows yet? PrimeSense is still too raw for me. Want to end up using it with Max/MSP.

    thx and keep up the good work!
  • Yago: The above example was using the Code Laboratories Windows drivers, which are simple to set up and use, but very raw with respect to the data you get.
    I have since moved on to using PrimeSense, which is admittedly harder to get up and running, but comes with so much more out of the box (for instance, full skeleton tracking of multiple users).
  • Congratulations!

    I'm a beginner in Kinect programming and I would like to know if it's possible to have some simple examples of interactions (like the hand tracker) to understand how it works.

    Thanks in advance.
  • Excuse me for this double comment, but I mean "source-codes" when I say "simple examples" ^^

    And if you know some tutorials about that, I'm in :)

    Thank you :)
  • Gluttuny: I suggest you go to the Code Laboratories website, download their stuff and get to know their samples. The code that goes through each pixel one by one is already there. Once you have that, it's pretty simple to add the blob detection algorithm based on the Wikipedia article.
    I don't think my source code would help you very much if you don't get these basics first.
