A Follow-On Prague Experiment with Skeletons

A developer dropped me a line having found my previous blog posts around Project Prague;

Project Prague in the Cognitive Services Labs

They’d noticed that it seemed really easy and powerful to define and monitor for gestures with Project Prague but wanted to know where the support was for tracking lower level data such as hand positions and movement. I’ve a suspicion that they are looking for something similar to what the Kinect SDK offers which was out-of-the-box support for treating a user’s hand as a pointer and being able to drive an on-screen UI with it.

As usual, I hadn’t the foggiest clue about how this might be done and so I thought I’d better take a quick look at it and this post is the result of a few minutes looking at the APIs and the documentation.

If you haven’t seen Prague at all then I did write a couple of other posts;

Project Prague Posts

and so feel free to have a read of those if you want the background on what I’m posting here and I’ll attempt to avoid repeating what I wrote in those posts.

Project Prague and the UWP

Since I last looked at Project Prague, “significant things” have happened in that the Windows 10 Fall Creators Update has been released and, along with it, support for .NET Standard 2.0 in UWP apps which I just wrote about an hour or two ago in this post;

UWP and .NET Standard 2.0–Remembering the ‘Forgotten’ APIs –)

These changes mean that I now seem to be free to use Project Prague from inside a UWP app (targeting .NET Standard 2.0 on Windows 16299+) although I’m unsure about whether this is a supported scenario yet or what it might mean for an app that wanted to go into Store but, technically, it seems that I can make use of the Prague SDK from a UWP app and so that’s what I did.

Project Prague and Skeleton Tracking

I revisited the Project Prague documentation and scanned over this one page which covers a lot of ground but it mostly focuses on how to get gestures working and doesn’t drop to the lower level details.

However, there’s a response to a comment further down the page which does talk in terms of;

“The SDK provides both the high level abstraction of the gestures as they are described in the overview above and also the raw skeleton we produce. The skeleton we produce is ‘light-weight’ namely it exposes the palm & fingertips’ locations and directions vectors (palm also has an orientation vector).

In the slingshot example above, you would want to register to the skeleton event once the slingshot gesture reaches the Pinch state and then track the motion instead of simply expecting a (non negligible) motion backwards as defined above.

Depending on your needs, you could either user the simplistic gesture-states-only approach or weave in the use of raw skeleton stream.

We will followup soon with a code sample in https://aka.ms/gestures/samples that will show how to utilize the skeleton stream”

and that led me back to the sample;

3D Camera Sample

which looks to essentially use gestures as a start/stop mechanism in between which it makes use of the API;

GesturesServiceEndpoint.RegisterToSkeleton

in order to get raw hand-tracking data including the position of the palm and digits and so it felt like this was the API that I might want to take a look at – it seemed that this might be the key to the question that I got asked.

Alongside discovering this API I also had a look through the document which is targeted at Unity but generally useful;

“3D Object Manipulation”

because it talks about the co-ordinate system that positions, directions etc. are offered in by the SDK and also units;

“The hand-skeleton is provided in units of millimeters, in the following left-handed coordinate system”

although what wasn’t clear to me from the docs was whether I had to think in terms of different ranges for distances based on the different cameras that the SDK supports. I was using a RealSense SR300 as it is easier to plug in than a Kinect and one of my out-standing questions remains what sort of range of motion in the horizontal and vertical planes I should expect the SDK to be able to track for the camera.

Regardless, I set about trying to put together a simple UWP app that let me move something around on the screen using my hand and the Prague SDK.

Experimenting in a UWP App

I made a new UWP project (targeting 16299) and I referenced the Prague SDK assemblies (see previous post for details of where to find them);

image

and then added a small piece of XAML UI with a green dot which I want to move around purely by dragging my index finger in front of the screen;

<Page
    x:Class="App2.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:App2"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">

    <Grid>
        <Canvas HorizontalAlignment="Stretch" VerticalAlignment="Stretch" Background="{ThemeResource ApplicationPageBackgroundThemeBrush}" SizeChanged="CanvasSizeChanged">
            <Ellipse Width="10" Height="10" Fill="Green" x:Name="marker" Visibility="Collapsed"/>
        </Canvas>
        <TextBlock FontSize="24" x:Name="txtDebug" HorizontalAlignment="Left" VerticalAlignment="Bottom"/>
    </Grid>
</Page>

With that in place, I added some code behind which attempts to permanently be tracking the user’s right hand and linking it to movement of this green dot. The code’s fairly self-explanatory I think with the exception that I limited the hand range to be –200mm to 200mm on the X axis and –90mm to +90mm on the Y axis based on experimentation. I’m unsure of whether this is “right” or not at the time of writing. I did experiment with normalising the vectors and trying to use those to drive my UI but that didn’t work out well for me as I never seemed to be able to get more than around +/- 0.7 units along the X or Y axis.

using Microsoft.Gestures;
using Microsoft.Gestures.Endpoint;
using Microsoft.Gestures.Samples.Camera3D;
using System;
using System.Linq;
using Windows.Foundation;
using Windows.UI.Core;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

namespace App2
{
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += OnLoaded;
        }
        async void OnLoaded(object sender, RoutedEventArgs e)
        {
            this.gestureService = GesturesServiceEndpointFactory.Create();
            await this.gestureService.ConnectAsync();

            this.smoother = new IndexSmoother();
            this.smoother.SmoothedPositionChanged += OnSmoothedPositionChanged;

            await this.gestureService.RegisterToSkeleton(this.OnSkeletonDataReceived);
        }
        void CanvasSizeChanged(object sender, SizeChangedEventArgs e)
        {
            this.canvasSize = e.NewSize;
        }
        void OnSkeletonDataReceived(object sender, HandSkeletonsReadyEventArgs e)
        {
            var right = e.HandSkeletons.FirstOrDefault(h => h.Handedness == Hand.RightHand);

            if (right != null)
            {
                this.smoother.Smooth(right);
            }
        }
        async void OnSmoothedPositionChanged(object sender, SmoothedPositionChangeEventArgs e)
        {
            // AFAIK, the positions here are defined in terms of millimetres and range
            // -ve to +ve with 0 at the centre.

            // I'm unsure what range the different cameras have in terms of X,Y,Z and
            // so I've made up my own range which is X from -200 to 200 and Y from
            // -90 to 90 and that seems to let me get "full scale" on my hand 
            // movements.

            // I'm sure there's a better way. X is also reversed for my needs so I
            // went with a * -1.

            var xPos = Math.Clamp(e.SmoothedPosition.X * - 1.0, 0 - XRANGE, XRANGE);
            var yPos = Math.Clamp(e.SmoothedPosition.Y, 0 - YRANGE, YRANGE);
            xPos = (xPos + XRANGE) / (2.0d * XRANGE);
            yPos = (yPos + YRANGE) / (2.0d * YRANGE);

            await this.Dispatcher.RunAsync(
                CoreDispatcherPriority.Normal,
                () =>
                {
                    this.marker.Visibility = Visibility.Visible;

                    var left = (xPos * this.canvasSize.Width);
                    var top = (yPos * this.canvasSize.Height);

                    Canvas.SetLeft(this.marker, left - (this.marker.Width / 2.0));
                    Canvas.SetTop(this.marker, top - (this.marker.Height / 2.0));
                    this.txtDebug.Text = $"{left:N1},{top:N1}";
                }

            );
        }
        static readonly double XRANGE = 200;
        static readonly double YRANGE = 90;
        Size canvasSize;
        GesturesServiceEndpoint gestureService;
        IndexSmoother smoother;
    }
}

As part of writing that code, I modified the PalmSmoother class from the 3D sample provided to become an IndexSmoother class which essentially performs the same function but on a different piece of data and with some different parameters. It looks like a place where something like the Reactive Extensions might be a good thing to use instead of writing these custom classes but I went with it for speed/ease.

Wrapping Up

This was just a quick experiment but I learned something from it. The code’s here if it’s of use to anyone else glancing at Project Prague and, as always, feed back if I’ve messed this up – I’m very new to using Project Prague.

One thought on “A Follow-On Prague Experiment with Skeletons

  1. Can you please tell where we can find those four dlls ? I tried to build your code but I am getting error on those dlls and a file called App.g.i.cs ? What is this file ?

Leave a Reply to Asma Rafi Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s