I put together a very basic prototype yesterday of a UI where a user could press on some buttons using the Kinect without touching any surfaces.
In order to do that quickly, I made use of the KinectRegion and the KinectUserViewer as I touched on in this previous post.
The scenario is one where a single user would stand in front of a screen, press some buttons and watch a video and so I wanted a UI that led that user through steps such as;
- the sensor is looking for you
- the sensor needs to engage you
- the sensor needs you to use your real hand to move the on-screen virtual hand to press a button
Using the KinectRegion makes this easy in that I can quickly make a UI and open it up to Kinect by wrapping it in a region. However;
- I wanted to know when the user has/hasn’t engaged with the sensor.
- I wanted something simpler as a means of engagement with the sensor than the standard gesture that comes “out of the box”.
I struggled with this for a while and spent a bit of time reading this post;
but that approach didn’t seem to be open to me inside of a Windows XAML Store app and so I took a different route which was to implement my own IKinectEngagementManager.
This was new for me so I thought I’d share it here.
Starting from a blank Windows 8.1 Store app with a reference to the Kinect bits and permission to access the webcam and microphone, I can just make this “UI”;

    <Page
        x:Class="App1.MainPage"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:local="using:App1"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d"
        xmlns:k="using:Microsoft.Kinect.Xaml.Controls">
        <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
            <k:KinectRegion>
                <Grid>
                    <Button
                        Content="Click Me"
                        HorizontalAlignment="Center"
                        VerticalAlignment="Center"
                        FontSize="24"
                        Width="200"
                        Height="200" />
                    <k:KinectUserViewer
                        Width="272"
                        Height="156"
                        HorizontalAlignment="Right"
                        VerticalAlignment="Bottom" />
                </Grid>
            </k:KinectRegion>
        </Grid>
    </Page>
and that’s instantly usable via the Kinect if you know the magic gesture as per below – easy or what?
but I wanted to know when the user performed that “engagement” gesture and when they “disengaged” and I also wanted to change the gesture somewhat to make it simpler – I went for a simple rule of;
- right hand above right shoulder? == engaged
and, for my purposes, I only really mind about a single user.
With that in mind, I went about trying to implement IKinectEngagementManager to plug in what I wanted – there are samples that include some of this (specifically the “ControlsBasics – XAML” sample within the SDK) and so I’m not exactly breaking new ground here (other than ground that is new for me).
I modified my UI a little to show the engaged/disengaged status and to name a couple of elements;
    <Page
        x:Class="App1.MainPage"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:local="using:App1"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d"
        xmlns:k="using:Microsoft.Kinect.Xaml.Controls">
        <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
            <k:KinectRegion x:Name="kinectRegion">
                <Grid>
                    <Button
                        Content="Click Me"
                        HorizontalAlignment="Center"
                        VerticalAlignment="Center"
                        FontSize="24"
                        Width="200"
                        Height="200" />
                    <StackPanel
                        HorizontalAlignment="Right"
                        VerticalAlignment="Bottom">
                        <TextBlock
                            FontSize="24"
                            Margin="20"
                            HorizontalAlignment="Center"
                            TextAlignment="Center"
                            Text="disengaged"
                            x:Name="txtEngagement" />
                        <k:KinectUserViewer
                            Width="272"
                            Height="156" />
                    </StackPanel>
                </Grid>
            </k:KinectRegion>
        </Grid>
    </Page>
and then that works as per the video below;
the difference in the engagement interaction is perhaps quite a subtle one here but it’s very noticeable as the end user – I have to be a lot less definite about this gesture and that could be both a good and a bad thing – that original gesture wasn’t chosen lightly and so I’m foolish if I discard it lightly.
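If the simpler gesture turned out to be too easy to trigger by accident, one option (not something I have tried against the sensor) would be to only accept a change once the raw per-frame reading has held for a few consecutive frames. The `Debouncer` class and its `framesRequired` parameter here are hypothetical names of my own, not part of the Kinect SDK, and this is only a sketch of the idea;

```csharp
using System;

// Hypothetical helper, not part of the Kinect SDK: it only reports a change
// once the raw signal has held the new value for N consecutive frames.
sealed class Debouncer
{
    readonly int framesRequired;
    int streak;

    // The current, debounced state (starts out as false).
    public bool State { get; private set; }

    public Debouncer(int framesRequired)
    {
        if (framesRequired < 1)
        {
            throw new ArgumentOutOfRangeException("framesRequired");
        }
        this.framesRequired = framesRequired;
    }

    // Feed in one raw per-frame reading; returns true only when State flips.
    public bool Update(bool raw)
    {
        if (raw == this.State)
        {
            // The reading agrees with the current state, so reset the count.
            this.streak = 0;
            return false;
        }
        this.streak++;

        if (this.streak < this.framesRequired)
        {
            return false;
        }
        this.State = raw;
        this.streak = 0;
        return true;
    }
}
```

Body frames arrive at around 30 per second, so something like `new Debouncer(10)` would ask for roughly a third of a second of “hand raised” before flipping to engaged, and the same again before flipping back.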
The code-behind for this page now sets up a couple of things which it didn’t use to have to when everything was being left as “default” but mainly it’s just about using my new EngagementManager class and handling events on it before telling the KinectRegion about it;

    namespace App1
    {
        using Windows.UI.Xaml;
        using Windows.UI.Xaml.Controls;
        using WindowsPreview.Kinect;

        public sealed partial class MainPage : Page
        {
            public MainPage()
            {
                this.InitializeComponent();
                this.Loaded += OnLoaded;
            }
            void OnLoaded(object sender, RoutedEventArgs args)
            {
                EngagementManager manager = new EngagementManager(KinectSensor.GetDefault());

                manager.Engaged += (s, e) =>
                {
                    this.txtEngagement.Text = "engaged";
                };
                manager.Disengaged += (s, e) =>
                {
                    this.txtEngagement.Text = "disengaged";
                };
                this.kinectRegion.SetKinectOnePersonManualEngagement(manager);
            }
        }
    }
and then the extra work is being done in the EngagementManager class which is as below;
    namespace App1
    {
        using Microsoft.Kinect.Toolkit.Input;
        using System;
        using System.Collections.Generic;
        using System.Linq;
        using WindowsPreview.Kinect;
        using WindowsPreview.Kinect.Input;

        class EngagementManager : IKinectEngagementManager
        {
            public event EventHandler Engaged;
            public event EventHandler Disengaged;

            public EngagementManager(KinectSensor sensor)
            {
                this.sensor = sensor;
            }
            public bool EngagedBodyHandPairsChanged()
            {
                return (this.changed);
            }
            public IReadOnlyList<BodyHandPair> KinectManualEngagedHands
            {
                get
                {
                    return (KinectCoreWindow.KinectManualEngagedHands);
                }
            }
            public void StartManaging()
            {
                this.bodies = new Body[this.sensor.BodyFrameSource.BodyCount];
                this.reader = this.sensor.BodyFrameSource.OpenReader();
                this.reader.FrameArrived += OnFrameArrived;
            }
            public void StopManaging()
            {
                this.reader.FrameArrived -= OnFrameArrived;
                this.reader.Dispose();
                this.bodies = null;
            }
            static bool AreTracked(Body body, params JointType[] joints)
            {
                return (joints.All(j => body.Joints[j].TrackingState == TrackingState.Tracked));
            }
            // positive when jointOne is higher than jointTwo.
            static double VerticalDistance(Body body, JointType jointOne, JointType jointTwo)
            {
                return (body.Joints[jointOne].Position.Y - body.Joints[jointTwo].Position.Y);
            }
            void OnFrameArrived(BodyFrameReader sender, BodyFrameArrivedEventArgs args)
            {
                if (args.FrameReference != null)
                {
                    using (var frame = args.FrameReference.AcquireFrame())
                    {
                        // AcquireFrame() can return null if the frame is no longer available.
                        if (frame == null)
                        {
                            return;
                        }
                        frame.GetAndRefreshBodyData(this.bodies);

                        // we want the first body that is tracked and where the 2 joints I'm interested
                        // in are tracked and where the hand is above the shoulder.
                        var first = this.bodies.FirstOrDefault(
                            b => b.IsTracked &&
                                AreTracked(b, JointType.HandRight, JointType.ShoulderRight) &&
                                VerticalDistance(b, JointType.HandRight, JointType.ShoulderRight) >= 0.0d);

                        // got one? check to see if we need to trigger engagement.
                        if (first != null)
                        {
                            this.EnsureEngaged(first.TrackingId);
                        }
                        else
                        {
                            // not got one? check to see if we need to clear engagement.
                            this.EnsureNotEngaged();
                        }
                    }
                }
            }
            void EnsureEngaged(ulong trackingId)
            {
                // only act (and fire the event) when the engaged body changes.
                if (this.trackingId != trackingId)
                {
                    this.trackingId = trackingId;
                    this.changed = true;

                    KinectCoreWindow.SetKinectOnePersonManualEngagement(
                        new BodyHandPair(trackingId, HandType.RIGHT));

                    if (this.Engaged != null)
                    {
                        this.Engaged(this, EventArgs.Empty);
                    }
                }
            }
            void EnsureNotEngaged()
            {
                // only act (and fire the event) if someone was engaged.
                if (this.trackingId != null)
                {
                    this.changed = true;
                    this.trackingId = null;

                    KinectCoreWindow.SetKinectOnePersonManualEngagement(null);

                    if (this.Disengaged != null)
                    {
                        this.Disengaged(this, EventArgs.Empty);
                    }
                }
            }
            bool changed;
            ulong? trackingId;
            KinectSensor sensor;
            BodyFrameReader reader;
            Body[] bodies;
        }
    }
On first reading, this looks like a lot of code but most of it is just the fairly boilerplate code to get the body frames from the sensor.
The rest, which checks whether the right hand is above the right shoulder and flags “engaged!” if so, is only a relatively small amount of code on top of that.
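That hand-above-shoulder check could also be pulled out into a small function over plain numbers so that it can be tested without a sensor plugged in. This is only a sketch: `EngagementRule`, `IsHandRaised` and the `margin` parameter are hypothetical names of my own rather than anything in the Kinect SDK, and it assumes camera-space Y values in metres with “up” being positive;

```csharp
using System;

// Hypothetical helper, not part of the Kinect SDK: the engagement rule
// expressed over plain numbers so that it can be tested away from the sensor.
static class EngagementRule
{
    // handY and shoulderY are camera-space Y coordinates in metres (up is positive).
    // margin is an assumed extra clearance; the default of 0 matches the
    // ">= 0.0d" comparison in the EngagementManager class.
    public static bool IsHandRaised(double handY, double shoulderY, double margin = 0.0)
    {
        return ((handY - shoulderY) >= margin);
    }
}
```

For example, `EngagementRule.IsHandRaised(0.45, 0.30)` comes back as true whereas `EngagementRule.IsHandRaised(0.20, 0.30)` comes back as false, and passing a margin such as 0.10 would ask for the hand to be at least 10cm above the shoulder.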
And with that (bugs and all, no doubt) I get to know as/when the user engages and I get to alter the way in which they engage.
(I bet there’s an easier way of doing this)