Following on from the previous posts, I’ve been taking my first steps with a third source of data from the Kinect V2 sensor: the BodyFrameSource, which delivers skeletal tracking data for up to 6 bodies.
As in my last 2 posts, this is all with reference to the video series on Channel 9.
I’m not claiming to be doing anything ‘unique’ or ‘clever’ here. I’m just learning and experimenting, and a lot of what I’m doing duplicates what’s already present in the Kinect for Windows V2 SDK samples, which you can see in the sample browser. I find that I learn a lot more about developing for something like this by tinkering with my own code rather than someone else’s.
I wanted to take the lowest-tech possible approach to getting started with this skeletal data, so I thought for a while about what the “lowest tech” kind of interface was that I could put on the data and decided that a console application was the way to go.
Now, you might think “Hey, but what about some fancy 3D interface?” Sure, that’s possible, there are examples out there and many samples in the SDK, and I’ll work on building some of my own. But I thought that a console application was a fun idea here because (to me) it illustrates the low barrier to entry that programming with this kind of data has.
That’s been something of a surprise to me. I’d assumed that some of these things were harder to get started with than they seem to be on the Kinect SDK, and that’s a nice thing to be slowly finding out.
For instance, if I want to know whether there are 1-6 people being tracked by the sensor, I may not need a fancy 3D application. I might just need a piece of code that delivers a “yes/no” result and some environment in which to host that code.
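As a rough illustration of that “yes/no” idea, here is a minimal sketch that runs without a sensor attached. The `FakeBody` class is a hypothetical stand-in for the SDK’s `Body` type (it only carries an `IsTracked` flag), and `TrackingCheck` is a name made up for this example; with the real SDK the same check would be applied to the array of bodies that the sensor fills in.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the SDK's Body type; only IsTracked is modelled.
class FakeBody
{
    public bool IsTracked;
}

static class TrackingCheck
{
    // True if at least one (non-null) body in the array is being tracked.
    public static bool AnyoneTracked(FakeBody[] bodies)
    {
        return bodies != null && bodies.Any(b => b != null && b.IsTracked);
    }
}

class Demo
{
    static void Main()
    {
        // Six slots, as the sensor reports up to 6 bodies; only one is tracked here.
        var bodies = new FakeBody[]
        {
            new FakeBody(), new FakeBody { IsTracked = true }, new FakeBody(),
            new FakeBody(), new FakeBody(), new FakeBody()
        };

        Console.WriteLine(TrackingCheck.AnyoneTracked(bodies) ? "yes" : "no"); // prints "yes"
    }
}
```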
So…here’s my skeletal “Selfie” provided by my console app. Yup, that’s me below;
I made this as lo-fi as possible, using just single letters to indicate particular joints from the sensor and filtering down the set of joints that I displayed to avoid over-cluttering the screen, but it works surprisingly well. Here’s a clip of it as video rather than as a screenshot, with me showing a few of my best disco moves
(the music is “Suburban Withdrawl” by Lee Rosevere, taken from www.freemusicarchive.org).
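One side effect of the single-letter labels is that several joints end up sharing a letter. The sketch below is only an illustration: it uses plain strings in place of the SDK’s `JointType` enum, with the names copied from the joint list in the drawer code later in this post, and `JointLabels` is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class JointLabels
{
    // Joint names as plain strings (stand-ins for the SDK's JointType values).
    public static readonly string[] Names =
    {
        "Head", "Neck", "ShoulderLeft", "ShoulderRight", "HandLeft", "HandRight",
        "ElbowLeft", "ElbowRight", "HipLeft", "HipRight", "KneeLeft", "KneeRight",
        "AnkleLeft", "AnkleRight", "FootLeft", "FootRight"
    };

    // The label is simply the first character of the joint's name.
    public static string LabelFor(string jointName)
    {
        return jointName.Substring(0, 1);
    }

    // How many joints map onto each single-letter label.
    public static Dictionary<string, int> CountByLabel()
    {
        return Names
            .GroupBy(n => LabelFor(n))
            .ToDictionary(g => g.Key, g => g.Count());
    }
}

class Demo
{
    static void Main()
    {
        foreach (var pair in JointLabels.CountByLabel())
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

Head, both hands and both hips all come out as “H”, so on screen it is the position of a letter rather than the letter itself that tells you which joint you are looking at.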
In terms of putting that together, it’s a fairly simple and quickly hacked together console app. The main loop looks like this;
    using Microsoft.Kinect;
    using System;

    namespace ConsoleApplication8
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.SetWindowSize(Constants.ConsoleWidth, Constants.ConsoleHeight);
                Console.Title = "Kinect Skeleton Console";

                KinectControl c = new KinectControl(() => new ConsoleBodyDrawer());
                c.GetSensor();
                c.OpenReader();

                while (true)
                {
                    var k = Console.ReadKey(true).Key;

                    if (k == ConsoleKey.X)
                    {
                        c.CloseReader();
                        c.ReleaseSensor();
                        break;
                    }
                }
            }
        }
    }
and, as in my previous post, I made a little KinectControl class which deals with opening/closing/reading from the sensor. Unlike in the previous post though, the data coming back to my code in this case is not just a 2-dimensional array of values. It’s structured, strongly-typed data: the sensor returns up to 6 instances of a Body object from its BodyFrameSource, and each Body that gets delivered into your code comes with a whole bunch of information about the tracked body, including 20 Joint objects.
I wanted to try and abstract the details of how this Body object was drawn so I wrote a little interface;
    using Microsoft.Kinect;
    using System;

    namespace ConsoleApplication8
    {
        interface IBodyDrawer
        {
            ConsoleColor Color { get; set; }
            void DrawFrame(Body body, CoordinateMapper mapper, Rect depthFrameSize);
        }
    }
It’s not quite as clean as I would have liked, in that I ended up passing a few more details down into the DrawFrame method than I might if I spent longer reworking it. My KinectControl class does the main work here, in that it;
- Creates 6 IBodyDrawer instances, one per body that the sensor can track, each with its own colour.
- Subscribes to the FrameArrived event on a reader opened from the BodyFrameSource.
- Pushes any Body instances that the Kinect sensor tells us are tracked through into the IBodyDrawer.DrawFrame method above.
The implementation for that looks like this;
    using Microsoft.Kinect;
    using System;

    namespace ConsoleApplication8
    {
        // Simple holder for the depth frame dimensions.
        class Rect
        {
            public int Width;
            public int Height;
        }

        class KinectControl
        {
            public KinectControl(Func<IBodyDrawer> bodyDrawerFactory)
            {
                // One drawer per colour, so each tracked body gets its own colour.
                this.bodyDrawers = new IBodyDrawer[colours.Length];

                for (int i = 0; i < colours.Length; i++)
                {
                    this.bodyDrawers[i] = bodyDrawerFactory();
                    this.bodyDrawers[i].Color = colours[i];
                }
            }
            public Rect DepthFrameSize { get; private set; }

            public void GetSensor()
            {
                this.sensor = KinectSensor.GetDefault();
                this.sensor.Open();

                // Remember the depth frame size; the drawers use it to scale joints.
                this.DepthFrameSize = new Rect()
                {
                    Width = this.sensor.DepthFrameSource.FrameDescription.Width,
                    Height = this.sensor.DepthFrameSource.FrameDescription.Height
                };
            }
            public void OpenReader()
            {
                this.reader = this.sensor.BodyFrameSource.OpenReader();
                this.reader.FrameArrived += OnFrameArrived;
            }
            public void CloseReader()
            {
                this.reader.FrameArrived -= OnFrameArrived;
                this.reader.Dispose();
                this.reader = null;
            }
            void OnFrameArrived(object sender, BodyFrameArrivedEventArgs e)
            {
                using (var frame = e.FrameReference.AcquireFrame())
                {
                    if ((frame != null) && (frame.BodyCount > 0))
                    {
                        // (Re)allocate the body array only when the count changes.
                        if ((this.bodies == null) || (this.bodies.Length != frame.BodyCount))
                        {
                            this.bodies = new Body[frame.BodyCount];
                        }
                        frame.GetAndRefreshBodyData(this.bodies);

                        Console.Clear();

                        // Guard against the body count and drawer count differing.
                        var count = Math.Min(this.bodies.Length, this.bodyDrawers.Length);

                        for (int i = 0; i < count; i++)
                        {
                            if (this.bodies[i].IsTracked)
                            {
                                this.bodyDrawers[i].DrawFrame(
                                    this.bodies[i],
                                    this.sensor.CoordinateMapper,
                                    this.DepthFrameSize);
                            }
                        }
                    }
                }
            }
            public void ReleaseSensor()
            {
                this.sensor.Close();
                this.sensor = null;
            }
            Body[] bodies;
            KinectSensor sensor;
            BodyFrameReader reader;
            IBodyDrawer[] bodyDrawers;

            static ConsoleColor[] colours =
            {
                ConsoleColor.Red,
                ConsoleColor.White,
                ConsoleColor.Green,
                ConsoleColor.Yellow,
                ConsoleColor.Cyan,
                ConsoleColor.Gray
            };
        }
    }
and the concrete implementation of the IBodyDrawer interface that I knocked up to draw to the console looks like;
    using ConsoleExtensions;
    using Microsoft.Kinect;
    using System;
    using System.Collections.Generic;
    using System.Text;

    namespace ConsoleApplication8
    {
        class ConsoleBodyDrawer : IBodyDrawer
        {
            public ConsoleBodyDrawer()
            {
                this.Color = ConsoleColor.Green;
            }
            public ConsoleColor Color { get; set; }

            public void DrawFrame(Body body, CoordinateMapper mapper, Rect depthFrameSize)
            {
                this.Resize();

                foreach (var jointType in interestedJointTypes)
                {
                    var joint = body.Joints[jointType];

                    if (joint.TrackingState != TrackingState.NotTracked)
                    {
                        var cameraPosition = joint.Position;

                        DepthSpacePoint depthPosition =
                            mapper.MapCameraPointToDepthSpace(cameraPosition);

                        if (!float.IsNegativeInfinity(depthPosition.X))
                        {
                            var consolePosition =
                                MapDepthPointToConsoleSpace(depthPosition, depthFrameSize);

                            ConsoleEx.DrawAt(
                                consolePosition.Item1,
                                consolePosition.Item2,
                                jointType.ToString().Substring(0,1),
                                joint.TrackingState == TrackingState.Inferred
                                    ? ConsoleColor.Gray : this.Color);
                        }
                    }
                }
            }
            void Resize()
            {
                if (!resized)
                {
                    Console.SetWindowSize(Constants.ConsoleWidth, Constants.ConsoleHeight);
                    resized = true;
                }
            }
            static Tuple<int, int> MapDepthPointToConsoleSpace(
                DepthSpacePoint depthPoint, Rect depthFrameSize)
            {
                var left = (int)((depthPoint.X / (double)depthFrameSize.Width) *
                    (Constants.ConsoleWidth - 1));

                var top = (int)((depthPoint.Y / (double)depthFrameSize.Height) *
                    (Constants.ConsoleHeight - 1));

                left = Math.Min(left, Constants.ConsoleWidth - 1);
                top = Math.Min(top, Constants.ConsoleHeight - 1);

                left = Math.Max(0, left);
                top = Math.Max(0, top);

                return (Tuple.Create(left, top));
            }
            static JointType[] interestedJointTypes =
            {
                JointType.Head,
                JointType.Neck,
                JointType.ShoulderLeft,
                JointType.ShoulderRight,
                JointType.HandLeft,
                JointType.HandRight,
                JointType.ElbowLeft,
                JointType.ElbowRight,
                JointType.HipLeft,
                JointType.HipRight,
                JointType.KneeLeft,
                JointType.KneeRight,
                JointType.AnkleLeft,
                JointType.AnkleRight,
                JointType.FootLeft,
                JointType.FootRight
            };
            static bool resized = false;
        }
    }
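To make the scaling in MapDepthPointToConsoleSpace a bit more concrete, here is a small standalone sketch of the same arithmetic with the sizes passed in as parameters so that it runs without the SDK. The 512x424 depth frame is the Kinect V2 depth resolution; the 80x50 console size is purely an assumption for illustration (the real values live in the Constants class, which isn’t shown above), and `ConsoleMapping.Map` is a made-up name.

```csharp
using System;

static class ConsoleMapping
{
    // Scales a depth-space point into console cell coordinates and clamps
    // the result so it always lands inside the console window.
    public static Tuple<int, int> Map(
        float x, float y,
        int depthWidth, int depthHeight,
        int consoleWidth, int consoleHeight)
    {
        var left = (int)((x / (double)depthWidth) * (consoleWidth - 1));
        var top = (int)((y / (double)depthHeight) * (consoleHeight - 1));

        left = Math.Max(0, Math.Min(left, consoleWidth - 1));
        top = Math.Max(0, Math.Min(top, consoleHeight - 1));

        return Tuple.Create(left, top);
    }
}

class Demo
{
    static void Main()
    {
        // Centre of a 512x424 depth frame on an (assumed) 80x50 console.
        var centre = ConsoleMapping.Map(256f, 212f, 512, 424, 80, 50);
        Console.WriteLine("{0},{1}", centre.Item1, centre.Item2); // prints "39,24"

        // A point outside the frame is clamped to the nearest edge cell.
        var outside = ConsoleMapping.Map(600f, -10f, 512, 424, 80, 50);
        Console.WriteLine("{0},{1}", outside.Item1, outside.Item2); // prints "79,0"
    }
}
```

The clamping matters because a joint can be reported slightly outside the depth frame, and drawing outside the console window would otherwise throw.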
That’s it – a staggeringly small amount of pretty high level code required to get this sort of data out of the sensor and into any kind of application that wants to do something with it.
I’ve posted the source here for download. It was put together pretty quickly so (as always) keep that in mind if you’re using it. There’s also a bit of code in there, conditionally compiled out (#if ZERO), which attempts to display a bit more info about the first body that the sensor reports it is tracking.