Following up on that previous post around Win2D, I wondered what it would be like to use Win2D in an app that was also connecting to the Kinect sensor, so I tried out a couple of things there.
In the first instance, I tried to replace the WriteableBitmap that I'd previously used to display frames of data with the new CanvasControl from the Win2D APIs. To experiment, I made a blank Windows 8.1 application and included both the Win2D and Kinect SDK pieces;
and then I simply hacked my MainPage.xaml to include a CanvasControl. Unlike the previous post where I did all this in code, I’m now doing this declaratively in XAML;
and I then tried to write a little code-behind which would grab the images from the Kinect colour sensor at 1920×1080 and write them into this CanvasControl as below;
namespace App257
{
    using Microsoft.Graphics.Canvas;
    using System;
    using Windows.UI.Xaml;
    using Windows.UI.Xaml.Controls;
    using WindowsPreview.Kinect;

    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += OnLoaded;
        }
        void OnLoaded(object sender, RoutedEventArgs e)
        {
            this.threadId = Environment.CurrentManagedThreadId;
            this.sensor = KinectSensor.GetDefault();
            this.sensor.Open();
            this.reader = this.sensor.ColorFrameSource.OpenReader();
            this.reader.FrameArrived += OnFrameArrived;
            this.buffer = new byte[
                this.sensor.ColorFrameSource.FrameDescription.LengthInPixels * 4]; // BGRA == 4
        }
        void OnFrameArrived(ColorFrameReader sender, ColorFrameArrivedEventArgs args)
        {
            this.CheckThreadId();

            if (args.FrameReference != null)
            {
                using (ColorFrame frame = args.FrameReference.AcquireFrame())
                {
                    if (frame != null)
                    {
                        // NB: The default format that the SDK gives me here is Yuy2 and if I could
                        // use that then it would, presumably, mean that I don't have to do this
                        // conversion here.
                        // However...when I ask the Win2D bits below to use Yuy2 at the call to
                        // CanvasBitmap.CreateFromBytes it gives me some kind of "unsupported
                        // format" exception. Hence, I'm using BGRA and converting.
                        frame.CopyConvertedFrameDataToArray(this.buffer, ColorImageFormat.Bgra);
                        this.drawCanvas.Invalidate();
                    }
                }
            }
        }
        void OnDrawCanvas(CanvasControl sender, CanvasDrawEventArgs args)
        {
            this.CheckThreadId();

            using (var session = args.DrawingSession)
            {
                if (this.buffer != null)
                {
                    using (this.bitmap = CanvasBitmap.CreateFromBytes(
                        session.Device,
                        this.buffer,
                        this.sensor.ColorFrameSource.FrameDescription.Width,
                        this.sensor.ColorFrameSource.FrameDescription.Height,
                        Microsoft.Graphics.Canvas.DirectX.DirectXPixelFormat.B8G8R8A8UIntNormalized,
                        CanvasAlphaBehavior.Premultiplied,
                        96))
                    {
                        session.DrawImage(this.bitmap);
                    }
                }
            }
        }
        void CheckThreadId()
        {
            if (Environment.CurrentManagedThreadId != this.threadId)
            {
                throw new InvalidOperationException();
            }
        }
        int threadId;
        CanvasBitmap bitmap;
        byte[] buffer;
        ColorFrameReader reader;
        KinectSensor sensor;
    }
}
I’m not 100% sure that I got that right but it seems to work out reasonably well.
As you’ll see from the code comments in OnFrameArrived up there, I did experiment a little with trying not to convert the image data, which the Kinect SDK gives me in Yuy2 format. However, I struggled to feed that into a CanvasBitmap for display in a CanvasControl, so I gave up and asked the Kinect SDK for a conversion to a BGRA format, which I then fed into the CanvasBitmap and drew onto the CanvasControl without too much hassle.
In the code above, everything is happening on the UI thread and there’s an assumption that, because of that, the code in OnDrawCanvas which reads from the buffer of image data will never run at the same time as the code in OnFrameArrived which writes to it – otherwise there’s a possibility that one set of code is reading the buffer at the same time that another set of code is writing it. Neither of those methods makes any asynchronous calls (I hope), so that should be the case.
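That CheckThreadId guard in the code is what turns the assumption into something that would fail loudly if it were violated. As a minimal sketch of the same idea outside of C# (the class and names below are mine, not from any SDK), in Python it might look like:

```python
import threading

# Minimal analogue of the post's CheckThreadId guard: remember which thread
# owns the buffer at setup time, then assert that later callbacks arrive on
# that same thread.
class ThreadAffinityGuard:
    def __init__(self):
        self.owner = threading.get_ident()

    def check(self):
        if threading.get_ident() != self.owner:
            raise RuntimeError("callback ran off the owning thread")

guard = ThreadAffinityGuard()
guard.check()  # same thread that created the guard: no exception
```

If a callback ever fired on a worker thread, check() would raise immediately rather than letting a silent read/write race corrupt the buffer.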
That code seemed to give reasonable performance in terms of just standing in front of the sensor, waving my arms around and seeing how it worked out at 1920×1080.
I wasn’t sure about calling the CanvasBitmap.Create* methods with such frequency but, as far as I could tell, there’s no way to overwrite the contents of a CanvasBitmap once it has been created, so I’ve assumed that this is the “right” way to go here.
Having got a video frame drawn, I thought I’d try another form of data by pulling frames from the infra-red sensor which is more or less the same sort of idea;
using Microsoft.Graphics.Canvas;
using System;
using System.Runtime.InteropServices;
using Windows.Storage.Streams;
using Windows.UI;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using WindowsPreview.Kinect;

// You learn something new every day. I found this in the samples. I hadn't realised I could get
// to the bits underneath by using this. Feels wrong to me to have to declare it here in my
// own code though but maybe that'll change as the SDK changes.
[Guid("905a0fef-bc53-11df-8c49-001e4fc686da"),
 InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IBufferByteAccess
{
    unsafe void Buffer(out byte* pByte);
}
public sealed partial class MainPage : Page
{
    public MainPage()
    {
        this.InitializeComponent();
        this.Loaded += OnLoaded;
    }
    void OnLoaded(object sender, RoutedEventArgs e)
    {
        this.sensor = KinectSensor.GetDefault();
        this.sensor.Open();
        this.reader = this.sensor.OpenMultiSourceFrameReader(
            FrameSourceTypes.Infrared);
        this.irFrameDesc = this.sensor.InfraredFrameSource.FrameDescription;
        this.irPixels = new byte[this.irFrameDesc.LengthInPixels * 4];
        this.reader.MultiSourceFrameArrived += OnFrameArrived;
    }
    void OnFrameArrived(MultiSourceFrameReader sender, MultiSourceFrameArrivedEventArgs args)
    {
        if (args.FrameReference != null)
        {
            using (MultiSourceFrame multiFrame = args.FrameReference.AcquireFrame())
            {
                if (multiFrame != null)
                {
                    using (InfraredFrame irFrame = multiFrame.InfraredFrameReference.AcquireFrame())
                    {
                        if (irFrame != null)
                        {
                            IBuffer irBuffer = irFrame.LockImageBuffer();
                            IBufferByteAccess byteAccessIR = (IBufferByteAccess)irBuffer;

                            unsafe
                            {
                                byte* pIR = null;
                                ushort* pIRShorts = null;
                                byteAccessIR.Buffer(out pIR);
                                pIRShorts = (ushort*)pIR;

                                fixed (byte* pIRPixels = this.irPixels)
                                {
                                    var len = this.irFrameDesc.LengthInPixels;

                                    for (int i = 0; i < len; i++)
                                    {
                                        ushort irValue = *(pIRShorts + i);

                                        byte pixelValue = (byte)(
                                            ((float)irValue / ushort.MaxValue) * 0xFF *
                                            EMPIRICAL_SCALING_FACTOR_FROM_YEARS_OF_RESEARCH);

                                        pIRPixels[i * 4] = pixelValue;
                                        pIRPixels[i * 4 + 1] = pixelValue;
                                        pIRPixels[i * 4 + 2] = pixelValue;
                                        pIRPixels[i * 4 + 3] = 0xFF;
                                    }
                                }
                                Marshal.ReleaseComObject(byteAccessIR);
                                Marshal.ReleaseComObject(irBuffer);
                            }
                        }
                        this.drawCanvas.Invalidate();
                    }
                }
            }
        }
    }
    void OnDrawCanvas(CanvasControl sender, CanvasDrawEventArgs args)
    {
        using (var session = args.DrawingSession)
        {
            session.Units = CanvasUnits.Pixels;
            session.Clear(Colors.Black);

            if (this.irPixels != null)
            {
                using (CanvasBitmap irBitmap = CanvasBitmap.CreateFromBytes(
                    session.Device,
                    this.irPixels,
                    this.irFrameDesc.Width,
                    this.irFrameDesc.Height,
                    Microsoft.Graphics.Canvas.DirectX.DirectXPixelFormat.B8G8R8A8UIntNormalized,
                    CanvasAlphaBehavior.Premultiplied,
                    96))
                {
                    session.DrawImage(irBitmap);
                }
            }
        }
    }
    static readonly float EMPIRICAL_SCALING_FACTOR_FROM_YEARS_OF_RESEARCH = 7.5f;
    byte[] irPixels;
    MultiSourceFrameReader reader;
    KinectSensor sensor;
    FrameDescription irFrameDesc;
}
I learned a few things in writing that code and, no doubt, some of them will turn out to be wrong in the future. One thing I learned was that the implementations of IBuffer coming back from the Kinect SDK APIs also seem to implement the IBufferByteAccess COM interface that I declared in my code, which I can then use to pass a byte* by reference and have it populated with a pointer to the underlying bytes.
The values that come from the IR frame are ushort values in the range 0..65535 but if I want to display them as grey-scale then I need to somehow squeeze that range down into 0..255. I played around with that a lot and, ultimately, for my scenario I came up with a magic scale factor of 7.5f to be applied to everything. In the code above it’s defined by the constant EMPIRICAL_SCALING_FACTOR_FROM_YEARS_OF_RESEARCH, which isn’t strictly accurate naming.
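The mapping itself is simple enough to sketch outside the frame loop. Here's a small Python version of the same arithmetic; note that I've added an explicit clamp to 255, which is my own choice – the C# byte cast in the code above doesn't clamp, so heavily over-scaled values behave differently there:

```python
USHORT_MAX = 65535
SCALE = 7.5  # the post's empirical scale factor

def ir_to_gray(ir_value: int) -> int:
    """Map a 16-bit IR sample down to an 8-bit grey-scale value."""
    v = int((ir_value / USHORT_MAX) * 0xFF * SCALE)
    return min(v, 0xFF)  # clamp (my addition); the C# cast does not do this

print(ir_to_gray(0))      # 0
print(ir_to_gray(65535))  # saturates at 255
```

Without the 7.5 boost, a typical indoor IR reading maps to a very dark pixel, which is presumably why the plain 0..255 compression looked too dim.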
Regardless, that code gives me a nice 512×424 bitmap image;
and it’s pretty easy to scale that up so that it occupies my 1920×1080 display by just changing the OnDrawCanvas function to add a ScaleEffect;
void OnDrawCanvas(CanvasControl sender, CanvasDrawEventArgs args)
{
    using (var session = args.DrawingSession)
    {
        session.Units = CanvasUnits.Pixels;
        session.Clear(Colors.Black);

        if (this.irPixels != null)
        {
            using (CanvasBitmap irBitmap = CanvasBitmap.CreateFromBytes(
                session.Device,
                this.irPixels,
                this.irFrameDesc.Width,
                this.irFrameDesc.Height,
                Microsoft.Graphics.Canvas.DirectX.DirectXPixelFormat.B8G8R8A8UIntNormalized,
                CanvasAlphaBehavior.Premultiplied,
                96))
            {
                ScaleEffect scaledIrBitmap = new ScaleEffect()
                {
                    Source = irBitmap,
                    Scale = new Vector2()
                    {
                        X = 1920.0f / this.irFrameDesc.Width,
                        Y = 1080.0f / this.irFrameDesc.Height
                    }
                };
                session.DrawImage(scaledIrBitmap);
            }
        }
    }
}
and that works quite nicely although, clearly, I’m being wasteful in the number of temporary objects that I’m recreating when I could be keeping them around, and I’m also being pretty heavy-handed in not preserving the aspect ratio of the image – but a full-screen image pops out nonetheless;
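The aspect-ratio distortion falls straight out of the arithmetic: the X and Y scale factors computed in that ScaleEffect differ, assuming the 512×424 IR frame and the 1920×1080 target from the post. A quick check:

```python
# Scale factors as computed in the ScaleEffect above.
IR_W, IR_H = 512, 424        # Kinect v2 IR frame size from the post
TARGET_W, TARGET_H = 1920, 1080

sx = TARGET_W / IR_W   # horizontal stretch
sy = TARGET_H / IR_H   # vertical stretch

print(sx)              # 3.75
print(round(sy, 3))    # 2.547 - smaller, so the image gets squashed vertically
```

A uniform scale of min(sx, sy) with letterboxing would preserve the aspect ratio, at the cost of not filling the whole screen.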
Now…at this point I wondered whether I could do something like the fairly common green-screen scenario by combining this data with the body frame index data, which tells me whether a pixel in the image relates to a body being tracked by the sensor. What I’m hoping to do is something like;
- Capture the IR image as above.
- Capture the “Body Frame Index” image, which is a bitmap the same size as the IR image but where each pixel is a single byte storing 0xFF for “no body present” or a value of 0–5 identifying which tracked body the pixel belongs to.
- Multiply the two together to produce the typical “green screen” image – i.e. only the pixels that make up the bodies being tracked.
- Apply some kind of effect to the image produced in (3).
- Overlay the image produced in (4) on the original IR image from (1).
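The steps above can be sketched on a toy 1-D “image” to show the shape of the pipeline before getting into the Win2D/Kinect specifics. The function names here are mine, and the per-pixel overlay in step (5) is a simplification – a real blur bleeds across the mask boundary, which this sketch ignores:

```python
def body_mask(body_index):
    # Kinect body-index convention: 0xFF means "no body"; 0-5 identify a body.
    return [0 if b == 0xFF else 1 for b in body_index]

def composite(ir, body_index, effect):
    mask = body_mask(body_index)
    # Step 3: multiply to keep only body pixels (the "green screen" cut-out).
    cutout = [p * m for p, m in zip(ir, mask)]
    # Step 4: apply some effect to the cut-out (a blur, in the post).
    processed = effect(cutout)
    # Step 5: overlay the processed cut-out on the original IR image.
    return [proc if m else orig
            for orig, proc, m in zip(ir, processed, mask)]

ir = [10, 200, 210, 220, 20]          # toy IR row: dim background, bright body
bodies = [0xFF, 0, 0, 0, 0xFF]        # body index row: body 0 in the middle
brighten = lambda px: [min(p + 5, 255) for p in px]  # stand-in "effect"
print(composite(ir, bodies, brighten))
# -> [10, 205, 215, 225, 20]: body pixels got the effect, background untouched
```

In the real code the multiply and the effect both run on the GPU as chained Win2D effects rather than per-pixel loops like this.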
In doing this, I was thinking about those kinds of scenarios where you want to capture some video of someone walking past a camera/sensor but remove the identity of the individual by blurring out their details, much like you see in a technology like Bing/Google Maps.
Here’s where I ended up;
namespace App257
{
    using Microsoft.Graphics.Canvas;
    using Microsoft.Graphics.Canvas.Effects;
    using System;
    using System.Diagnostics;
    using System.Numerics;
    using System.Runtime.InteropServices;
    using Windows.Storage.Streams;
    using Windows.UI;
    using Windows.UI.Xaml;
    using Windows.UI.Xaml.Controls;
    using WindowsPreview.Kinect;

    // You learn something new every day. I found this in the samples. I hadn't realised I could get
    // to the bits underneath by using this. Feels wrong to me to have to declare it here in my
    // own code though but maybe that'll change as the SDK changes.
    [Guid("905a0fef-bc53-11df-8c49-001e4fc686da"),
     InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    interface IBufferByteAccess
    {
        unsafe void Buffer(out byte* pByte);
    }
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += OnLoaded;
        }
        void OnLoaded(object sender, RoutedEventArgs e)
        {
            this.sensor = KinectSensor.GetDefault();
            this.sensor.Open();

            // We're now gathering both the body index data and the infrared data.
            this.reader = this.sensor.OpenMultiSourceFrameReader(
                FrameSourceTypes.BodyIndex | FrameSourceTypes.Infrared);

            // I'll need these frame descriptions later for their sizes although, strictly,
            // they are the same.
            this.bodyIndexFrameDescription = this.sensor.BodyIndexFrameSource.FrameDescription;
            this.irFrameDescription = this.sensor.InfraredFrameSource.FrameDescription;

            // array to store the infrared pixels in - BGRA so 4 * the frame pixels size
            this.irPixels = new byte[this.irFrameDescription.LengthInPixels * 4];

            // array to store the body index pixels in - BGRA so 4 * the body index frame pixels size
            this.bodyIndexPixels = new byte[this.bodyIndexFrameDescription.LengthInPixels * 4];

            this.reader.MultiSourceFrameArrived += OnFrameArrived;
        }
        void OnFrameArrived(MultiSourceFrameReader sender, MultiSourceFrameArrivedEventArgs args)
        {
            if (args.FrameReference != null)
            {
                using (MultiSourceFrame multiFrame = args.FrameReference.AcquireFrame())
                {
                    if (multiFrame != null)
                    {
                        using (BodyIndexFrame bodyIndexFrame = multiFrame.BodyIndexFrameReference.AcquireFrame())
                        {
                            using (InfraredFrame irFrame = multiFrame.InfraredFrameReference.AcquireFrame())
                            {
                                if ((bodyIndexFrame != null) && (irFrame != null))
                                {
                                    // lock both the buffers so that we can access them without copying them
                                    // although we are going to copy them but we want to transform them
                                    // as we do so.
                                    // that's because I want a bitmap from the body index data but it
                                    // is single-byte data and, similar for the IR data which is
                                    // 2-byte data.
                                    IBuffer biBuffer = bodyIndexFrame.LockImageBuffer();
                                    IBuffer irBuffer = irFrame.LockImageBuffer();

                                    // using these interfaces lets me grab pointers to the byte buffers
                                    // directly.
                                    IBufferByteAccess byteAccessBodyIndex = (IBufferByteAccess)biBuffer;
                                    IBufferByteAccess byteAccessIR = (IBufferByteAccess)irBuffer;

                                    unsafe
                                    {
                                        byte* pIRBytes = null;
                                        byte* pBodyIndexBytes = null;
                                        ushort* pIRBytesAsShorts = null;

                                        // grab pointers...
                                        byteAccessBodyIndex.Buffer(out pBodyIndexBytes);
                                        byteAccessIR.Buffer(out pIRBytes);

                                        // treat the IR data as an array of ushort, not an array of byte
                                        pIRBytesAsShorts = (ushort*)pIRBytes;

                                        // keep hold of both the IR and the body index bitmaps that we
                                        // are about to build
                                        fixed (byte* pIRPixels = this.irPixels)
                                        {
                                            fixed (byte* pBodyIndexPixels = this.bodyIndexPixels)
                                            {
                                                // len is the same for both IR and body index.
                                                var len = this.bodyIndexFrameDescription.LengthInPixels;

                                                // NB: this code breaks if body index frames are different sizes
                                                // to depth frames as I'm using one loop to drive population of
                                                // both pixel arrays.
                                                for (int i = 0; i < len; i++)
                                                {
                                                    // inflate the body index data from a 1-byte per pixel array to a
                                                    // 4-byte per pixel BGRA array.
                                                    byte bindexValue = pBodyIndexBytes[i] == 0xFF ?
                                                        (byte)0 : (byte)0xFF;

                                                    pBodyIndexPixels[i * 4] = bindexValue;
                                                    pBodyIndexPixels[i * 4 + 1] = bindexValue;
                                                    pBodyIndexPixels[i * 4 + 2] = bindexValue;
                                                    pBodyIndexPixels[i * 4 + 3] = bindexValue;

                                                    // deal with the IR ushort values as previous code.
                                                    ushort irValue = *(pIRBytesAsShorts + i);

                                                    byte pixelValue = (byte)(
                                                        ((float)irValue / ushort.MaxValue) * 0xFF *
                                                        EMPIRICAL_SCALING_FACTOR_FROM_YEARS_OF_RESEARCH);

                                                    pIRPixels[i * 4] = pixelValue;
                                                    pIRPixels[i * 4 + 1] = pixelValue;
                                                    pIRPixels[i * 4 + 2] = pixelValue;
                                                    pIRPixels[i * 4 + 3] = 0xFF;
                                                }
                                            }
                                        }
                                    }
                                    Marshal.ReleaseComObject(byteAccessBodyIndex);
                                    Marshal.ReleaseComObject(byteAccessIR);
                                    Marshal.ReleaseComObject(biBuffer);
                                    Marshal.ReleaseComObject(irBuffer);
                                }
                            }
                            this.drawCanvas.Invalidate();
                        }
                    }
                }
            }
        }
        void OnDrawCanvas(CanvasControl sender, CanvasDrawEventArgs args)
        {
            using (var session = args.DrawingSession)
            {
                session.Units = CanvasUnits.Pixels;
                session.Clear(Colors.Black);

                if (this.bodyIndexPixels != null)
                {
                    // bitmap for the body index data which is in the bodyIndexPixels member
                    // as 4-byte BGRA
                    using (CanvasBitmap bodyIndexBitmap = CanvasBitmap.CreateFromBytes(
                        session.Device,
                        this.bodyIndexPixels,
                        this.bodyIndexFrameDescription.Width,
                        this.bodyIndexFrameDescription.Height,
                        Microsoft.Graphics.Canvas.DirectX.DirectXPixelFormat.B8G8R8A8UIntNormalized,
                        CanvasAlphaBehavior.Premultiplied,
                        96))
                    {
                        using (CanvasBitmap irBitmap = CanvasBitmap.CreateFromBytes(
                            session.Device,
                            this.irPixels,
                            this.irFrameDescription.Width,
                            this.irFrameDescription.Height,
                            Microsoft.Graphics.Canvas.DirectX.DirectXPixelFormat.B8G8R8A8UIntNormalized,
                            CanvasAlphaBehavior.Premultiplied,
                            96))
                        {
                            // multiply the IR bitmap by the body index bitmap - the intention here is to
                            // remove any background pixels that are not part of a tracked body.
                            ArithmeticCompositeEffect irGreenScreenedBitmap = new ArithmeticCompositeEffect()
                            {
                                Source1 = irBitmap,
                                Source2 = bodyIndexBitmap,
                                MultiplyAmount = 1
                            };
                            Vector2 scaleToMyLaptopScreen = new Vector2()
                            {
                                X = 1920.0f / this.irFrameDescription.Width,
                                Y = 1080.0f / this.irFrameDescription.Height
                            };
                            // now draw the original IR bitmap at 1920x1080...
                            session.DrawImage(
                                new ScaleEffect()
                                {
                                    Source = irBitmap,
                                    Scale = scaleToMyLaptopScreen
                                });

                            // now scale the "green screened" bitmap to 1920x1080...
                            ScaleEffect scale = new ScaleEffect()
                            {
                                Source = irGreenScreenedBitmap,
                                Scale = scaleToMyLaptopScreen
                            };

                            // and draw it on top of the IR image with a blur effect applied...
                            session.DrawImage(
                                new GaussianBlurEffect()
                                {
                                    Source = scale,
                                    BlurAmount = 10.0f
                                });
                        }
                    }
                }
            }
        }
        static readonly float EMPIRICAL_SCALING_FACTOR_FROM_YEARS_OF_RESEARCH = 7.5f;
        byte[] irPixels;
        byte[] bodyIndexPixels;
        MultiSourceFrameReader reader;
        KinectSensor sensor;
        FrameDescription bodyIndexFrameDescription;
        FrameDescription irFrameDescription;
    }
}
I tried to comment that code as much as I could rather than write it up in great detail in the post itself. What I liked about this is the “pipeline” type approach of chaining effects together – I see similarities here with the Reactive Extensions (and other “pipeline-like” technologies) and I wonder how those things would combine. It’s perhaps something I’ll come back to.
The overall effect is a sort of “anonymised” IR image, as per below, where the central figure is blurred out in real time;
which I thought was an interesting effect – clearly, you could apply other effects here, but I found it an interesting thing to overlay a cut-out of the original image back over itself, and I guess you could create some kind of nightmare scenario by overlaying it more than once with a little offsetting. As a quick hack, I changed that last drawing code to take out the Gaussian blur effect and instead just overlaid the green-screen image 3 times;
session.DrawImage(scale);
session.DrawImage(scale, -600, 0);
session.DrawImage(scale, 600, 0);
and I quickly had a hall-of-mirrors style 3-man army who all looked a bit like me (in real time, of course);
Watch out – we’re coming for you this Halloween
Meanwhile, if you want the code for this – it’s here for download.