Kinect for Windows V2 SDK: Hello (Skeletal) World for the JavaScript Windows 8.1 App Developer

Following on from my previous posts;

and, again, highlighting the official videos and samples for the Kinect for Windows V2 SDK bits;

Programming-Kinect-for-Windows-v2

I thought I’d continue my own journey along the Windows 8.1 app path. In the previous post, I talked about the Kinect SDK APIs being WinRT APIs which means that they are available to app developers working in the different technologies for building Windows 8.1 native apps – namely C++, .NET and JavaScript.

In the previous post, I moved my C#/.NET code across from the desktop world of WPF into the Windows Store app world of WinRT.

In this post, I thought I’d see what it was like to take that port and move it away from .NET altogether and build similar functionality in JavaScript.

In doing that, I should say that I’m a bit of a basic-level JavaScript developer and I haven’t been writing much JavaScript in the past few months. I can write it but it’s a little like speaking a foreign language to me and I have to exert more energy into thinking about how to get something expressed.

The other thing I’d say is that because I already had some C# code, I took the approach of largely porting it to JavaScript rather than re-thinking the whole thing, so the JavaScript code ended up following a similar structure to my C# code. The last point I’d add is that I brought in little pieces of the WinJS library in order to;

  1. Get the app up and running and to provide an AppBar control that I can use to place a few buttons.
  2. Provide the infrastructure for defining JavaScript “classes” – i.e. to provide a bit of a veneer of constructors/instance members/static members over the mechanisms that exist in JavaScript.

There’s absolutely no need to make use of WinJS to build out Windows Store app code in JavaScript – you could leave WinJS out of the picture altogether.
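By way of illustration (this isn’t code from the app, just a sketch of the pattern), the same sort of “class plus namespace” structure can be put together in plain JavaScript;

(function ()
{
  "use strict";

  // a plain constructor function takes the place of WinJS.Class.define...
  function Widget(name)
  {
    this._name = name;
  }

  // ...with 'instance' members added to the prototype...
  Widget.prototype.describe = function ()
  {
    return ('widget: ' + this._name);
  };

  // ...and 'static' members added to the constructor function itself...
  Widget.create = function (name)
  {
    return (new Widget(name));
  };

  // ...and a hand-rolled 'namespace' in place of WinJS.Namespace.define.
  window.Sample = window.Sample || {};
  window.Sample.Widget = Widget;

})();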

Here’s the app up and running, tracking a skeleton with smooth performance.

In terms of getting this going, I made a blank JavaScript app project and made sure that I configured my app’s manifest to allow access to the webcam and microphone (this is important and easy to forget; I’ve forgotten it a few times and then scratched my head);

[screenshot: enabling the Webcam and Microphone capabilities in the app manifest]
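For reference, in the Package.appxmanifest XML itself those two capabilities end up as something along these lines (the screenshot above is just the designer view over the same markup);

<Capabilities>
  <DeviceCapability Name="microphone" />
  <DeviceCapability Name="webcam" />
</Capabilities>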

and, just like in .NET, I added a “reference” to the WinRT library for Kinect for Windows V2;

[screenshot: adding a reference to the Kinect for Windows V2 WinRT library]

and, just like in .NET, this means changing to a project configuration that’s targeting a specific processor architecture so I did that via the Configuration Manager;

[screenshot: Configuration Manager targeting a specific processor architecture]

and then I set up a basic UI with a piece of HTML that contains a Canvas element and a div that hosts a WinJS AppBar control. On the AppBar I placed the same four buttons that I’d used in my previous blog post and wired them up to call particular functions.

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <title>App200</title>

    <script src="//Microsoft.WinJS.2.0/js/base.js"></script>

    <link href="/css/default.css" rel="stylesheet" />
    <script src="js/Iterable.js"></script>
    <script src="js/JointConnection.js"></script>
    <script src="js/CanvasBodyDrawer.js"></script>
    <script src="js/KinectControl.js"></script>
    <script src="js/UIHandler.js"></script>
    <script src="js/default.js"></script>
    <script src="//Microsoft.WinJS.2.0/js/ui.js" type="text/javascript"></script>
    <link href="//Microsoft.WinJS.2.0/css/ui-dark.css" rel="stylesheet" type="text/css" />
</head>
<body>
    <!-- NB: setting this to 1920x1080 but CSS then scales it to the available space -->
    <!-- took some direction from http://stackoverflow.com/questions/2588181/canvas-is-stretched-when-using-css-but-normal-with-width-height-properties -->
    <canvas id="drawCanvas" width="1920" height="1080">
    </canvas>
    <div id="appBar" data-win-control="WinJS.UI.AppBar" data-win-options="{ sticky:true }">
        <button data-win-control="WinJS.UI.AppBarCommand"
                onclick="Sample.UIHandler.onGetSensor(document.getElementById('drawCanvas'))"
                data-win-options="{icon:'camera', label:'get sensor', section:'global', type:'button'}"></button>
        <button data-win-control="WinJS.UI.AppBarCommand"
                onclick="Sample.UIHandler.onOpenReader()"
                data-win-options="{icon:'play', label:'open reader', section:'global', type:'button'}"></button>
        <button data-win-control="WinJS.UI.AppBarCommand"
                onclick="Sample.UIHandler.onCloseReader()"
                data-win-options="{icon:'stop', label:'close reader', section:'global', type:'button'}"></button>
        <button data-win-control="WinJS.UI.AppBarCommand"
                onclick="Sample.UIHandler.onReleaseSensor()"
                data-win-options="{icon:'closepane', label:'release sensor', section:'global', type:'button'}"></button>
    </div>
</body>
</html>

This markup lives in my default.html file, which becomes the start (and only) page for my project.

One thing I’d say about it is that I spent a little time figuring out how the HTML5 Canvas element deals with the difference between its stated width and height attributes and the actual width and height that the element ends up with at runtime once any CSS has been applied to it. Without a little care, the Canvas can end up with a large actual size while still working in its default coordinate space of 300×150, scaling its content up and making for a low-fidelity, pixelated display.

I took some guidance on that from this post;

http://stackoverflow.com/questions/2588181/canvas-is-stretched-when-using-css-but-normal-with-width-height-properties 
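The CSS itself isn’t listed in this post but, purely as a sketch of the idea, something along these lines stretches the displayed element while the canvas keeps working in its stated 1920x1080 coordinate space;

/* illustrative only: scale the canvas element to fill the available space */
/* while the 2D drawing context continues to use 1920x1080 coordinates */
#drawCanvas {
  width: 100%;
  height: 100%;
}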

Beyond that, this file includes my default.js file which does very little (it even skips the issue of app lifecycle management) other than making sure that WinJS processes the WinJS controls on the page (in my case, the AppBar);

(function ()
{
  "use strict";

  var app = WinJS.Application;
  var activation = Windows.ApplicationModel.Activation;

  app.onactivated = function (args)
  {
    if (args.detail.kind === activation.ActivationKind.launch)
    {
      var promise = WinJS.UI.processAll();

      promise.done(
        function()
        {
          var appBar = document.getElementById('appBar');
          appBar.winControl.show();
        }
      );

      args.setPromise(promise);
    }
  };

  app.start();

})();

So, there’s really not much to it. The button handlers specified in my default.html file perform a very similar role to the “code behind” in my previous XAML-based post in that they delegate all the work down to a KinectControl class that I’ve placed into a “namespace” called Sample, which is also where I place this UIHandler class;

(function ()
{
  "use strict";

  var UIHandler = WinJS.Class.define(
    function ()
    {
      this._controller = new Sample.KinectControl(
        function ()
        {
          return (new Sample.CanvasBodyDrawer());
        },
        Sample.CanvasBodyDrawer.clearFrames
      );
    },
    {
      onGetSensor: function (canvas)
      {
        Sample.CanvasBodyDrawer.canvas = canvas;
        this._controller.getSensor();
      },
      onOpenReader: function ()
      {
        this._controller.openReader();
      },
      onCloseReader: function ()
      {
        this._controller.closeReader();
      },
      onReleaseSensor: function ()
      {
        this._controller.releaseSensor();
      },
      _controller: null
    }
  );

  WinJS.Namespace.define(
    'Sample',
    {
      UIHandler: new UIHandler()
    });

})();

and so here I publish an instance into the global namespace as Sample.UIHandler, which provides methods for the UI buttons to call and largely delegates them down to an instance of a Sample.KinectControl object. That object is constructed with two dependencies;

  • A factory function that creates a component capable of drawing a body.
  • A function that clears the canvas.

If you’ve not seen WinJS.Class.define before, its three arguments are essentially (constructor, object containing “instance” members, object containing “static” members), and WinJS.Namespace.define just publishes objects into a “global” namespace, creating Sample.UIHandler in this case.
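As a tiny, standalone illustration of that shape (this isn’t code from the app);

var Counter = WinJS.Class.define(
  // 1) constructor
  function (start)
  {
    this._value = start;
  },
  // 2) 'instance' members
  {
    increment: function ()
    {
      this._value++;
      return (this._value);
    },
    _value: 0
  },
  // 3) 'static' members
  {
    createFromZero: function ()
    {
      return (new Counter(0));
    }
  }
);

WinJS.Namespace.define('Sample',
  {
    Counter: Counter
  });

The UIHandler above makes use of a KinectControl class which is a very literal port of a class I had in the .NET world;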

(function ()
{
  "use strict";

  var nsKinect = WindowsPreview.Kinect;

  var constants = {
    bodyCount : 6
  };

  var kinectControl = WinJS.Class.define(
    function (bodyDrawerFactory, clearCanvas)
    {
      this._bodyDrawerFactory = bodyDrawerFactory;
      this._clearCanvas = clearCanvas;
    },
    {
      getSensor : function()
      {
        var bodyCount = 0;

        this._sensor = nsKinect.KinectSensor.getDefault();
        this._sensor.open();

        this._bodies = new Array(constants.bodyCount);
        this._bodyDrawers = new Array(constants.bodyCount);

        for (bodyCount = 0; bodyCount < constants.bodyCount; bodyCount++)
        {
          this._bodyDrawers[bodyCount] = this._bodyDrawerFactory();
          this._bodyDrawers[bodyCount].init(bodyCount, this._sensor);
        }
      },
      openReader : function()
      {
        // keep hold of the bound handler so that closeReader can later remove
        // exactly the same function reference from the reader
        this._boundHandler = this._onFrameArrived.bind(this);
        this._reader = this._sensor.bodyFrameSource.openReader();
        this._reader.addEventListener('framearrived', this._boundHandler);
      },
      closeReader : function()
      {
        this._reader.removeEventListener('framearrived', this._boundHandler);
        this._reader.close();
        this._reader = null;
      },
      releaseSensor : function()
      {
        this._sensor.close();
        this._sensor = null;
      },
      _onFrameArrived : function(e)
      {
        var frame = e.frameReference.acquireFrame();
        var i = 0;

        if (frame)
        {
          this._clearCanvas();

          // copy the latest body data into our pre-allocated array of bodies
          frame.getAndRefreshBodyData(this._bodies);

          for (i = 0; i < constants.bodyCount; i++)
          {
            if (this._bodies[i].isTracked)
            {
              this._bodyDrawers[i].drawFrame(this._bodies[i]);
            }
          }
          frame.close();
        }
      },
      _boundHandler:null,
      _clearCanvas: null,
      _bodyDrawerFactory : null,
      _sensor: null,
      _reader: null,
      _bodyDrawers: null,
      _bodies : null
    }
  );

  WinJS.Namespace.define('Sample',
    {
      KinectControl : kinectControl
    }
  );

})();

This class takes a little factory function at construction time and uses it to create 6 “body drawer” instances which take responsibility for drawing any/all of the bodies being tracked by the sensor onto the screen in different colours. It also takes a function that knows how to clear the canvas.

That clear-canvas function is a bit simpler than what I used in the previous post in the .NET world because of differences between the XAML Canvas and the HTML5 Canvas. The XAML Canvas is a collection of child elements that draw themselves, whereas the HTML5 Canvas is more of a direct-draw surface, and I don’t think it’s so easy to ask an HTML5 Canvas to remove things that have previously been drawn, although you could try overwriting them with the background colour or clearing specific rectangles.

I didn’t take that route. The implication is that, whereas in the XAML world I added elements to the Canvas and later removed them, in the HTML5 world I clear the entire Canvas once per frame (rather than once per body) and then redraw up to 6 tracked bodies, which is what happens in the _onFrameArrived method above.
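For what it’s worth, the two options look something like this on an HTML5 Canvas (an illustrative snippet rather than code from the app);

var canvas = document.getElementById('drawCanvas');
var context = canvas.getContext('2d');

// option 1: wipe the whole canvas between frames (the approach taken here)
context.clearRect(0, 0, canvas.width, canvas.height);

// option 2: rub out only a known region that was drawn previously, e.g. the
// bounding box of a joint circle drawn at (x, y) with radius r and a 3px stroke
var x = 200, y = 150, r = 30, pad = 3;
context.clearRect(x - r - pad, y - r - pad, (r + pad) * 2, (r + pad) * 2);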

The CanvasBodyDrawer class is shown below and is, again, a fairly literal port of the same class from the previous, .NET-based post;

(function ()
{
  "use strict";

  var nsKinect = WindowsPreview.Kinect;

  var constants =
  {
    circleLeafRadius: 30,
    circleNonLeafRadius: 10,
    lineWidth: 3
  };

  var canvasBodyDrawer = WinJS.Class.define(
    function ()
    {
    },
    {
      init: function (index, sensor)
      {
        this._index = index;
        this._sensor = sensor;
        this._sensorColourFrameDimensions = {};

        this._sensorColourFrameDimensions.width =
          this._sensor.colorFrameSource.frameDescription.width;

        this._sensorColourFrameDimensions.height =
          this._sensor.colorFrameSource.frameDescription.height;
      },
      drawFrame: function (body)
      {
        // could almost certainly get this all done in one pass.
        var jointPositions = this._drawJoints(body);

        this._drawLines(jointPositions);
      },
      _drawJoints: function(body)
      {
        var that = this;
        var jointPositions = {};

        Iterable.forEach(body.joints,
          function (keyValuePair)
          {
            var jointType = keyValuePair.key;
            var joint = keyValuePair.value;
            var isTracked = joint.trackingState === nsKinect.TrackingState.tracked;    
            var mappedPoint = that._mapPointToCanvasSpace(joint.position);
            var context = canvasBodyDrawer.canvas.getContext('2d');

            if (that._isJointForDrawing(joint, mappedPoint))
            {
              context.fillStyle =
                isTracked ?
                canvasBodyDrawer._colors[that._index] : canvasBodyDrawer._inferredColor;

              context.beginPath();

              // draw a full circle (start angle 0, end angle 2 * PI)
              context.arc(
                mappedPoint.x,
                mappedPoint.y,
                that._isLeaf(jointType) ? constants.circleLeafRadius : constants.circleNonLeafRadius,
                0,
                2 * Math.PI,
                false);

              context.fill();
              context.stroke();
              context.closePath();

              jointPositions[jointType] = mappedPoint;
            }
          }
        );
        return (jointPositions);
      },
      _drawLines: function(jointPositions)
      {
        var that = this;
        var context = canvasBodyDrawer.canvas.getContext('2d');

        // setting some of these properties way more often than necessary.
        context.strokeStyle = canvasBodyDrawer._lineStyle;
        context.lineWidth = constants.lineWidth;

        canvasBodyDrawer._jointConnections.forEach(
          function (jointConnection)
          {
            jointConnection.forEachPair(
              function (j1, j2)
              {
                // do we have this pair recorded in our positions? 
                // i.e. have we drawn them?
                if (jointPositions[j1] && jointPositions[j2])
                {
                  context.beginPath();
                  context.moveTo(jointPositions[j1].x, jointPositions[j1].y);
                  context.lineTo(jointPositions[j2].x, jointPositions[j2].y);
                  context.stroke();
                  context.closePath();
                }
              }
            );
          }
        );
      },
      _isLeaf: function(jointType)
      {
        var leafs = [nsKinect.JointType.head, nsKinect.JointType.footLeft, nsKinect.JointType.footRight];
        return (leafs.indexOf(jointType) !== -1);
      },
      _isJointForDrawing: function(joint, point)
      {
        // skip joints that aren't tracked at all or that the coordinate mapper
        // couldn't map (unmappable points come back as negative infinity)
        return (
          (joint.trackingState !== nsKinect.TrackingState.notTracked) &&
          (point.x !== Number.NEGATIVE_INFINITY) &&
          (point.y !== Number.NEGATIVE_INFINITY));
      },
      _mapPointToCanvasSpace: function(cameraSpacePoint)
      {
        // NB: with the way I've set up my canvas in this example (1920x1080), this should be
        // a 1:1 mapping but leaving the flexibility here.
        var colourPoint = this._sensor.coordinateMapper.mapCameraPointToColorSpace(
          cameraSpacePoint);

        colourPoint.x *= canvasBodyDrawer.canvas.width / this._sensorColourFrameDimensions.width;
        colourPoint.y *= canvasBodyDrawer.canvas.height / this._sensorColourFrameDimensions.height;

        return (colourPoint);
      },
      _index : -1,
      _sensorColourFrameDimensions : null,
      _sensor : null
    },
    {
      clearFrames : function()
      {
        var canvas = canvasBodyDrawer.canvas;
        var ctx = canvas.getContext('2d');

        ctx.clearRect(0, 0, canvas.width, canvas.height);
      },
      // shared 'static' canvas that every drawer instance renders to; it's set
      // from UIHandler.onGetSensor when the 'get sensor' button is pressed
      canvas: {
        get : function()
        {
          return (canvasBodyDrawer._canvas);
        },
        set : function(value)
        {
          canvasBodyDrawer._canvas = value;
        }
      },
      _canvas : null,
      _colors: ['red', 'green', 'blue', 'yellow', 'purple', 'orange'],
      _lineColor: 'black',
      _inferredColor: 'grey',
      _lineStyle : 'black',
      _jointConnections:
        [
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.spineBase, 2),
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.shoulderLeft, 4),
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.shoulderRight, 4),
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.hipLeft, 4),
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.hipRight, 4),
          Sample.JointConnection.createFromStartingJoint(nsKinect.JointType.neck, 2),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.spineMid, nsKinect.JointType.spineShoulder, nsKinect.JointType.neck),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.shoulderLeft, nsKinect.JointType.spineShoulder, nsKinect.JointType.shoulderRight),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.hipLeft, nsKinect.JointType.spineBase, nsKinect.JointType.hipRight),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.handTipLeft, nsKinect.JointType.handLeft),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.handTipRight, nsKinect.JointType.handRight),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.wristLeft, nsKinect.JointType.thumbLeft),
          Sample.JointConnection.createFromJointList(nsKinect.JointType.wristRight, nsKinect.JointType.thumbRight)
        ]
    }
  );

  WinJS.Namespace.define('Sample',
    {
      CanvasBodyDrawer : canvasBodyDrawer
    });

})();

Hopefully, that code mostly speaks for itself. For a particular body being tracked by the sensor, an instance of this class draws that body in its drawFrame function by first drawing the joints and then connecting them together with lines (this could easily be combined into one pass rather than two). The class uses a different colour from its _colors array depending on which of the 0..5 bodies it represents, which does mean that a single person in front of the sensor could be picked up and drawn as body 4 in one colour, leave the frame and then return as body 2 in a different colour. I make no attempt to figure out that it’s “the same person”.

Just as in my .NET post, this class relies on a JointConnection class to represent sets of joints that need joining with lines and, just like in that post, I play a little fast-and-loose with the values of the JointType enumeration to build up lists of joints that should be connected together;

(function ()
{
  "use strict";
  
  var jointConnection = WinJS.Class.define(
    function ()
    {
      this._joints = [];
    },
    {
      forEachPair : function(handler)
      {
        for (var i = 0; i < this._joints.length - 1; i++)
        {
          handler(this._joints[i], this._joints[i + 1]);
        }
      },
      _joints: null
    },
    {
      createFromStartingJoint : function(jointType, range)
      {
        var connection = new jointConnection();

        for (var i = 0; i < range; i++)
        {
          connection._joints.push(jointType + i);
        }

        return (connection);
      },
      createFromJointList : function()
      {
        var connection = new jointConnection();

        for (var i = 0; i < arguments.length; i++)
        {
          connection._joints.push(arguments[i]);
        }
        return (connection);
      }
    }
  );

  WinJS.Namespace.define('Sample',
    {
      JointConnection : jointConnection
    });

})();
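To make that fast-and-loose part concrete, createFromStartingJoint simply assumes that a run of JointType values is numbered consecutively. For example (and assuming the usual Kinect V2 values where shoulderLeft, elbowLeft, wristLeft and handLeft are consecutive), the left arm entry in the _jointConnections list above works out as;

// assumes shoulderLeft, elbowLeft, wristLeft and handLeft have consecutive
// values in WindowsPreview.Kinect.JointType (4, 5, 6, 7 in the V2 SDK)
var leftArm = Sample.JointConnection.createFromStartingJoint(
  WindowsPreview.Kinect.JointType.shoulderLeft, 4);

leftArm.forEachPair(
  function (j1, j2)
  {
    // visits shoulderLeft/elbowLeft, elbowLeft/wristLeft, wristLeft/handLeft
    console.log(j1 + ' -> ' + j2);
  });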

and that’s pretty much it, apart from a couple of tiny other source files in the project.
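One of those tiny files is the Iterable.js helper that CanvasBodyDrawer._drawJoints uses to walk body.joints. I haven’t listed it here, but a minimal sketch of the sort of thing it needs to do (assuming the first()/hasCurrent/moveNext() shape that projected WinRT collections get in JavaScript) would be;

(function ()
{
  "use strict";

  window.Iterable = window.Iterable || {};

  // walk a projected WinRT iterable (e.g. the map behind body.joints),
  // calling back with each item (for a map, each item exposes .key and .value)
  window.Iterable.forEach = function (iterable, callback)
  {
    var iterator = iterable.first();

    while (iterator.hasCurrent)
    {
      callback(iterator.current);
      iterator.moveNext();
    }
  };

})();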

I suspect that I could boil this code down to much, much less JavaScript if I took out all the WinJS.Class pieces, removed some of the structure and took a more direct approach to getting the bodies drawn on-screen. In some ways, I feel I’ve bloated the JavaScript a little by porting over from my .NET code rather than starting afresh in JavaScript.

Regardless, it’s great to see that I can get this going in around 90-120 minutes and have smooth performance rendering skeletal data from the sensor directly onto the screen. I suspect that strong JavaScript support here opens up the sensor to a lot of developers who might otherwise find C++ or C# too high a barrier to entry.

The code for the app above is here for download; feel free to grab it, poke around and borrow any bits.