A Follow-On Prague Experiment with Skeletons

A developer dropped me a line having found my previous blog posts around Project Prague;

Project Prague in the Cognitive Services Labs

They’d noticed that it seemed really easy and powerful to define and monitor for gestures with Project Prague but wanted to know where the support was for tracking lower-level data such as hand positions and movement. I suspect that they are looking for something similar to what the Kinect SDK offered: out-of-the-box support for treating a user’s hand as a pointer and driving an on-screen UI with it.

As usual, I hadn’t the foggiest clue about how this might be done, so I thought I’d better take a quick look; this post is the result of a few minutes spent with the APIs and the documentation.

If you haven’t seen Prague at all then I did write a couple of other posts;

Project Prague Posts

and so feel free to have a read of those if you want the background on what I’m posting here and I’ll attempt to avoid repeating what I wrote in those posts.

Project Prague and the UWP

Since I last looked at Project Prague, “significant things” have happened in that the Windows 10 Fall Creators Update has been released and, along with it, support for .NET Standard 2.0 in UWP apps which I just wrote about an hour or two ago in this post;

UWP and .NET Standard 2.0–Remembering the ‘Forgotten’ APIs :-)

These changes mean that I now seem to be free to use Project Prague from inside a UWP app (targeting .NET Standard 2.0 on Windows 16299+). I’m unsure whether this is a supported scenario yet, or what it might mean for an app that wanted to go into the Store but, technically, it seems that I can make use of the Prague SDK from a UWP app and so that’s what I did.

Project Prague and Skeleton Tracking

I revisited the Project Prague documentation and scanned over this one page which covers a lot of ground but mostly focuses on how to get gestures working and doesn’t drop down to the lower-level details.

However, there’s a response to a comment further down the page which does talk in terms of;

“The SDK provides both the high level abstraction of the gestures as they are described in the overview above and also the raw skeleton we produce. The skeleton we produce is ‘light-weight’ namely it exposes the palm & fingertips’ locations and directions vectors (palm also has an orientation vector).

In the slingshot example above, you would want to register to the skeleton event once the slingshot gesture reaches the Pinch state and then track the motion instead of simply expecting a (non negligible) motion backwards as defined above.

Depending on your needs, you could either user the simplistic gesture-states-only approach or weave in the use of raw skeleton stream.

We will followup soon with a code sample in https://aka.ms/gestures/samples that will show how to utilize the skeleton stream”

and that led me back to the sample;

3D Camera Sample

which essentially seems to use gestures as a start/stop mechanism and, in between, makes use of the API;

GesturesServiceEndpoint.RegisterToSkeleton

in order to get raw hand-tracking data including the position of the palm and digits and so it felt like this was the API that I might want to take a look at – it seemed that this might be the key to the question that I got asked.

Alongside discovering this API I also had a look through the document which is targeted at Unity but generally useful;

“3D Object Manipulation”

because it talks about the co-ordinate system that positions, directions etc. are offered in by the SDK and also units;

“The hand-skeleton is provided in units of millimeters, in the following left-handed coordinate system”

although what wasn’t clear to me from the docs was whether I had to think in terms of different distance ranges for the different cameras that the SDK supports. I was using a RealSense SR300 as it is easier to plug in than a Kinect and one of my outstanding questions remains what sort of range of motion in the horizontal and vertical planes I should expect the SDK to be able to track with that camera.

Regardless, I set about trying to put together a simple UWP app that let me move something around on the screen using my hand and the Prague SDK.

Experimenting in a UWP App

I made a new UWP project (targeting 16299) and I referenced the Prague SDK assemblies (see previous post for details of where to find them);

image

and then added a small piece of XAML UI with a green dot which I want to move around purely by dragging my index finger in front of the screen;

<Page
    x:Class="App2.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:App2"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">

    <Grid>
        <Canvas HorizontalAlignment="Stretch" VerticalAlignment="Stretch" Background="{ThemeResource ApplicationPageBackgroundThemeBrush}" SizeChanged="CanvasSizeChanged">
            <Ellipse Width="10" Height="10" Fill="Green" x:Name="marker" Visibility="Collapsed"/>
        </Canvas>
        <TextBlock FontSize="24" x:Name="txtDebug" HorizontalAlignment="Left" VerticalAlignment="Bottom"/>
    </Grid>
</Page>

With that in place, I added some code behind which attempts to continuously track the user’s right hand and link its movement to the green dot. The code’s fairly self-explanatory, I think, with the exception that I limited the hand range to -200mm to +200mm on the X axis and -90mm to +90mm on the Y axis based on experimentation; I’m unsure whether this is “right” or not at the time of writing. I did experiment with normalising the vectors and trying to use those to drive my UI but that didn’t work out well for me as I never seemed to be able to get more than around +/- 0.7 units along the X or Y axis.

using Microsoft.Gestures;
using Microsoft.Gestures.Endpoint;
using Microsoft.Gestures.Samples.Camera3D;
using System;
using System.Linq;
using Windows.Foundation;
using Windows.UI.Core;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

namespace App2
{
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += OnLoaded;
        }
        async void OnLoaded(object sender, RoutedEventArgs e)
        {
            this.gestureService = GesturesServiceEndpointFactory.Create();
            await this.gestureService.ConnectAsync();

            this.smoother = new IndexSmoother();
            this.smoother.SmoothedPositionChanged += OnSmoothedPositionChanged;

            await this.gestureService.RegisterToSkeleton(this.OnSkeletonDataReceived);
        }
        void CanvasSizeChanged(object sender, SizeChangedEventArgs e)
        {
            this.canvasSize = e.NewSize;
        }
        void OnSkeletonDataReceived(object sender, HandSkeletonsReadyEventArgs e)
        {
            var right = e.HandSkeletons.FirstOrDefault(h => h.Handedness == Hand.RightHand);

            if (right != null)
            {
                this.smoother.Smooth(right);
            }
        }
        async void OnSmoothedPositionChanged(object sender, SmoothedPositionChangeEventArgs e)
        {
            // AFAIK, the positions here are defined in terms of millimetres and range
            // -ve to +ve with 0 at the centre.

            // I'm unsure what range the different cameras have in terms of X,Y,Z and
            // so I've made up my own range which is X from -200 to 200 and Y from
            // -90 to 90 and that seems to let me get "full scale" on my hand 
            // movements.

            // I'm sure there's a better way. X is also reversed for my needs so I
            // went with a * -1.

            var xPos = Math.Clamp(e.SmoothedPosition.X * -1.0, 0 - XRANGE, XRANGE);
            var yPos = Math.Clamp(e.SmoothedPosition.Y, 0 - YRANGE, YRANGE);
            xPos = (xPos + XRANGE) / (2.0d * XRANGE);
            yPos = (yPos + YRANGE) / (2.0d * YRANGE);

            await this.Dispatcher.RunAsync(
                CoreDispatcherPriority.Normal,
                () =>
                {
                    this.marker.Visibility = Visibility.Visible;

                    var left = (xPos * this.canvasSize.Width);
                    var top = (yPos * this.canvasSize.Height);

                    Canvas.SetLeft(this.marker, left - (this.marker.Width / 2.0));
                    Canvas.SetTop(this.marker, top - (this.marker.Height / 2.0));
                    this.txtDebug.Text = $"{left:N1},{top:N1}";
                }

            );
        }
        static readonly double XRANGE = 200;
        static readonly double YRANGE = 90;
        Size canvasSize;
        GesturesServiceEndpoint gestureService;
        IndexSmoother smoother;
    }
}

As part of writing that code, I modified the PalmSmoother class from the 3D sample provided to become an IndexSmoother class which essentially performs the same function but on a different piece of data and with some different parameters. It looks like a place where something like the Reactive Extensions might be a good thing to use instead of writing these custom classes but I went with it for speed/ease.
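I’m not going to paste the real IndexSmoother here (it lives in the code that I link to below) but, to give a rough flavour of the sort of thing it’s doing, here’s a minimal sketch of a smoother of my own. To be clear, this is an illustration rather than the class from the repo; the real one works against the Prague hand skeleton whereas this one just takes a System.Numerics.Vector3, and the smoothing factor and threshold values here are made up;

// NB: this is NOT the code from the repo, just a rough sketch of the sort of smoothing
// involved. The real IndexSmoother is a reworked copy of the sample's PalmSmoother and
// works against the Prague hand skeleton itself; here I just take a Vector3 to keep
// things self-contained.
using System;
using System.Numerics;

public class SmoothedPositionChangeEventArgs : EventArgs
{
    public SmoothedPositionChangeEventArgs(Vector3 smoothedPosition) =>
        this.SmoothedPosition = smoothedPosition;

    public Vector3 SmoothedPosition { get; }
}
public class IndexSmoother
{
    public event EventHandler<SmoothedPositionChangeEventArgs> SmoothedPositionChanged;

    public void Smooth(Vector3 rawPosition)
    {
        if (!this.hasPrevious)
        {
            this.smoothed = rawPosition;
            this.hasPrevious = true;
        }
        else
        {
            // Simple exponential smoothing - move a fraction of the way from the previous
            // smoothed value towards the new raw value.
            this.smoothed += SMOOTHING_FACTOR * (rawPosition - this.smoothed);
        }
        // Only fire the event if we've moved a noticeable amount (positions are in mm).
        if ((this.smoothed - this.lastFired).Length() > MOVEMENT_THRESHOLD_MM)
        {
            this.lastFired = this.smoothed;
            this.SmoothedPositionChanged?.Invoke(
                this, new SmoothedPositionChangeEventArgs(this.smoothed));
        }
    }
    const float SMOOTHING_FACTOR = 0.5f;      // made-up value
    const float MOVEMENT_THRESHOLD_MM = 2.0f; // made-up value
    Vector3 smoothed;
    Vector3 lastFired;
    bool hasPrevious;
}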

Wrapping Up

This was just a quick experiment but I learned something from it. The code’s here if it’s of use to anyone else glancing at Project Prague and, as always, feed back if I’ve messed this up – I’m very new to using Project Prague.

UWP and .NET Standard 2.0–Remembering the ‘Forgotten’ APIs :-)

This post comes out of a conversation that I was having with my colleague Pete around the use of the HttpListener class inside of a UWP application.

I’m using HttpListener as an example because it’s the type that we were talking about but there are many, many other types that I could use instead.

Over the past few years, I’ve been gradually building up a table of APIs in my head partitioned something like;

  • .NET APIs that I know are in the .NET Framework
  • .NET APIs that I know are available to the UWP developer

and so it’s become easier and easier over time to know that (e.g.) a class like HttpListener isn’t available to the UWP developer because that class wasn’t in the API set offered by .NET Core on which UWP apps are based.

Of course, in recent times .NET Standard 2.0 support has come to UWP apps running on Windows Fall Creators Update as detailed;

Announcing UWP Support for .NET Standard 2.0

and so this means that a UWP application that’s definitely running on Windows build 16299 and upwards has access to .NET Standard 2.0, making our example API, HttpListener, suddenly usable inside of a UWP application.

Knowing whether an API is/isn’t in one of these variants of .NET can be tricky to keep in your head but the “.NET API Browser” web page provides a quick look up;

.NET API Browser

So, what does this mean? If I’m sitting in a UWP project that has been set to target build 16299 and upwards of the UWP platform;

image

then I can spin up a little MainPage.xaml UI;


<Page
    x:Class="App1.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:App1"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">

    <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
        <Viewbox Margin="40">
            <TextBlock>
                <Run Text="Number of HTTP requests served "/>
                <Run Text="{x:Bind NumberOfRequests, Mode=OneWay}"/>
            </TextBlock>
        </Viewbox>
    </Grid>
</Page>

and marry that up with a little code behind;


using System;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Net;
using System.Runtime.CompilerServices;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

namespace App1
{
    public sealed partial class MainPage : Page, INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;

        public MainPage()
        {
            this.InitializeComponent();
            this.Loaded += OnLoaded;
        }
        public int NumberOfRequests
        {
            get => this.numberOfRequests;
            set
            {
                if (this.numberOfRequests != value)
                {
                    this.numberOfRequests = value;
                    this.FirePropertyChanged();
                }
            }
        }
        void FirePropertyChanged([CallerMemberName] string property = null)
        {
            this.Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal,
                () =>
                {
                    this.PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(property));
                }
            );
        }
        async void OnLoaded(object sender, RoutedEventArgs e)
        {
            // an endless async method isn't perhaps the nicest thing in the world
            // but it's very easy to read.
            this.listener = new HttpListener();
            this.listener.Prefixes.Add("http://+:8088/");
            this.listener.Start();

            while (true)
            {
                var context = await this.listener.GetContextAsync();

                if (context.Request.AcceptTypes.Contains(HTML))
                {
                    // Ah, for an HtmlTextWriter being part of .NET Standard 2.0 😉
                    context.Response.ContentType = "text/html";
                    context.Response.StatusCode = 200;

                    using (var writer = new StreamWriter(context.Response.OutputStream))
                    {
                        writer.Write(
                            $"<html><body>{DateTime.Now.ToShortTimeString()}</body></html>");
                    }
                    this.NumberOfRequests++;
                }
                else
                {
                    context.Response.StatusCode = 501;
                }
                context.Response.Close();
            }
        }
        static readonly string HTML = "text/html";
        HttpListener listener;
        int numberOfRequests;
    }
}

and then make some changes to my application manifest;

image

and I’ve then got a little web server that I can hit from a second machine by asking for http://myMachine:8088/whatever and I can see responses being returned and the usage counter going up;

image

and that all seems to work quite nicely Smile
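If you want to poke at it from code rather than from a browser then something like the little console snippet below does the job. Note that “myMachine” is just a placeholder for wherever the UWP app is running and that the Accept header matters because the code above returns a 501 unless the request asks for text/html;

// A quick console-app sketch for exercising the listener above (needs C# 7.1+ for async Main).
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            // The listener above only responds successfully to requests that accept text/html.
            client.DefaultRequestHeaders.Accept.Add(
                new MediaTypeWithQualityHeaderValue("text/html"));

            // "myMachine" is a placeholder for the machine running the UWP app.
            var html = await client.GetStringAsync("http://myMachine:8088/whatever");

            Console.WriteLine(html);
        }
    }
}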

Again, the use of HttpListener here is just an example. I’m not sure whether or where you’d actually want to build a UWP app that offers up an HTTP server; it just happened to be the API that we were discussing.

Baby Steps with Spatial Mapping in 2D and 3D Using XAML and SharpDX

NB: The usual blog disclaimer for this site applies to posts around HoloLens. I am not on the HoloLens team. I have no details on HoloLens other than what is on the public web and so what I post here is just from my own experience experimenting with pieces that are publicly available and you should always check out the official developer site for the product documentation.

I’ve been living in fear and hiding from a particular set of APIs Winking smile

Ever since HoloLens and Windows Mixed Reality first came along, I’ve been curious about the realities of the spatial mapping APIs and yet I’ve largely just treated them as a “black box”.

Naturally, that’s not to say that I haven’t benefitted from those APIs because I’ve been using them for many months in Unity via the Mixed Reality Toolkit, its support for spatial mapping and the prefab that’s fairly easy to drop into a Unity project, as I first explored in this post last year;

Hitchhiking the HoloToolkit-Unity, Leg 3–Spatial Understanding (& Mapping)

That said, I’ve still had it on my “to do list” for a long time to visit these APIs a little more directly and that’s what this blog post is about.

It’s important to say that the post is mostly meant to be just “for fun” to give me a place to write down some explorations – I’m not planning to do an exhaustive write up of the APIs and what I’ll end up with by the end of this post is going to be pretty “rough”.

It’s also important to say that there are official documentation pages which detail a lot more than I’m about to write up in this post.

Spatial mapping

Spatial mapping in DirectX

but (as usual) I hadn’t really read those documents in nearly enough detail until I started to explore on my own for this post – it’s the exploration that drives the learning.

Additionally, there’s a great official sample that goes along with those documents;

Holographic Spatial Mapping Sample

but, again, I hadn’t actually seen this sample until I got well into writing this post and was trying to figure things out. I realised that I was largely trying to produce a much simpler, less functional piece of code targeting a different type of application than the one in the sample, but there are many similarities between where I ended up and that sample.

So, if you want the definitive views on these topics there are lots of links to visit.

In the meantime, I’m going to write up my own experiments here.

Choosing a Sandbox to Play In

Generally speaking, if I’m wanting to experiment with some .NET APIs then I write a console application. It seems the quickest, easiest thing to spin up.

In a Mixed Reality world, the equivalent seems to be a 2D XAML application. I find it is much quicker to Code->Deploy->Test->Debug when working on a 2D XAML application than when working on (e.g.) a 3D Unity application.

Of course, the output is then a 2D app rather than an immersive app but if you just want to test out some UWP APIs (which the spatial mapping APIs are) then that’s ok.

Specifically, in this case, I found that trying to make use of these APIs in a 2D environment actually seemed to help my understanding of them as it stopped me from just looking for a quick Unity solution to various challenges, and I definitely felt that I wasn’t losing anything by at least starting my journey inside of a 2D XAML application where I could quickly iterate.

Getting Going – Asking for Spatial Mapping API Access

I made a quick, blank 2D XAML UWP application in Visual Studio and made sure that its application manifest gave me the capability to use spatial mapping (the spatialPerception capability).

When I look in Visual Studio today, I don’t see this listed as an option in the UI and so I hacked the manifest file in the XML editor;

image

where the uap2 prefix maps to the namespace;

xmlns:uap2="http://schemas.microsoft.com/appx/manifest/uap/windows10/2"

in case you ever got stuck on that one. From there, I had a blank app where I could write some code to run on the Loaded event of my main XAML page.

Figuring out the SpatialSurfaceObserver

At this point, I had an idea of what I wanted to do and I was fairly sure that I needed to spin up a SpatialSurfaceObserver which does a lot of the work of trying to watch surfaces as they are discovered and refined by HoloLens.

The essence of the class would seem to be to check whether spatial mapping is supported and available via the IsSupported and RequestAccessAsync() methods.

Once support is ascertained, you define some “volumes” for the observer to observe for spatial mapping data via the SetBoundingVolume/s method and then you can interrogate that data via the GetObservedSurfaces method.

Additionally, there’s an event ObservedSurfacesChanged to tell you when the data relating to surfaces has changed because the device has added/removed or updated data.

This didn’t seem too bad and so my code for checking for support ended up looking as below;

async void OnLoaded(object sender, RoutedEventArgs e)
{
    bool tryInitialisation = true;

    if (Windows.Foundation.Metadata.ApiInformation.IsApiContractPresent(
        "Windows.Foundation.UniversalApiContract", 4, 0))
    {
        tryInitialisation = SpatialSurfaceObserver.IsSupported();
    }

    if (tryInitialisation)
    {
        var access = await SpatialSurfaceObserver.RequestAccessAsync();

        if (access == SpatialPerceptionAccessStatus.Allowed)
        {
            this.InitialiseSurfaceObservation();
        }
        else
        {
            tryInitialisation = false;
        }
    }
    if (!tryInitialisation)
    {
        var dialog = new MessageDialog(
            "Spatial observation is either not supported or not allowed", "Not Available");

        await dialog.ShowAsync();
    }
}

Now, as far as I could tell, the SpatialSurfaceObserver.IsSupported() method only became available in V4 of the UniversalApiContract and so, as you can see above, I’m trying to figure out whether it’s safe to call that API before using it.

The next step would be perhaps to try and define volumes and so I ploughed ahead there…

Volumes, Coordinate Systems, Reference Frames, Locators – Oh My Winking smile

I wanted to keep things as simple as possible and so I chose to look at the SetBoundingVolume method which takes a single SpatialBoundingVolume; there are a number of ways of creating these based on Boxes, Frustums and Spheres.

I figured that a sphere was a fairly understandable thing and so I went with a sphere and decided I’d use a 5m radius on my sphere hoping to determine all surface information within that radius.

However, to create a volume you first need a SpatialCoordinateSystem and the easiest way I found of getting hold of one of those was to get hold of a frame of reference.

Frames of reference can either be “attached” in the sense of being head-locked and following the device or they can be “stationary” where they don’t follow the device.

A stationary frame of reference seemed easier to think about and so I went that way but to get hold of a frame of reference at all I seemed to need to use a SpatialLocator which has a handy GetDefault() method on it and then I can use the CreateStationaryFrameOfReferenceAtCurrentLocation() method to create my frame.

So…my reasoning here is that I’m creating a frame of reference at the place where the app starts up and that it will never move during the app’s lifetime. Not perhaps the most “flexible” thing in the world, but it seemed simpler than any other options so I went with it.

With that in place, my “start-up” code looks as below;

void InitialiseSurfaceObservation()
{
    // We want the default locator.
    this.locator = SpatialLocator.GetDefault();

    // We try to make a frame of reference that is fixed at the current position (i.e. not
    // moving with the user).
    var frameOfReference = this.locator.CreateStationaryFrameOfReferenceAtCurrentLocation();

    this.baseCoordinateSystem = frameOfReference.CoordinateSystem;

    // Make a sphere which is centred at the origin (the user's startup location)
    // with a radius within which we want surface data.
    var boundingVolume = SpatialBoundingVolume.FromSphere(
        this.baseCoordinateSystem,
        new SpatialBoundingSphere()
        {
            Center = new Vector3(0, 0, 0),
            Radius = SPHERE_RADIUS
        }
    );
    this.surfaceObserver = new SpatialSurfaceObserver();
    this.surfaceObserver.SetBoundingVolume(boundingVolume);
}

Ok…I have got hold of a SpatialSurfaceObserver that’s observing one volume for me defined by a sphere. What next?

Gathering and Monitoring Surfaces Over Time

Having now got my SpatialSurfaceObserver with a defined volume, I wanted some class that took on the responsibility of grabbing any surfaces from it, putting them on a list and then managing that list as the observer fired events to flag that surfaces had been added/removed/updated.

In a real application, it’s likely that you’d need to do this in a highly performant way but I’m more interested in experimentation here than performance and so I wrote a small SurfaceChangeWatcher class which I can pass the SpatialSurfaceObserver to.

Surfaces are identified by GUID and so this watcher class maintains a simple Dictionary<Guid,SpatialSurfaceInfo>. On startup, it calls the GetObservedSurfaces method to initially populate its dictionary and then it handles the ObservedSurfacesChanged event to update its dictionary as data changes over time.

It aggregates up the changes that it sees and fires its own event to tell any interested parties about the changes.

I won’t post the whole source code for the class here but will just link to it instead. It’s not too long and it’s not too complicated.

Source SurfaceChangeWatcher.cs.
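That said, purely to give an indication of the overall shape, a cut-down sketch of that kind of watcher might look something like the code below. To be clear, this isn’t the linked class; the event and payload shapes are my own and the real code differs in the details (threading being an obvious one);

// Not the linked class - just a cut-down sketch of the idea.
using System;
using System.Collections.Generic;
using System.Linq;
using Windows.Perception.Spatial.Surfaces;

public class SurfaceChangesEventArgs : EventArgs
{
    public List<SpatialSurfaceInfo> AddedOrUpdated { get; } = new List<SpatialSurfaceInfo>();
    public List<Guid> Removed { get; } = new List<Guid>();
}
public class SurfaceChangeWatcher
{
    public event EventHandler<SurfaceChangesEventArgs> SurfacesChanged;

    public SurfaceChangeWatcher(SpatialSurfaceObserver observer)
    {
        this.observer = observer;
    }
    public void Start()
    {
        this.LoadSurfaces();

        // NB: this event can fire on a non-UI thread so the real code would need to synchronise.
        this.observer.ObservedSurfacesChanged += (s, e) => this.LoadSurfaces();
    }
    void LoadSurfaces()
    {
        var latest = this.observer.GetObservedSurfaces();
        var changes = new SurfaceChangesEventArgs();

        // Anything new, or with a newer UpdateTime than we last saw, counts as added/updated.
        foreach (var pair in latest)
        {
            SpatialSurfaceInfo known;

            if (!this.surfaces.TryGetValue(pair.Key, out known) ||
                (pair.Value.UpdateTime > known.UpdateTime))
            {
                changes.AddedOrUpdated.Add(pair.Value);
            }
            this.surfaces[pair.Key] = pair.Value;
        }
        // Anything we knew about which is no longer reported counts as removed.
        foreach (var id in this.surfaces.Keys.Where(k => !latest.ContainsKey(k)).ToList())
        {
            this.surfaces.Remove(id);
            changes.Removed.Add(id);
        }
        if (changes.AddedOrUpdated.Any() || changes.Removed.Any())
        {
            this.SurfacesChanged?.Invoke(this, changes);
        }
    }
    readonly SpatialSurfaceObserver observer;
    readonly Dictionary<Guid, SpatialSurfaceInfo> surfaces =
        new Dictionary<Guid, SpatialSurfaceInfo>();
}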

Checking for Surface Data

At this point, I’ve enough code to fire up the debugger and debug my 2D app on a HoloLens or an emulator and see if I can get some spatial mapping data into my code.

It’s worth remembering that the HoloLens emulator is good for debugging spatial mapping as by default the emulator places itself into a “default room” and it can switch to a number of other rooms provided with the SDK and also to custom rooms that have been recorded from a HoloLens.

So, debugging on the emulator, I can see that in the first instance here there are 22 loaded surfaces coming back from the SpatialSurfaceObserver;

image

and you can see the ID for my first surface and the UpdateTime that it’s associated with.

I also notice that, very early on in the application, the ObservedSurfacesChanged event fires and my code in SurfaceChangeWatcher simply calls back into the LoadSurfaces method shown in the screenshot above, which then attempts to figure out which surfaces have been added, removed or updated since they were last queried.

So, getting hold of the surfaces within a volume and responding to their changes as they evolve doesn’t seem too onerous.

But, how to get the actual polygonal mesh data itself?

Getting Mesh Data

Once you have hold of a SpatialSurfaceInfo, you can attempt to get hold of the SpatialSurfaceMesh which it represents via the TryComputeLatestMeshAsync method.

This method wants a “triangle density” in terms of how many triangles it should attempt to bring back per cubic metre. If you’ve used the Unity prefab then you’ll have seen this parameter before and in my code here I chose a value of 100 and stuck with it.

The method is also asynchronous and so you can’t just demand the mesh in realtime but it’s a fairly simple call and here’s a screenshot of me back in the debugger having made that call to get some data;

image

That screenshot shows that I’ve got a SpatialSurfaceMesh and it contains 205 vertices in the R16G16B16A16IntNormalized format and that there are 831 triangle vertices in an R16Uint format and it also gives me the Id and the UpdateTime of the SpatialSurfaceInfo.

It’s also worth noting the VertexPositionScale which needs to be applied to the vertices to reconstruct them.
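Pulling those pieces together, the call itself looks roughly like the sketch below. This isn’t lifted from my code, it’s just meant to illustrate the shape of the API, although the triangle density of 100 matches the value that I used;

// A rough sketch of asking a SpatialSurfaceInfo for its latest mesh and poking at
// what comes back - not lifted from the repo code.
using System;
using Windows.Perception.Spatial.Surfaces;

static class MeshGrabber
{
    const double TRIANGLES_PER_CUBIC_METRE = 100;

    public static async void DumpMeshDetails(SpatialSurfaceInfo surfaceInfo)
    {
        // Asynchronous - the system computes the mesh at the requested triangle density.
        SpatialSurfaceMesh mesh =
            await surfaceInfo.TryComputeLatestMeshAsync(TRIANGLES_PER_CUBIC_METRE);

        if (mesh != null)
        {
            // Vertex positions and triangle indices come back as raw buffers, each with a
            // format, stride and element count describing how to interpret the bytes.
            System.Diagnostics.Debug.WriteLine(
                $"Surface {mesh.SurfaceInfo.Id}: " +
                $"{mesh.VertexPositions.ElementCount} vertices ({mesh.VertexPositions.Format}), " +
                $"{mesh.TriangleIndices.ElementCount} indices ({mesh.TriangleIndices.Format})");

            // The positions need scaling by this before they're usable.
            var scale = mesh.VertexPositionScale;
        }
    }
}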

Rendering Mesh Data

Now, at this point I felt that I had learned a few things about how to get hold of spatial mapping meshes but I thought that it wasn’t really “enough” if I didn’t make at least some attempt to render the meshes produced.

I thought about a few options around how I might do that given that I’m running code inside of a 2D XAML application.

I wondered whether I might somehow flatten the mesh and draw it with a XAML Canvas but that seemed unlikely to work out; I suspected that the best road to go down would be to keep the data in the format that it was already being provided in and try to hand it over to DirectX for rendering.

That led me to wonder whether something from Win2D might be able to draw it for me but Win2D stays true to its name and doesn’t (as far as I know) get into the business of wrapping up Direct3D APIs.

So…I figured that I’d need to bite the bullet and see if I could bring this into my 2D app via the XAML SwapChainPanel integration element with some rendering provided by SharpDX.

It’s worth saying that I’ve hardly ever used SwapChainPanel and I’ve never used SharpDX before so I figured that putting them together with this mesh data might be “fun” Winking smile

A UWP SharpDX SwapChainPanel Sample

In order to try and achieve that, I went on a bit of a search to try and see if I could find a basic sample which illustrated how to integrate SharpDX code inside of a XAML application rendering to a SwapChainPanel.

It took me a little while to find that sample as quite a few of the SharpDX samples seem to be out of date these days and I asked around on Twitter before finding this great sample which uses SharpDX and SwapChainPanel to render a triangle inside of a UWP XAML app;

https://github.com/minhcly/UWP3DTest

That let me drop a few SharpDX packages into my project;

image

and the sample was really useful in that it enabled me to drop a SwapChainPanel into my XAML UI app and, using code that I lifted and reworked out of the sample, I could get that same triangle to render inside of my 2D XAML application.

That gave me a little hope that I might be able to get the mesh data rendered inside of my application too.

Building a SharpDX Renderer (or trying to!)

I wrote a class SwapChainPanelRenderer (source) which essentially takes the SwapChainPanel and my SurfaceChangeWatcher class and it puts them together in order to retrieve/monitor spatial meshes as they are produced by the SpatialSurfaceObserver.

The essence of that class is that it goes through a few steps;

  1. It Initialises D3D via SharpDX largely following the pattern from the sample I found.
  2. It creates a very simple vertex shader and pixel shader much like the sample does although I ended up tweaking them a little.
  3. Whenever a new SpatialSurfaceInfo is provided by the SurfaceChangeWatcher the renderer asks the system to compute the mesh for it and creates a number of data structures from that mesh;
    1. A vertex buffer to match the format provided by the mesh
    2. An index buffer to match the format provided by the mesh
    3. A constant buffer with details of how to transform the vertices provided by the mesh
  4. Whenever the renderer is asked to render, it loads up the right vertex/index/constant buffers for each of the meshes that it knows about and asks the system to render them, passing through a few transformation pieces to the vertex shader (a simplified sketch of this step follows below).
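To make that last step a little more concrete, here’s a much simplified sketch of the kind of per-mesh draw that ends up happening via SharpDX. The MeshBuffers type and its members here are mine purely for illustration; they aren’t the real renderer’s types and the real code does quite a bit more work around shaders, transforms and so on;

// A simplified sketch of step 4 - drawing each mesh that we've built buffers for.
// The real SwapChainPanelRenderer does a lot more than this.
using System.Collections.Generic;
using SharpDX.Direct3D;
using SharpDX.Direct3D11;
using SharpDX.DXGI;

class MeshBuffers
{
    public SharpDX.Direct3D11.Buffer VertexBuffer;   // built from SpatialSurfaceMesh.VertexPositions
    public SharpDX.Direct3D11.Buffer IndexBuffer;    // built from SpatialSurfaceMesh.TriangleIndices
    public SharpDX.Direct3D11.Buffer ConstantBuffer; // carries VertexPositionScale + transforms
    public int VertexStride;
    public int IndexCount;
}
static class MeshDrawing
{
    public static void DrawMeshes(DeviceContext context, IEnumerable<MeshBuffers> meshes)
    {
        context.InputAssembler.PrimitiveTopology = PrimitiveTopology.TriangleList;

        foreach (var mesh in meshes)
        {
            context.InputAssembler.SetVertexBuffers(
                0, new VertexBufferBinding(mesh.VertexBuffer, mesh.VertexStride, 0));

            // The triangle indices from spatial mapping arrive in R16UInt format.
            context.InputAssembler.SetIndexBuffer(mesh.IndexBuffer, Format.R16_UInt, 0);

            // Per-mesh transform (including the vertex position scale) for the vertex shader.
            context.VertexShader.SetConstantBuffer(0, mesh.ConstantBuffer);

            context.DrawIndexed(mesh.IndexCount, 0, 0);
        }
    }
}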

It’s perhaps worth noting a couple of things around how that code works – the first would be;

  • In order to get hold of the actual vertex data, this code relies on using unsafe C# code and IBufferByteAccess in order to grab the real data buffers rather than copying them.
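The repo has the real code but, for reference, that interop trick usually looks something like the snippet below. The IID here is the documented one for IBufferByteAccess while the helper method is just illustrative;

// Not the exact code from the repo - just the usual shape of the IBufferByteAccess
// interop trick for getting at an IBuffer's bytes without copying them.
// (Requires "allow unsafe code" in the project settings.)
using System;
using System.Runtime.InteropServices;
using Windows.Storage.Streams;

[ComImport]
[Guid("905a0fef-bc53-11df-8c49-001e4fc686da")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IBufferByteAccess
{
    IntPtr Buffer();
}
static class BufferHelper
{
    // e.g. pass in SpatialSurfaceMesh.VertexPositions.Data - the returned pointer is only
    // valid while the IBuffer itself is kept alive.
    public static unsafe byte* GetRawBytes(IBuffer buffer)
    {
        var byteAccess = (IBufferByteAccess)buffer;
        return (byte*)byteAccess.Buffer().ToPointer();
    }
}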

The second point that might be worth mentioning is that I spent quite a bit of time trying to see if I could get the mesh rendering right.

  • I’m not 100% there at the time of writing. What I have managed to get working has been done by consulting the official C++ sample, which has a more complex pipeline; I specifically consulted it around how to make use of the SpatialSurfaceMesh.VertexPositionScale property and tried to make my code line up with the sample code around that as much as possible.

I must admit that I spent a bit of time staring at my code and comparing it with the sample code to figure out whether I could improve the way mine was rendering, and I think I could easily spend more time on it to make it work better.

The last point I’d make is that there’s nothing in the code at the time of writing which attempts to align the HoloLens position, orientation and view with what’s being shown inside of the 2D app. What that means is;

  • The 2D app starts at a position of (0, 0.5, 0), i.e. half a metre above where the HoloLens is in the world.
  • The 2D app doesn’t know the orientation of the user so could be pointing in the wrong direction with respect to the mesh.

This can make the app a little “disorientating” unless you are familiar with what it’s doing Smile

Trying it Out

At the time of writing, I’ve mostly been trying this code out on the emulator but I have also experimented with it on HoloLens.

Here’s a screenshot of the official sample 3D app with its fancier shader running on the emulator where I’m using the “living room” room;

image

and here’s my 2D XAML app running in a window but, hopefully, rendering a similar thing albeit in wireframe;

image

and, seemingly, there’s something of a mirroring going on in there as well which I still need to dig into!

Wrapping Up & The Source

As I said at the start of the post, this one was very much just “for fun” but I thought I’d write it down so that I can remember it and maybe some pieces of it might be useful to someone else in the future.

If you want the source, it’s all over here on github so feel free to take it, play with it and feel very free to improve it Smile.