Windows 10, 1607, UWP and Experimenting with the Kinect for Windows V2 Update

I was really pleased to see this blog post;

Kinect demo code and new driver for UWP now available

announcing a new driver which opens up more of the functionality of the Kinect for Windows V2 on Windows 10, including to the UWP developer.

I wrote a little about this topic in this earlier post around 10 months ago when some initial functionality became available for the UWP developer;

Kinect V2, Windows Hello and Perception APIs

and so it’s great to see that more functionality has become available and, specifically, that skeletal data is being surfaced.

I plugged my Kinect for Windows V2 into my Surface Pro 3 and had a look at the driver being used for Kinect.

image

and I attempted to update it but didn’t seem to find a newer one, so it’s possible that the version of the driver which I have;

image

is the latest driver, as it seems to be only a week or two old. At the time of writing I haven’t confirmed this driver version, but I went on to download the C++ sample from GitHub;

Camera Stream Correlation Sample

and ran it up on my Surface Pro 3 where it initially displayed the output of the rear webcam;

image

and so I pressed the ‘Next Source’ button and it attempted to work with the RealSense camera on my machine;

image

and so I pressed the ‘Next Source’ button and things seemed to hang. I’m unsure of the status of my RealSense drivers on this machine and so I disabled the RealSense virtual camera driver;

image

and then re-ran the sample and, sure enough, I could use the ‘Next Source’ button to move to the Kinect for Windows V2 sensor. I then used the ‘Toggle Depth Fading’ button to turn that option off and the ‘Toggle Skeletal Overlay’ button to switch that option on and, sure enough, I had a (flat) skeletal overlay on the colour frames, delivering very smooth performance here;

image

and so that’s great to see working. Given that the sample is C++ code, I wondered what this might look like for a C# developer working with the UWP, and so I set about seeing whether I could reproduce the core of what the sample does here.

Getting Skeletal Data Into a C# UWP App

Rather than attempting to ‘port’ the C++ sample, I started by lifting pieces of the code that I’d written for that earlier blog post into a new project.

I made a blank app targeting SDK 14393, made sure that it had access to webcam and microphone and then added in win2d.uwp as a NuGet package and added a little UI;

<Page
    x:Class="KinectTestApp.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:KinectTestApp"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:w2d="using:Microsoft.Graphics.Canvas.UI.Xaml"
    mc:Ignorable="d">

    <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
        <TextBlock
            FontSize="36"
            HorizontalAlignment="Center"
            VerticalAlignment="Center"
            TextAlignment="Center"
            Text="No Cameras" />
        <w2d:CanvasControl
            x:Name="canvasControl"
            Visibility="Collapsed"
            SizeChanged="OnCanvasControlSizeChanged"
            Draw="OnDraw"/>
    </Grid>
</Page>

From there, I wanted to see if I could get a basic render of the colour frame from the camera along with an overlay of some skeletal points.

I’d spotted that the official samples include a project which builds a WinRT component used to interpret the custom data that comes from the Kinect via a MediaFrameReference, and so I added a reference to this project to my solution so that I could use it from my C# code. That project is here and looks to stand independent of the surrounding sample. I made my project reference as below;

image

and then set about trying to see if I could write some code that got colour data and skeletal data onto the screen.

I wrote a few small supporting classes and named them all with an mt* prefix to try and make it more obvious which code here is mine rather than coming from the framework or the sample. This simple class delivers a SoftwareBitmap containing the contents of the colour frame, fired as an event;

namespace KinectTestApp
{
  using System;
  using Windows.Graphics.Imaging;

  class mtSoftwareBitmapEventArgs : EventArgs
  {
    public SoftwareBitmap Bitmap { get; set; }
  }
}

whereas this class delivers the data that I’ve decided I need in order to draw a subset of the skeletal data onto the screen;

namespace KinectTestApp
{
  using System;

  class mtPoseTrackingFrameEventArgs : EventArgs
  {
    public mtPoseTrackingDetails[] PoseEntries { get; set; }
  }
}

and it’s a simple array which will be populated with one of these types below for each user being tracked by the sensor;

namespace KinectTestApp
{
  using System;
  using System.Linq;
  using System.Numerics;
  using Windows.Foundation;
  using Windows.Media.Devices.Core;
  using WindowsPreview.Media.Capture.Frames;

  class mtPoseTrackingDetails
  {
    public Guid EntityId { get; set; }
    public Point[] Points { get; set; }

    public static mtPoseTrackingDetails FromPoseTrackingEntity(
      PoseTrackingEntity poseTrackingEntity,
      CameraIntrinsics colorIntrinsics,
      Matrix4x4 depthColorTransform)
    {
      mtPoseTrackingDetails details = null;

      var poses = new TrackedPose[poseTrackingEntity.PosesCount];
      poseTrackingEntity.GetPoses(poses);

      var points = new Point[poses.Length];

      colorIntrinsics.ProjectManyOntoFrame(
        poses.Select(p => Multiply(depthColorTransform, p.Position)).ToArray(),
        points);

      details = new mtPoseTrackingDetails()
      {
        EntityId = poseTrackingEntity.EntityId,
        Points = points
      };
      return (details);
    }
    static Vector3 Multiply(Matrix4x4 matrix, Vector3 position)
    {
      return (new Vector3(
        position.X * matrix.M11 + position.Y * matrix.M21 + position.Z * matrix.M31 + matrix.M41,
        position.X * matrix.M12 + position.Y * matrix.M22 + position.Z * matrix.M32 + matrix.M42,
        position.X * matrix.M13 + position.Y * matrix.M23 + position.Z * matrix.M33 + matrix.M43));
    }
  }
}

This would be a simple class containing a GUID to identify the tracked person and an array of Points representing their tracked joints, except that I wanted those 2D Points to be in colour space, which means mapping them from the depth space in which the sensor presents them. The FromPoseTrackingEntity() method takes a PoseTrackingEntity, which is one of the types from the referenced C++ project, and;

  1. Extracts the ‘poses’ (i.e. joints in my terminology)
  2. Uses the CameraIntrinsics from the colour camera to project them onto its frame, having first transformed them using a matrix which maps from depth space to colour space.

Step 2 is code that I largely duplicated from the original C++ sample after trying a few other routes which didn’t end well for me.
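As a small aside: unless I’m misreading it, the row-vector convention in that Multiply() helper is the same one that System.Numerics itself uses, so the built-in Vector3.Transform() should produce the same result;

```csharp
// Sketch: Vector3.Transform() applies the same row-vector maths
// (X * M11 + Y * M21 + Z * M31 + M41, and similarly for Y and Z)
// so the hand-rolled Multiply() above could probably be replaced by;
var mapped = Vector3.Transform(pose.Position, depthColorTransform);
```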

I then wrote this class which wraps up a few areas;

namespace KinectTestApp
{
  using System;
  using System.Linq;
  using System.Threading.Tasks;
  using Windows.Media.Capture;
  using Windows.Media.Capture.Frames;

  class mtMediaSourceReader
  {
    public mtMediaSourceReader(
      MediaCapture capture, 
      MediaFrameSourceKind mediaSourceKind,
      Action<MediaFrameReader> onFrameArrived,
      Func<MediaFrameSource, bool> additionalSourceCriteria = null)
    {
      this.mediaCapture = capture;
      this.mediaSourceKind = mediaSourceKind;
      this.additionalSourceCriteria = additionalSourceCriteria;
      this.onFrameArrived = onFrameArrived;
    }
    public bool Initialise()
    {
      this.mediaSource = this.mediaCapture.FrameSources.FirstOrDefault(
        fs =>
          (fs.Value.Info.SourceKind == this.mediaSourceKind) &&
          ((this.additionalSourceCriteria != null) ? 
            this.additionalSourceCriteria(fs.Value) : true)).Value;   

      return (this.mediaSource != null);
    }
    public async Task OpenReaderAsync()
    {
      this.frameReader =
        await this.mediaCapture.CreateFrameReaderAsync(this.mediaSource);

      this.frameReader.FrameArrived +=
        (s, e) =>
        {
          this.onFrameArrived(s);
        };

      await this.frameReader.StartAsync();
    }
    Func<MediaFrameSource, bool> additionalSourceCriteria;
    Action<MediaFrameReader> onFrameArrived;
    MediaFrameReader frameReader;
    MediaFrameSource mediaSource;
    MediaCapture mediaCapture;
    MediaFrameSourceKind mediaSourceKind;
  }
}

This type takes a MediaCapture and a MediaFrameSourceKind and can then report, via the Initialise() method, whether that source kind is available on that media capture, optionally applying the extra criteria passed to the constructor. It can also create a frame reader and redirect its FrameArrived events into the method provided to the constructor. There should also be some way to stop this class but I haven’t written that yet.

With those classes in place, I added the following mtKinectColorPoseFrameHelper;

namespace KinectTestApp
{
  using System;
  using System.Collections.Generic;
  using System.Linq;
  using System.Numerics;
  using System.Threading.Tasks;
  using Windows.Media.Capture;
  using Windows.Media.Capture.Frames;
  using Windows.Media.Devices.Core;
  using Windows.Perception.Spatial;
  using WindowsPreview.Media.Capture.Frames;

  class mtKinectColorPoseFrameHelper
  {
    public event EventHandler<mtSoftwareBitmapEventArgs> ColorFrameArrived;
    public event EventHandler<mtPoseTrackingFrameEventArgs> PoseFrameArrived;

    public mtKinectColorPoseFrameHelper()
    {
      this.softwareBitmapEventArgs = new mtSoftwareBitmapEventArgs();
    }
    internal async Task<bool> InitialiseAsync()
    {
      bool necessarySourcesAvailable = false;

      // We try to find the Kinect by asking for a group that can deliver
      // color, depth, custom and infrared. 
      var allGroups = await GetGroupsSupportingSourceKindsAsync(
        MediaFrameSourceKind.Color,
        MediaFrameSourceKind.Depth,
        MediaFrameSourceKind.Custom,
        MediaFrameSourceKind.Infrared);

      // We assume the first group here is what we want which is not
      // necessarily going to be right on all systems so would need
      // more care.
      var firstSourceGroup = allGroups.FirstOrDefault();

      // Got one that supports all those types?
      if (firstSourceGroup != null)
      {
        this.mediaCapture = new MediaCapture();

        var captureSettings = new MediaCaptureInitializationSettings()
        {
          SourceGroup = firstSourceGroup,
          SharingMode = MediaCaptureSharingMode.SharedReadOnly,
          StreamingCaptureMode = StreamingCaptureMode.Video,
          MemoryPreference = MediaCaptureMemoryPreference.Cpu
        };
        await this.mediaCapture.InitializeAsync(captureSettings);

        this.mediaSourceReaders = new mtMediaSourceReader[]
        {
          new mtMediaSourceReader(this.mediaCapture, MediaFrameSourceKind.Color, this.OnFrameArrived),
          new mtMediaSourceReader(this.mediaCapture, MediaFrameSourceKind.Depth, this.OnFrameArrived),
          new mtMediaSourceReader(this.mediaCapture, MediaFrameSourceKind.Custom, this.OnFrameArrived,
            DoesCustomSourceSupportPerceptionFormat)
        };

        necessarySourcesAvailable = 
          this.mediaSourceReaders.All(reader => reader.Initialise());

        if (necessarySourcesAvailable)
        {
          foreach (var reader in this.mediaSourceReaders)
          {
            await reader.OpenReaderAsync();
          }
        }
        else
        {
          this.mediaCapture.Dispose();
        }
      }
      return (necessarySourcesAvailable);
    }
    void OnFrameArrived(MediaFrameReader sender)
    {
      var frame = sender.TryAcquireLatestFrame();

      if (frame != null)
      {
        switch (frame.SourceKind)
        {
          case MediaFrameSourceKind.Custom:
            this.ProcessCustomFrame(frame);
            break;
          case MediaFrameSourceKind.Color:
            this.ProcessColorFrame(frame);
            break;
          case MediaFrameSourceKind.Infrared:
            break;
          case MediaFrameSourceKind.Depth:
            this.ProcessDepthFrame(frame);
            break;
          default:
            break;
        }
        frame.Dispose();
      }
    }
    void ProcessDepthFrame(MediaFrameReference frame)
    {
      if (this.colorCoordinateSystem != null)
      {
        this.depthColorTransform = frame.CoordinateSystem.TryGetTransformTo(
          this.colorCoordinateSystem);
      }     
    }
    void ProcessColorFrame(MediaFrameReference frame)
    {
      if (this.colorCoordinateSystem == null)
      {
        this.colorCoordinateSystem = frame.CoordinateSystem;
        this.colorIntrinsics = frame.VideoMediaFrame.CameraIntrinsics;
      }
      this.softwareBitmapEventArgs.Bitmap = frame.VideoMediaFrame.SoftwareBitmap;
      this.ColorFrameArrived?.Invoke(this, this.softwareBitmapEventArgs);
    }
    void ProcessCustomFrame(MediaFrameReference frame)
    {
      if ((this.PoseFrameArrived != null) &&
        (this.colorCoordinateSystem != null))
      {
        var trackingFrame = PoseTrackingFrame.Create(frame);
        var eventArgs = new mtPoseTrackingFrameEventArgs();

        if (trackingFrame.Status == PoseTrackingFrameCreationStatus.Success)
        {
          // Which of the entities here are actually tracked?
          var trackedEntities =
            trackingFrame.Frame.Entities.Where(e => e.IsTracked).ToArray();

          var trackedCount = trackedEntities.Count();

          if (trackedCount > 0)
          {
            eventArgs.PoseEntries =
              trackedEntities
              .Select(entity =>
                mtPoseTrackingDetails.FromPoseTrackingEntity(entity, this.colorIntrinsics, this.depthColorTransform.Value))
              .ToArray();
          }
          this.PoseFrameArrived(this, eventArgs);
        }
      }
    }
    async static Task<IEnumerable<MediaFrameSourceGroup>> GetGroupsSupportingSourceKindsAsync(
      params MediaFrameSourceKind[] kinds)
    {
      var sourceGroups = await MediaFrameSourceGroup.FindAllAsync();

      var groups =
        sourceGroups.Where(
          group => kinds.All(
            kind => group.SourceInfos.Any(sourceInfo => sourceInfo.SourceKind == kind)));

      return (groups);
    }
    static bool DoesCustomSourceSupportPerceptionFormat(MediaFrameSource source)
    {
      return (
        (source.Info.SourceKind == MediaFrameSourceKind.Custom) &&
        (source.CurrentFormat.MajorType == PerceptionFormat) &&
        (Guid.Parse(source.CurrentFormat.Subtype) == PoseTrackingFrame.PoseTrackingSubtype));
    }
    SpatialCoordinateSystem colorCoordinateSystem;
    mtSoftwareBitmapEventArgs softwareBitmapEventArgs;
    mtMediaSourceReader[] mediaSourceReaders;
    MediaCapture mediaCapture;
    CameraIntrinsics colorIntrinsics;
    const string PerceptionFormat = "Perception";
    private Matrix4x4? depthColorTransform;
  }
}

This is essentially doing;

  1. InitialiseAsync
    1. Using the MediaFrameSourceGroup type to try and find a source group that looks like the Kinect by searching for Infrared+Color+Depth+Custom source kinds. This isn’t a complete test and it might be better to make it more thorough. Also, there’s an assumption that the first group found is the best, which isn’t likely to always hold true.
    2. Initialising a MediaCapture for the group found in step 1 above.
    3. Initialising three of my mtMediaSourceReader types for the Color/Depth/Custom source kinds and adding some extra criteria for the Custom source type to try and make sure that it supports the ‘Perception’ media format – this code is essentially lifted from the original sample.
    4. Opening frame readers on those three items and handling the events as frame arrives.
  2. OnFrameArrived simply passes the frame on to sub-functions based on type and this could have been done by deriving specific mtMediaSourceReaders.
  3. ProcessDepthFrame tries to get a transformation from depth space to colour space for later use.
  4. ProcessColorFrame fires the ColorFrameArrived event with the SoftwareBitmap that has been received.
  5. ProcessCustomFrame handles the custom frame by;
    1. Using the PoseTrackingFrame.Create() method from the referenced C++ project to interpret the raw data that comes from the custom sensor.
    2. Determining how many bodies are being tracked by the data.
    3. Converting the data types from the referenced C++ project into my own data types, which include less of the data and which try to map joint positions given as 3D depth points to their respective 2D colour-space points.

Lastly, there’s some code-behind which tries to glue this into the UI;

namespace KinectTestApp
{
  using Microsoft.Graphics.Canvas;
  using Microsoft.Graphics.Canvas.UI.Xaml;
  using System.Numerics;
  using System.Threading;
  using Windows.Foundation;
  using Windows.Graphics.Imaging;
  using Windows.UI;
  using Windows.UI.Core;
  using Windows.UI.Xaml;
  using Windows.UI.Xaml.Controls;

  public sealed partial class MainPage : Page
  {
    public MainPage()
    {
      this.InitializeComponent();
      this.Loaded += this.OnLoaded;
    }
    void OnCanvasControlSizeChanged(object sender, SizeChangedEventArgs e)
    {
      this.canvasSize = new Rect(0, 0, e.NewSize.Width, e.NewSize.Height);
    }
    async void OnLoaded(object sender, RoutedEventArgs e)
    {
      this.helper = new mtKinectColorPoseFrameHelper();

      this.helper.ColorFrameArrived += OnColorFrameArrived;
      this.helper.PoseFrameArrived += OnPoseFrameArrived;

      var supported = await this.helper.InitialiseAsync();

      if (supported)
      {
        this.canvasControl.Visibility = Visibility.Visible;
      }
    }
    void OnColorFrameArrived(object sender, mtSoftwareBitmapEventArgs e)
    {
      // Note that when this function returns to the caller, we have
      // finished with the incoming software bitmap.
      if (this.bitmapSize == null)
      {
        this.bitmapSize = new Rect(0, 0, e.Bitmap.PixelWidth, e.Bitmap.PixelHeight);
      }

      if (Interlocked.CompareExchange(ref this.isBetweenRenderingPass, 1, 0) == 0)
      {
        this.lastConvertedColorBitmap?.Dispose();

        // Sadly, the format that comes in here, isn't supported by Win2D when
        // it comes to drawing so we have to convert. The upside is that 
        // we know we can keep this bitmap around until we are done with it.
        this.lastConvertedColorBitmap = SoftwareBitmap.Convert(
          e.Bitmap,
          BitmapPixelFormat.Bgra8,
          BitmapAlphaMode.Ignore);

        // Cause the canvas control to redraw itself.
        this.InvalidateCanvasControl();
      }
    }
    void InvalidateCanvasControl()
    {
      // Fire and forget.
      this.Dispatcher.RunAsync(CoreDispatcherPriority.High, this.canvasControl.Invalidate);
    }
    void OnPoseFrameArrived(object sender, mtPoseTrackingFrameEventArgs e)
    {
      // NB: we do not invalidate the control here but, instead, just keep
      // this frame around (maybe) until the colour frame redraws which will 
      // (depending on race conditions) pick up this frame and draw it
      // too.
      this.lastPoseEventArgs = e;
    }
    void OnDraw(CanvasControl sender, CanvasDrawEventArgs args)
    {
      // Capture this here (in a race) in case it gets over-written
      // while this function is still running.
      var poseEventArgs = this.lastPoseEventArgs;

      args.DrawingSession.Clear(Colors.Black);

      // Do we have a colour frame to draw?
      if (this.lastConvertedColorBitmap != null)
      {
        using (var canvasBitmap = CanvasBitmap.CreateFromSoftwareBitmap(
          this.canvasControl,
          this.lastConvertedColorBitmap))
        {
          // Draw the colour frame
          args.DrawingSession.DrawImage(
            canvasBitmap,
            this.canvasSize,
            this.bitmapSize.Value);

          // Have we got a skeletal frame hanging around?
          if (poseEventArgs?.PoseEntries?.Length > 0)
          {
            foreach (var entry in poseEventArgs.PoseEntries)
            {
              foreach (var pose in entry.Points)
              {
                var centrePoint = ScalePosePointToDrawCanvasVector2(pose);

                args.DrawingSession.FillCircle(
                  centrePoint, circleRadius, Colors.Red);
              }
            }
          }
        }
      }
      Interlocked.Exchange(ref this.isBetweenRenderingPass, 0);
    }
    Vector2 ScalePosePointToDrawCanvasVector2(Point posePoint)
    {
      return (new Vector2(
        (float)((posePoint.X / this.bitmapSize.Value.Width) * this.canvasSize.Width),
        (float)((posePoint.Y / this.bitmapSize.Value.Height) * this.canvasSize.Height)));
    }
    Rect? bitmapSize;
    Rect canvasSize;
    int isBetweenRenderingPass;
    SoftwareBitmap lastConvertedColorBitmap;
    mtPoseTrackingFrameEventArgs lastPoseEventArgs;
    mtKinectColorPoseFrameHelper helper;
    static readonly float circleRadius = 10.0f;
  }
}

I don’t think there’s too much in there that would require explanation other than that I took a couple of arbitrary decisions;

  1. I essentially process one colour frame at a time, using a form of ‘lock’ to drop any colour frames that arrive while I am still drawing the last one, where ‘drawing’ spans both the OnColorFrameArrived method and the async call to OnDraw that it causes.
  2. I don’t force a redraw when a ‘pose’ frame arrives. Instead, the data is held until the next OnDraw call, which comes from handling the colour frames. It’s certainly possible that the various race conditions involved might cause that frame to be dropped and another to replace it in the meantime.
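The ‘lock’ in decision 1 is really just an Interlocked.CompareExchange acting as a gate and, pulled out of the code above, the pattern in isolation is roughly;

```csharp
// Minimal sketch of the frame-dropping gate used above;
// only one frame is rendered at a time and any frames that
// arrive while a render is in flight are simply dropped.
int isBetweenRenderingPass; // 0 = idle, 1 = rendering

void OnFrame()
{
  // Atomically move 0 -> 1; if the value wasn't 0, a render
  // is still in progress and we drop this frame.
  if (Interlocked.CompareExchange(ref this.isBetweenRenderingPass, 1, 0) == 0)
  {
    // ...store the frame and invalidate the canvas...
  }
}
void OnDrawCompleted()
{
  // Re-open the gate for the next frame.
  Interlocked.Exchange(ref this.isBetweenRenderingPass, 0);
}
```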

Even though there are a lot of allocations going on in that code as it stands, here’s a screenshot of it running. The performance isn’t bad at all on my Surface Pro 3 and I’m particularly pleased with the red nose that I end up with here;

image

The code is quite rough and ready as I was learning as I went along and some next steps might be to;

  1. Draw joints that are inferred in a different colour to those that are properly tracked.
  2. Draw the skeleton rather than just the joints.
  3. Do quite a lot of optimisations as the code here allocates a lot.
  4. Do more tracking around entities arriving/leaving based on their IDs and handle multiple people with different colours.
  5. Refactor to specialise the mtMediaSourceReader class to have separate types for Color/Depth/Custom and thereby tidy up the code which uses this type.

but, for now, I was just trying to get some basics working.

Here’s the code on GitHub if you want to try things out and note that you’d need that additional sample code from the official samples to make it work.

Windows 10 1607, UWP, Composition APIs – Walked Through Demo Code

I’ve written a few posts about the Windows 10 composition APIs for beautiful, fluid, animated UX gathered under this URL;

Composition Posts

and today I was putting together some demo code for other purposes, so I thought I’d screen-capture what I had as a walk-through of some of the capabilities of those composition APIs, starting from a blank slate;

That’s just one of my own, unofficial walk-throughs. For the official bits, visit the team site at;

http://aka.ms/winuilabs

Enjoy!

Windows 10, 1607 and UWP – Returning to Rome for an Experiment with Remote App Services

I wanted to return to the experiment that I did with ‘Project Rome’ in this post;

Windows 10 Anniversary Update (1607) and UWP Apps – Connected Apps and Devices via Rome

where I managed to experiment with the new APIs in Windows 10 1607 which allow you to interact with your graph of devices.

If you’ve not seen ‘Rome’ and the Windows.System.RemoteSystems classes then there’s a good overview here;

Connected Apps and Devices

In that previous post, I’d managed to use the RemoteSystemWatcher class to determine which remote devices I had and then to use the related RemoteSystemConnectionRequest and RemoteLauncher classes to have code on one of my devices launch an application (Maps) on another of my devices. That post was really my own experimentation around the document here;

Launch an app on a remote device
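For context, the remote-launch piece of that earlier post boils down to something like the sketch below, where the RemoteSystem would have come from a RemoteSystemWatcher and the map URI is just an illustrative placeholder;

```csharp
// Sketch: launch a URI (here, the Maps app) on a remote device
// previously discovered via a RemoteSystemWatcher.
async Task LaunchMapsRemotelyAsync(RemoteSystem remoteSystem)
{
  var request = new RemoteSystemConnectionRequest(remoteSystem);

  // 'bingmaps:' launches the built-in Maps app; the query string
  // is just an illustrative placeholder.
  var status = await RemoteLauncher.LaunchUriAsync(
    request,
    new Uri("bingmaps:?cp=47.64~-122.14"));

  if (status != RemoteLaunchUriStatus.Success)
  {
    // The device may be offline, access may be denied, etc.
  }
}
```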

I wanted to take that further though and see if I could use another capability of ‘Rome’ which is the ability for an app on one device to invoke an app service that is available on another device. That’s what this post is about and it’s really my own experimentation around the document here;

Communicate with a remote app service
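Before getting into the scenario, the shape of that API (as I read the document – the app service name and package family name below are made up) is along these lines, with AppServiceConnection.OpenRemoteAsync() being the 1607 addition that takes a RemoteSystemConnectionRequest;

```csharp
// Sketch: invoke an app service on a remote device. The service
// name and package family name here are hypothetical placeholders.
async Task<string> CallRemoteAppServiceAsync(RemoteSystem remoteSystem)
{
  string result = null;

  using (var connection = new AppServiceConnection())
  {
    connection.AppServiceName = "com.mytest.redaction";  // hypothetical
    connection.PackageFamilyName = "MyTestApp_xyz";      // hypothetical

    // The 1607 addition: open the connection on a remote system
    // rather than locally.
    var status = await connection.OpenRemoteAsync(
      new RemoteSystemConnectionRequest(remoteSystem));

    if (status == AppServiceConnectionStatus.Success)
    {
      var message = new ValueSet();
      message["request"] = "redact";

      var response = await connection.SendMessageAsync(message);

      if (response.Status == AppServiceResponseStatus.Success)
      {
        result = response.Message["result"] as string;
      }
    }
  }
  return (result);
}
```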

In order to do that, I needed to come up with a scenario and I made up an idea that runs as follows;

  • There’s some workflow which involves redacting faces from images
  • The images to be redacted are stored in some blob container within Azure acting as a ‘queue’ of images to be worked on
  • The redacted images are to be stored in some other blob container within Azure
  • The process of downloading images, redacting them and then uploading the new images might be something that you’d want to run either locally on the device you’re working on or, sometimes, you might choose to do it remotely on another device which perhaps was less busy or had a faster/cheaper network connection.

Getting Started

Clearly, this is a fairly ‘contrived’ scenario but I wandered off into one of my Azure Storage accounts with the ‘Azure Storage Explorer’ and I made two containers named processed and unprocessed respectively;

Capture

and here’s the empty processed container;

Capture1

I then wrote a fairly clunky class on top of the NuGet package WindowsAzure.Storage which would do a few things for me;

  • Get me lists of the URIs of the blobs in the two containers.
  • Download a blob and present it back as a decoded bitmap in the form of a SoftwareBitmap
  • Upload a StorageFile to the processed container given the file and a name for the new blob
  • Delete a blob given its URI

i.e. it’s pretty much just the subset of CRUD operations that my app needs.

That class ended up looking like this and, if you take a look at it, then note that it’s hard-wired to expect JPEG images;

namespace App26
{
  using Microsoft.WindowsAzure.Storage;
  using Microsoft.WindowsAzure.Storage.Auth;
  using Microsoft.WindowsAzure.Storage.Blob;
  using System;
  using System.Collections.Generic;
  using System.IO;
  using System.Linq;
  using System.Threading.Tasks;
  using Windows.Graphics.Imaging;
  using Windows.Storage;

  public class AzurePhotoStorageManager
  {
    public AzurePhotoStorageManager(
      string azureStorageAccountName,
      string azureStorageAccountKey,
      string unprocessedContainerName = "unprocessed",
      string processedContainerName = "processed")
    {
      this.azureStorageAccountName = azureStorageAccountName;
      this.azureStorageAccountKey = azureStorageAccountKey;
      this.unprocessedContainerName = unprocessedContainerName;
      this.processedContainerName = processedContainerName;
      this.InitialiseBlobClient();
    }
    void InitialiseBlobClient()
    {
      if (this.blobClient == null)
      {
        this.storageAccount = new CloudStorageAccount(
          new StorageCredentials(this.azureStorageAccountName, this.azureStorageAccountKey),
          true);

        this.blobClient = this.storageAccount.CreateCloudBlobClient();
      }
    }
    public async Task<IEnumerable<Uri>> GetProcessedPhotoUrisAsync()
    {
      var entries = await this.GetPhotoUrisAsync(this.processedContainerName);
      return (entries);
    }
    public async Task<IEnumerable<Uri>> GetUnprocessedPhotoUrisAsync()
    {
      var entries = await this.GetPhotoUrisAsync(this.unprocessedContainerName);
      return (entries);
    }
    public async Task<SoftwareBitmap> GetSoftwareBitmapForPhotoBlobAsync(Uri storageUri)
    {
      // This may not quite be the most efficient function ever known to man 🙂
      var reference = await this.blobClient.GetBlobReferenceFromServerAsync(storageUri);
      await reference.FetchAttributesAsync();

      SoftwareBitmap bitmap = null;

      using (var memoryStream = new MemoryStream())
      {
        await reference.DownloadToStreamAsync(memoryStream);

        var decoder = await BitmapDecoder.CreateAsync(
          BitmapDecoder.JpegDecoderId,
          memoryStream.AsRandomAccessStream());

        // Going for BGRA8 and premultiplied here saves me a lot of pain later on
        // when using SoftwareBitmapSource or using CanvasBitmap from Win2D.
        bitmap = await decoder.GetSoftwareBitmapAsync(
          BitmapPixelFormat.Bgra8, BitmapAlphaMode.Premultiplied);
      }
      return (bitmap);
    }
    public async Task PutFileForProcessedPhotoBlobAsync(
      string photoName,
      StorageFile file)
    {
      var container = this.blobClient.GetContainerReference(this.processedContainerName);

      var reference = container.GetBlockBlobReference(photoName);
      
      await reference.UploadFromFileAsync(file);
    }
    public async Task<bool> DeletePhotoBlobAsync(Uri storageUri)
    {
      var container = await this.blobClient.GetBlobReferenceFromServerAsync(storageUri);
      var result = await container.DeleteIfExistsAsync();
      return (result);
    }
    async Task<IEnumerable<Uri>> GetPhotoUrisAsync(string containerName)
    {
      var uris = new List<Uri>();
      var container = this.blobClient.GetContainerReference(containerName);

      BlobContinuationToken continuationToken = null;

      do
      {
        var results = await container.ListBlobsSegmentedAsync(continuationToken);

        if (results.Results?.Count() > 0)
        {
          uris.AddRange(results.Results.Select(r => r.Uri));
        }
        continuationToken = results.ContinuationToken;

      } while (continuationToken != null);

      return (uris);
    }
    CloudStorageAccount storageAccount;
    CloudBlobClient blobClient;
    string azureStorageAccountName;
    string azureStorageAccountKey;
    string unprocessedContainerName;
    string processedContainerName;
  }
}

and is probably nothing much to write home about. I also wrote another little class which attempts to take a SoftwareBitmap, use the FaceDetector (UWP) API to find faces within it and then use Win2D.uwp to replace any faces that the FaceDetector finds with black rectangles.

For my own ease, I had the class then store the resultant bitmap into a temporary StorageFile. That class ended up looking like this;

namespace App26
{
  using Microsoft.Graphics.Canvas;
  using System;
  using System.Collections.Generic;
  using System.Linq;
  using System.Runtime.InteropServices.WindowsRuntime;
  using System.Threading.Tasks;
  using Windows.Foundation;
  using Windows.Graphics.Imaging;
  using Windows.Media.FaceAnalysis;
  using Windows.Storage;
  using Windows.UI;

  public class PhotoFaceRedactor
  {
    public async Task<StorageFile> RedactFacesToTempFileAsync(SoftwareBitmap incomingBitmap)
    {
      StorageFile tempFile = null;

      await this.CreateFaceDetectorAsync();

      // We assume our incoming bitmap format won't be supported by the face detector. 
      // We can check at runtime but I think it's unlikely.
      IList<DetectedFace> faces = null;
      var pixelFormat = FaceDetector.GetSupportedBitmapPixelFormats().First();

      using (var faceBitmap = SoftwareBitmap.Convert(incomingBitmap, pixelFormat))
      {
        faces = await this.faceDetector.DetectFacesAsync(faceBitmap);
      }
      if (faces?.Count > 0)
      {
        // We assume that our bitmap is in decent shape to be used by CanvasBitmap
        // as it should already be BGRA8 and Premultiplied alpha.
        var device = CanvasDevice.GetSharedDevice();

        using (var target = new CanvasRenderTarget(
          device,
          incomingBitmap.PixelWidth,
          incomingBitmap.PixelHeight,
          96.0f))
        {
          using (var canvasBitmap = CanvasBitmap.CreateFromSoftwareBitmap(device, incomingBitmap))
          {
            using (var session = target.CreateDrawingSession())
            {
              session.DrawImage(canvasBitmap,
                new Rect(0, 0, incomingBitmap.PixelWidth, incomingBitmap.PixelHeight));

              foreach (var face in faces)
              {
                session.FillRectangle(
                  new Rect(
                    face.FaceBox.X,
                    face.FaceBox.Y,
                    face.FaceBox.Width,
                    face.FaceBox.Height),
                  Colors.Black);
              }
            }
          }
          var fileName = $"{Guid.NewGuid()}.jpg";

          tempFile = await ApplicationData.Current.TemporaryFolder.CreateFileAsync(
            fileName, CreationCollisionOption.GenerateUniqueName);

          using (var fileStream = await tempFile.OpenAsync(FileAccessMode.ReadWrite))
          {
            await target.SaveAsync(fileStream, CanvasBitmapFileFormat.Jpeg);
          }
        }
      }
      return (tempFile);
    }
    async Task CreateFaceDetectorAsync()
    {
      if (this.faceDetector == null)
      {
        this.faceDetector = await FaceDetector.CreateAsync();
      }
    }
    FaceDetector faceDetector;
  }
}

I also wrote a static method that co-ordinated these two classes to perform the whole process of getting hold of a photo, taking out the faces in it and uploading it back to blob storage and that ended up looking like this;

namespace App26
{
  using System;
  using System.Threading.Tasks;

  static class RedactionController
  {
    public static async Task RedactPhotoAsync(Uri photoBlobUri, string newName)
    {
      var storageManager = new AzurePhotoStorageManager(
        Constants.AZURE_STORAGE_ACCOUNT_NAME,
        Constants.AZURE_STORAGE_KEY);

      var photoRedactor = new PhotoFaceRedactor();

      using (var bitmap = await storageManager.GetSoftwareBitmapForPhotoBlobAsync(photoBlobUri))
      {
        var tempFile = await photoRedactor.RedactFacesToTempFileAsync(bitmap);

        await storageManager.PutFileForProcessedPhotoBlobAsync(newName, tempFile);

        await storageManager.DeletePhotoBlobAsync(photoBlobUri);
      }
    }
  }
}

Adding in Some UI

I added in a few basic ‘ViewModels’ which surfaced this information into a UI and made something that seemed to essentially work. The UI is as below;

Capture2

and you can see the 2 lists of processed/unprocessed photos and if I click on one of the View buttons then the UI displays that photo;

Capture3

and then tapping on that photo takes it away again. If I click on one of the ‘Process’ buttons then there’s a little bit of a progress ring followed by an update to the UI which I’m quite lazy about in the sense that I simply requery all the data from Azure again. Here’s the UI after I’ve processed that particular image;

Capture4

and if I click on that bottom View button then I see;

Capture5

As an aside, the Image that is displaying things here has its Stretch property set, which is perhaps why the images look a bit odd 🙂

Without listing all the XAML and all the view model code, that got me to the point where I had my basic bit of functionality working.

What I wanted to add to this then was a little bit from ‘Project Rome’ to see if I could set this up such that this functionality could be offered as an ‘app service’ and what especially interested me about this idea was whether the app could become a client of itself in the sense that this app could choose to let the user either do this photo ‘redaction’ locally on the device they were on or remotely on another one of their devices.

Making an App Service

Making a (basic) app service is pretty easy. I simply edited my manifest to say that I was making an App Service but I thought that I’d highlight that it’s necessary (as per the official docs) to make sure that my service called PhotoRedactionService has marked itself as being available to remote systems as below;

Capture6
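
In XML terms, that piece of the manifest ends up looking something like the sketch below. The exact namespace prefixes depend on how Visual Studio has set up your manifest, so treat this as illustrative rather than copy/paste-able;

```xml
<!-- Sketch only - namespace prefixes (uap, uap3) may differ in your manifest. -->
<Extensions>
  <uap:Extension Category="windows.appService">
    <!-- SupportsRemoteSystems is what makes the service callable via 'Rome' -->
    <uap3:AppService Name="PhotoRedactionService" SupportsRemoteSystems="true" />
  </uap:Extension>
</Extensions>
```

Note that there's no EntryPoint attribute here because, as below, I'm handling the activation in-process via OnBackgroundActivated rather than in a separate WinRT component.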

and then I wrote the basics of a background task and an app service using the new mechanism that’s present in 1607 which is to override the OnBackgroundActivated method on the App class and do the background work inside of there rather than having to go off and write a completely separate WinRT component. Here’s that snippet of code;

    protected override void OnBackgroundActivated(BackgroundActivatedEventArgs args)
    {
      this.taskDeferral = args.TaskInstance.GetDeferral();
      args.TaskInstance.Canceled += OnBackgroundTaskCancelled;

      var details = args.TaskInstance.TriggerDetails as AppServiceTriggerDetails;

      if ((details != null) && (details.Name == Constants.APP_SERVICE_NAME))
      {
        this.appServiceConnection = details.AppServiceConnection;
        this.appServiceConnection.RequestReceived += OnRequestReceived;        
      }
    }
    void OnBackgroundTaskCancelled(IBackgroundTaskInstance sender, BackgroundTaskCancellationReason reason)
    {
      this.appServiceConnection?.Dispose();
      this.appServiceConnection = null;
      this.taskDeferral?.Complete();
    }
    async void OnRequestReceived(AppServiceConnection sender, AppServiceRequestReceivedEventArgs args)
    {
      var deferral = args.GetDeferral();

      var incomingUri = args.Request.Message[Constants.APP_SERVICE_URI_PARAM_NAME] as string;

      var uri = new Uri(incomingUri);

      // TODO: Move this function off the viewmodel into some utility class.
      await RedactionController.RedactPhotoAsync(uri, MainPageViewModel.UriToFileName(uri));

      deferral.Complete();
    }
    AppServiceConnection appServiceConnection;
    BackgroundTaskDeferral taskDeferral;

In that code fragment, you’ll see that all that’s happening is;

  1. We receive a background activation.
  2. We check to see if it's an ‘app service’ type of activation and, if so, whether the name of the activation matches my service (Constants.APP_SERVICE_NAME = “PhotoRedactionService”)
  3. We handle the RequestReceived event
    1. We look for a URI parameter to be passed to us (the URI of the photo to be redacted)
    2. We call into our code to do the redaction

and that’s pretty much it. I now have an app service that does ‘photo redaction’ for me and I’ve got no security or checks around it whatsoever (which perhaps isn’t the best idea!).

Adding in some ‘Rome’

In that earlier screenshot of my ‘UI’ you’d have noticed that I have a Checkbox which says whether to perform ‘Remote Processing’ or not;

Capture7

this Checkbox is simply bound to a property on a ViewModel and the ComboBox next to it is bound to an ObservableCollection<RemoteSystem> in this way;

        <ComboBox
          Margin="4"
          MinWidth="192"
          HorizontalAlignment="Center"
          ItemsSource="{x:Bind ViewModel.RemoteSystems, Mode=OneWay}"
          SelectedValue="{x:Bind ViewModel.SelectedRemoteSystem, Mode=TwoWay}">
          <ComboBox.ItemTemplate>
            <DataTemplate x:DataType="rem:RemoteSystem">
              <TextBlock
                Text="{x:Bind DisplayName}" />
            </DataTemplate>
          </ComboBox.ItemTemplate>
        </ComboBox>

The population of that list of ViewModel.RemoteSystems is pretty easy and it was something that I learned in my previous post. I simply have some code which bootstraps the process;

      var result = await RemoteSystem.RequestAccessAsync();

      if (result == RemoteSystemAccessStatus.Allowed)
      {
        this.RemoteSystems = new ObservableCollection<RemoteSystem>();
        this.remoteWatcher = RemoteSystem.CreateWatcher();
        this.remoteWatcher.RemoteSystemAdded += OnRemoteSystemAdded;
        this.remoteWatcher.Start();
      }

and then when a new RemoteSystem is added I make sure it goes into my collection;

    void OnRemoteSystemAdded(RemoteSystemWatcher sender, RemoteSystemAddedEventArgs args)
    {
      this.Dispatch(
        () =>
        {
          this.remoteSystems.Add(args.RemoteSystem);

          if (this.SelectedRemoteSystem == null)
          {
            this.SelectedRemoteSystem = args.RemoteSystem;
          }
        }
      );
    }

and so now I’ve got a list of remote systems that might be able to process an image for me.

Invoking the Remote App Service

The last step is to invoke the app service remotely and I have a method which does that for me, taking the URI of the photo blob to be processed;

    async Task RemoteRedactPhotoAsync(Uri uri)
    {
      var request = new RemoteSystemConnectionRequest(this.selectedRemoteSystem);
      using (var connection = new AppServiceConnection())
      {
        connection.AppServiceName = Constants.APP_SERVICE_NAME;

        // Strangely enough, we're trying to talk to ourselves but on another
        // machine.
        connection.PackageFamilyName = Package.Current.Id.FamilyName;
        var connectionStatus = await connection.OpenRemoteAsync(request);

        if (connectionStatus == AppServiceConnectionStatus.Success)
        {
          var valueSet = new ValueSet();
          valueSet[Constants.APP_SERVICE_URI_PARAM_NAME] = uri.ToString();
          var response = await connection.SendMessageAsync(valueSet);

          if (response.Status != AppServiceResponseStatus.Success)
          {
            // Bit naughty throwing a UI dialog from this view model
            await this.DisplayErrorAsync($"Received a response of {response.Status}");
          }
        }
        else
        {
          await this.DisplayErrorAsync($"Received a status of {connectionStatus}");
        }
      }
    }

For me, the main thing of interest here is that this code looks pretty much like any invocation of an app service except for the extra step of constructing the RemoteSystemConnectionRequest based on the RemoteSystem that the ComboBox has selected and then, on the AppServiceConnection class, using the OpenRemoteAsync() method rather than the usual OpenAsync() method.

The other thing which I think is unusual in my scenario here is that the PackageFamilyName that I set for the remote app is actually the same as the calling app because I’ve conjured up this weird scenario where my app talks to its own app service on another device.

It’s worth noting that I don’t need to have the app running on another device to invoke it, it just has to be installed.
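
For comparison, the local equivalent (which I haven't actually wired into the UI here, so this is just a sketch using the same Constants as elsewhere in the post) drops the RemoteSystemConnectionRequest and calls OpenAsync() instead;

```csharp
// Sketch only - the local (same device) flavour of calling the app service.
// LocalRedactPhotoAsync is a hypothetical name, not a method from the post.
async Task LocalRedactPhotoAsync(Uri uri)
{
  using (var connection = new AppServiceConnection())
  {
    connection.AppServiceName = Constants.APP_SERVICE_NAME;
    connection.PackageFamilyName = Package.Current.Id.FamilyName;

    // No RemoteSystemConnectionRequest needed - OpenAsync() stays on this device.
    var status = await connection.OpenAsync();

    if (status == AppServiceConnectionStatus.Success)
    {
      var valueSet = new ValueSet();
      valueSet[Constants.APP_SERVICE_URI_PARAM_NAME] = uri.ToString();

      // response.Status tells us whether the service handled the request.
      var response = await connection.SendMessageAsync(valueSet);
    }
  }
}
```

Everything else (the ValueSet, the message, the service side) stays identical, which is what makes the remote case feel so familiar.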

Wrapping Up

As is often the case, my code here is sketchy and quite rough-and-ready but I quite enjoyed putting this little experiment together because I wasn’t sure whether the ‘Rome’ APIs would;

  1. Allow an app to invoke another instance of ‘itself’ on one of the user’s other devices
  2. Make (1) difficult if it even allowed it

and I was pleasantly surprised to find that the APIs actually made it pretty easy and it’s just like invoking a regular App Service.

I need to have a longer think about what sort of scenarios this enables but I found it interesting here to toy with the idea that I can run this app on my phone, get a list of work items to be processed and then I can elect to process those work items (using the exact same app) on one of my other devices which might have better/cheaper bandwidth and/or more CPU power.

I need to think on that. In the meantime, the code is here on github if you want to play with it. Be aware that to make it run you’d need;

  1. To edit the Constants file to provide storage account name and key.
  2. To make sure that you’d created blob containers called processed and unprocessed within your storage account.
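
For reference, a sketch of what that Constants file might look like is below. The storage values are placeholders for your own account details and the APP_SERVICE_URI_PARAM_NAME value is an assumption on my part; the service name and container names are the ones used throughout the post;

```csharp
// Sketch of the Constants file - storage values are placeholders and
// APP_SERVICE_URI_PARAM_NAME's value is assumed, not taken from the repo.
static class Constants
{
  public const string AZURE_STORAGE_ACCOUNT_NAME = "yourStorageAccountName";
  public const string AZURE_STORAGE_KEY = "yourStorageAccountKey";
  public const string APP_SERVICE_NAME = "PhotoRedactionService";
  public const string APP_SERVICE_URI_PARAM_NAME = "uri";
  public const string UNPROCESSED_CONTAINER_NAME = "unprocessed";
  public const string PROCESSED_CONTAINER_NAME = "processed";
}
```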

Enjoy.