Windows 10 UWP–Migrating a Windows 8.1 App (Part 6 of N)

Following on from this growing series of posts;

Windows 10 UWP–Migrating a Windows 8.1 App (Part 1 of N where N tends to infinity)

Windows 10 UWP–Migrating a Windows 8.1 App (Part 2 of N)

Windows 10 UWP–Migrating a Windows 8.1 App (Part 3 of N)

Windows 10 UWP–Migrating a Windows 8.1 App (Part 4 of N)

Windows 10 UWP–Migrating a Windows 8.1 App (Part 5 of N)

Having managed to remove my dependency on my custom-built version of ZXing and move to a dependency on a NuGet package for WinRT, I thought it was time to take on a larger challenge and finally address the issue of camera capture.

Changing the Way in Which kwiQR Does Camera Capture

It’s hard to document exactly what steps I’ve taken in doing this because I’ve removed an awful lot of code from the app. Partly, that’s because of changes in Windows 10 over Windows 8.x but, equally, it’s because I’m 3 years older than when I originally wrote the 8.0 code and I found what I had written to be too complicated and ‘clever’ to reason about.

When the app starts to capture photos from the camera, that process needs to be cancellable because the user can navigate away from the camera capture page in the app either manually or the QR code can be recognised which causes the navigation to happen automatically.

Equally, because of the way that camera capture works, if the app’s window is taken off-screen while capturing then the capture needs to be halted and re-started when the app comes back onto the screen.

When I built the original version of the app, I wanted to make sure that it was both performant (enough) and responsive to cancellation and I was trying to glue together the output of camera capture with the input of ZXing.

Consequently, I ended up moving the whole process of grabbing photos from the camera, analysing them for QR codes and stopping when one is found into its own, isolated Task.

This, naturally, comes with its own complications around having a CancellationToken which can be used to stop that Task as/when the app needs to do that and also of having to take care in updating any UI as processing is occurring.

The general flow for this code is;

  • A UserControl hosts a CaptureElement
  • A view model builds a list of available cameras, preferring the ones which say that they are located on the rear of the device
  • A view model constructs and initialises a MediaCapture, feeds it the ID of the camera to use, links it up to be the Source of the CaptureElement and starts it previewing by StartPreviewAsync()
  • The view model enters into some loop where it grabs frames from the camera, feeds them into ZXing to see if they look like QR codes and halts the loop when it finds one or when it is told to cancel.
  • At the end of the loop, we persist any located QR code back into the app’s storage and life continues on.
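The camera-selection step in the flow above can be sketched in plain C#. This is only an illustration under stated assumptions: `CameraPanel`, `CameraInfo` and `CameraPicker` are hypothetical stand-ins (in a real UWP app the panel information would come from the device enumeration APIs), and none of this is taken from the kwiQR code-base.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the panel that a camera reports it is mounted on.
public enum CameraPanel { Unknown, Front, Back }

// Hypothetical stand-in for an enumerated camera device.
public sealed class CameraInfo
{
    public CameraInfo(string id, CameraPanel panel)
    {
        Id = id;
        Panel = panel;
    }

    public string Id { get; }
    public CameraPanel Panel { get; }
}

public static class CameraPicker
{
    // Orders cameras so that rear-facing ones come first. OrderBy is a stable
    // sort, so cameras on the same panel keep their original relative order.
    public static IReadOnlyList<CameraInfo> OrderByPreference(IEnumerable<CameraInfo> cameras)
    {
        if (cameras == null) throw new ArgumentNullException(nameof(cameras));

        return cameras
            .OrderBy(camera => camera.Panel == CameraPanel.Back ? 0 : 1)
            .ToList();
    }

    // Returns the index of the next camera, wrapping round at the end of the list.
    public static int NextIndex(int currentIndex, int cameraCount)
    {
        if (cameraCount <= 0) throw new ArgumentOutOfRangeException(nameof(cameraCount));

        return (currentIndex + 1) % cameraCount;
    }
}
```

The first entry of the ordered list would be the default camera, and `NextIndex` is one way to step through the remaining cameras when the user asks to switch.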

In the Windows 8.x code-base while I was previewing video nicely, the only mechanism that I had to grab a frame from that video was via this API call (taken from the app’s code-base);

    Task capturePhotoTask = this.captureSource.CapturePhotoToStreamAsync(props, stream).AsTask();
    capturePhotoTask.Wait(cancellationToken);

At least, that’s how I remember it. In my code, this brought back a single photo into a stream (an InMemoryRandomAccessStream as it happens) and I then had to jump through some hoops to pass that across to ZXing, specifically;

  • Take the photo stream and decode the bitmap from it into a format (I used RGBA8) using a BitmapDecoder
  • Create a raw RGB array from that to pass to ZXing.

and there was some cost to this and particularly the second part. I think that’s probably why I made the decision to introduce an additional Task and, with it, add quite a lot of complexity.
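To give a feel for where that cost comes from, here is a minimal sketch of the second step: turning decoded 4-byte RGBA pixels into a packed 3-byte RGB array. It is not the original kwiQR code. `PixelConversion.RgbaToRgb` is a hypothetical helper, and it assumes the pixel rows are tightly packed with no padding.

```csharp
using System;

public static class PixelConversion
{
    // Copies 4-byte RGBA pixels into a tightly packed 3-byte RGB array,
    // dropping the alpha channel. Every pixel is visited and a second full
    // image buffer is allocated, which is where the cost comes from.
    // Assumption: rows are not padded, so the stride is width * 4 bytes.
    public static byte[] RgbaToRgb(byte[] rgba, int width, int height)
    {
        if (rgba == null) throw new ArgumentNullException(nameof(rgba));
        if (width < 0) throw new ArgumentOutOfRangeException(nameof(width));
        if (height < 0) throw new ArgumentOutOfRangeException(nameof(height));

        int pixelCount = checked(width * height);
        if (rgba.Length < checked(pixelCount * 4))
        {
            throw new ArgumentException("Buffer is too small for the given size.", nameof(rgba));
        }

        var rgb = new byte[checked(pixelCount * 3)];

        for (int pixel = 0, src = 0, dst = 0; pixel < pixelCount; pixel++, src += 4, dst += 3)
        {
            rgb[dst] = rgba[src];         // red
            rgb[dst + 1] = rgba[src + 1]; // green
            rgb[dst + 2] = rgba[src + 2]; // blue
        }

        return rgb;
    }
}
```

Doing this for every captured photo means an allocation and a full pass over the pixels per frame, on top of the decode step before it.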

The Windows 10 code takes the same approach outlined under ‘general flow’ above but it uses a different API to grab a frame of video from the camera when it’s in previewing mode. In fact, here’s the whole function that does that work;

    public async Task ProcessAsync(CancellationToken cancellationToken)
    {
      this.Reset();

      bool processed = false;

      var previewProperties = this.captureSource.VideoDeviceController.GetMediaStreamProperties(
        MediaStreamType.VideoPreview) as VideoEncodingProperties;

      byte[] buffer = null;

      Stopwatch stopWatch = new Stopwatch();
          
      while (!processed)
      {
        cancellationToken.ThrowIfCancellationRequested();

        stopWatch.Restart();

        var videoFrame = new VideoFrame(
          BitmapPixelFormat.Bgra8,
          (int)previewProperties.Width,
          (int)previewProperties.Height);

        await this.captureSource.GetPreviewFrameAsync(videoFrame);

        using (videoFrame)
        {
          if (buffer == null)
          {
            buffer = new byte[
              4 * videoFrame.SoftwareBitmap.PixelWidth * videoFrame.SoftwareBitmap.PixelHeight];
          }
          videoFrame.SoftwareBitmap.CopyToBuffer(buffer.AsBuffer());

          processed = this.ProcessFrame(
            buffer, (int)previewProperties.Width, (int)previewProperties.Height,
            videoFrame.SoftwareBitmap.BitmapPixelFormat);
        }
        stopWatch.Stop();

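        // Elapsed.Milliseconds is only the 0-999 component of the elapsed time,
        // so pass the total elapsed milliseconds instead.
        this.UpdateCounters((int)stopWatch.ElapsedMilliseconds);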
      }
    }

The important change is that, in the awaited call inside the loop above, I’m using the new API GetPreviewFrameAsync() instead of the previous CapturePhotoToStreamAsync(). As part of making that change, I made a bunch of other changes to take away the separate processing Task and remove a lot of more complex code that dealt with making that work.

So…all of the processing that my code is responsible for here is now being done time-sliced on the Dispatcher thread via async/await rather than on a separate thread from the thread pool via a separate Task.
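The shape of that loop, stripped of the camera specifics, can be sketched as below. This is an illustration rather than the app's code: `CancellableLoop.RunAsync` and its `grabAndProcessFrameAsync` delegate are hypothetical names, with the delegate standing in for "grab a preview frame and scan it".

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableLoop
{
    // Repeatedly awaits the supplied frame-processing delegate until it
    // reports success, checking for cancellation before each iteration.
    // Returns the number of frames that were processed.
    public static async Task<int> RunAsync(
        Func<Task<bool>> grabAndProcessFrameAsync,
        CancellationToken cancellationToken)
    {
        if (grabAndProcessFrameAsync == null)
        {
            throw new ArgumentNullException(nameof(grabAndProcessFrameAsync));
        }

        int frames = 0;
        bool processed = false;

        while (!processed)
        {
            // Throws OperationCanceledException once cancellation is requested.
            cancellationToken.ThrowIfCancellationRequested();

            processed = await grabAndProcessFrameAsync();
            frames++;
        }

        return frames;
    }
}
```

A caller would typically hold on to a CancellationTokenSource, call Cancel() on it when the user navigates away or the window goes off-screen, and catch the resulting OperationCanceledException around the awaited call. Because each iteration awaits, other work on the calling thread gets a chance to run between frames.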

My initial hope for this new API was that it might offer me the promise of being able to capture a video frame from the camera and pass it all the way through ZXing for QR code scanning without duplicating the image data but I don’t think that I’ve got there.

The GetPreviewFrameAsync API populates a VideoFrame which has both a SoftwareBitmap property and a Direct3DSurface property.

The NuGet version of ZXing seems (in as much as I’ve explored its API) to require frames as either a WriteableBitmap or as a byte[].

In gluing those 2 things together I have a requirement to go from a SoftwareBitmap to either a WriteableBitmap or byte[] and, for the life of me, I couldn’t find a way to do that without copying the image data at least once.

I’m very pleased to see SoftwareBitmap in the UWP because I think the app platform has been seriously lacking in these kinds of capabilities up until now (although Win2D is a big, recent help).

However, I couldn’t find a way to get a SoftwareBitmap to give me a byte[] representing its pixels, nor could I find any relationship between SoftwareBitmap and WriteableBitmap.

That meant that I had to compromise and make what I hope is a single copy of the pixel data, as in the CopyToBuffer() call above, where I pretend that my pre-allocated byte array is really an IBuffer and ask the SoftwareBitmap to copy the pixels across into it (I do hope that the extension method AsBuffer doesn’t do some copying that I’m not aware of).

Having got those pixels into a byte[] I then pass them across into ZXing for it to scan for QR codes in that frame from the video.

Making these changes greatly simplified my code base and made it so that you didn’t have to drink strong coffee before trying to reason about what was going on but it did have a minor knock-on effect…

Fixing the App as a Share Target

The code that I’d built to decode QR codes from images on top of ZXing wasn’t just used from my camera capture page. It was also used at the point where a user tried to share an image into kwiQR from another app. kwiQR supports receiving data from other apps either in Bitmap format or as a single shared storage item (which is what the built-in Photos app shares) and it tries to decode the shared data into a QR code.

Naturally, it uses the same code for this as it uses when processing previewed video frames and so I broke the share target functionality as part of making the camera changes.

It took me a long time to fix this because of a minor glitch that I ran into along the way but, otherwise, the changes would have been fairly minor and I had the share target functionality back up and running fairly quickly once I got past that blocker.

Here’s a video of the new code-base dealing with being a share target;

Trying to do a Better Job with the Camera

In the 8.x code-base I had some fairly ‘quirky’ code which monitored the window size changing in order to attempt to rotate (and stretch) the CaptureElement such that when the user rotated their device the video would rotate accordingly in the opposite direction such that it provided a ‘see through’ view of what was on the other side of the device.

That code seemed to work but it was fiddly to debug (debugging while rotating devices can be a challenge) and I think I had a few attempts at writing it at the time. In the Windows 10 code-base I take a different approach to this.

I essentially use the DisplayInformation.OrientationChanged event to know when the display orientation changes and to work out a quick angle of rotation for the Portrait, LandscapeFlipped and PortraitFlipped variants.
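As a rough sketch of that mapping: the `Orientation` enum below is a hypothetical mirror of the display orientation values so that the example compiles outside a UWP project, and the angles are illustrative assumptions rather than the values the app necessarily uses (the right values can also depend on how the camera is mounted in the device).

```csharp
using System;

// Hypothetical mirror of the display orientation values.
public enum Orientation
{
    Landscape,
    Portrait,
    LandscapeFlipped,
    PortraitFlipped
}

public static class PreviewRotation
{
    // Maps a display orientation to a clockwise rotation in degrees.
    // Assumption: Landscape is the unrotated orientation, so it maps to 0.
    public static int ToDegrees(Orientation orientation)
    {
        switch (orientation)
        {
            case Orientation.Portrait:
                return 90;
            case Orientation.LandscapeFlipped:
                return 180;
            case Orientation.PortraitFlipped:
                return 270;
            default:
                return 0;
        }
    }
}
```

The resulting angle is the kind of value that would then be handed to the preview stream as rotation metadata.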

I then use the MediaCapture.SetEncodingPropertiesAsync API call to rotate the previewed video. The code looks like this;

    var props = _mediaCapture.VideoDeviceController.GetMediaStreamProperties(MediaStreamType.VideoPreview);
    props.Properties.Add(RotationKey, angle);
    await _mediaCapture.SetEncodingPropertiesAsync(MediaStreamType.VideoPreview, props, null);

and the important thing to know (which I had to look into the official camera samples for) is that the RotationKey there looks like this. I even copied the comment with the code from the sample to try and remember where I got it from;

    // Rotation metadata to apply to the preview stream (MF_MT_VIDEO_ROTATION)
    // Reference: http://msdn.microsoft.com/en-us/library/windows/apps/xaml/hh868174.aspx
    static readonly Guid RotationKey = new Guid("C380465D-2271-428C-9B83-ECEA3B4A85C1");

and so that took away some code that I never really wanted to write back in Windows 8.x which is great.

In terms of trying to capture QR codes, kwiQR can be pretty quick to recognise a big QR code. Here’s what the Windows 10 code looks like doing that below;

and that’s great but, in testing, I found that it was hard to get the camera to focus on QR codes that are small in size and need to be presented close up to the device.

One of the API areas that seemed very appealing was the area of being able to offer some control over the focus of the camera. For example, I’m currently making calls like;

    async Task SetFocusToMacroAsync()
    {
      if (this._mediaCapture.VideoDeviceController.FocusControl.Supported)
      {
        if (this._mediaCapture.VideoDeviceController.FocusControl.SupportedPresets.Contains(FocusPreset.AutoMacro))
        {
          await this._mediaCapture.VideoDeviceController.FocusControl.SetPresetAsync(
            FocusPreset.AutoMacro);
        }
      }
    }

in an attempt to switch the camera into macro focus mode and I may add more functionality to allow the user to take control of this. At the time of writing, though, this is something of an experiment because the Surface Pro 3 that I’m testing on doesn’t seem to support this level of focus control (I think it’s a fixed focus camera on the device). Still, I thought it was interesting to see the converged APIs here around Focus, IsoSpeed, HdrVideo, Torch, Zoom, WhiteBalance and so on. They all look to follow this pattern of asking whether the control is supported before trying it out.

I’ll return to this once I’m running the app on a device that supports this level of control and will see whether I can do more here.

Dealing with Multiple Cameras

While the app always preferred the camera that claims to be on the rear of the device, it always offered the user the possibility of switching to a different camera.

In the 8.x code-base, this was done via the app-bar – the screenshot below shows how this used to be although it never looked quite as bad as this on Windows 8.x;

image

For Windows 10, I decided to take away the app bar on this page altogether and, instead, I copied the built-in camera app which has this little button;

image

that simply toggles through your cameras and so I added my own button that did the same thing to my camera capture page;

image

and I figured out what the right glyph was to use here by copying it from the Camera app using this technique.

The Windows 8.x code also had a ‘camera options’ button which raised the CameraOptionsUI dialog. As I wrote in my original post, when I first compiled the 8.x code for UWP I found that the CameraOptionsUI class wasn’t in the UWP but was, instead, part of the Desktop Extensions SDK.

As you can see from the screenshot above, I still have that settings button but I’ve made it so that it does not appear on a device that isn’t a PC/desktop, which I’m hoping that I can do with code that ultimately binds to;


    Windows.Foundation.Metadata.ApiInformation.IsTypePresent(typeof(CameraOptionsUI).FullName)

although I’ve yet to run this code on anything other than a PC/desktop device. I will be doing so in the future to see how it works out.

Pinning Secondary Tiles

Again, in my original post I’d mentioned that the app was using deprecated APIs to add a secondary tile to the user’s start screen.

The intention of this was to allow the user to pin a navigation link to the camera capture page to their start screen such that they could launch the app straight onto that page.

In looking at this again, I decided that it was probably overkill and simply added another setting to control which page the app launches on;

image

and that seemed to be enough to me.

Other Bits and Pieces

While making these changes, I also added some code such that the app does a reasonable job of going around the suspend/terminate/re-launch cycle and tidied up some styles here and there.

In doing so, I got rid of the SuspensionManager that I’d taken from the Windows 8.x templates as it was way too complicated for what I need to achieve here.

All told, these changes probably took around 16 hours as I decided to do a bit of a rewrite on the camera capture and I got bogged down with share targets for quite a while.

Next steps here would be to look into camera focus in more detail and to start to think of trying the app on a few more devices. I’m also tempted to dig into the way in which the app does tile notifications because I think that could be improved in the light of how Windows 10 UWP has introduced the new mechanisms for adaptive tiles.