Intel RealSense Camera (F200): Facial Recognition

Adding to this set of posts, I thought that I’d see if I could get faces to be recognised rather than just located within an image, and I’m impressed by how simple the SDK makes this.

I took the approach that I’d tried in this post, where I’d defined a UI as a set of controls layered on top of each other, each pulling data from a common PXCMSenseManager object, but I amended it a little so that the code explicitly ‘pulls’ frames rather than receiving them in an event handler. The UI for my main window then remains as;

<Window x:Class="WpfApplication2.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:controls="clr-namespace:WpfApplication2.Controls"
        Title="MainWindow"
        Height="720"
        Width="1280">
  <Grid x:Name="parentGrid">
    <Grid.ColumnDefinitions>
      <ColumnDefinition />
      <ColumnDefinition Width="Auto" />
      <ColumnDefinition />
    </Grid.ColumnDefinitions>
    <Grid.RowDefinitions>
      <RowDefinition />
      <RowDefinition Height="Auto"/>
      <RowDefinition />
    </Grid.RowDefinitions>
    <controls:ColorVideoControl Grid.Row="1"
                                Grid.Column="1" />
    <controls:FaceControl Grid.Row="1"
                          Grid.Column="1" />
    <controls:RecognitionControl Grid.Row="1"
                                 Grid.Column="1" />
  </Grid>
</Window>

but the code that lives with this has changed. It achieves pretty much what it did in the reference post but it now does so by explicitly polling for frames of data from the SDK on a separate thread (via Task.Run);

namespace WpfApplication2
{
  using System;
  using System.Collections.Generic;
  using System.Windows;
  using System.Linq;
  using System.Diagnostics;
  using System.Threading;
  using System.Windows.Threading;
  using System.Threading.Tasks;

  public partial class MainWindow : Window
  {
    public MainWindow()
    {
      InitializeComponent();
      this.cancelTokenSource = new CancellationTokenSource();
      this.Loaded += OnLoaded;
      this.Closing += OnClosing;      
    }
    void OnClosing(object sender, System.ComponentModel.CancelEventArgs e)
    {
      this.cancelTokenSource.Cancel();
    }
    void OnLoaded(object sender, RoutedEventArgs e)
    {
      this.senseManager = PXCMSenseManager.CreateInstance();

      this.senseManager.captureManager.SetRealtime(false);

      this.senseManager.captureManager.FilterByStreamProfiles(
        PXCMCapture.StreamType.STREAM_TYPE_COLOR, 1280, 720, 0);

      this.InitialiseRenderers();

      this.senseManager.Init().ThrowOnFail();

      this.RunTask();
    }
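    async void RunTask()
    {
      try
      {
        // Poll the SDK on a thread-pool thread so the UI thread stays free.
        await Task.Run(async () =>
        {
          while (true)
          {
            this.cancelTokenSource.Token.ThrowIfCancellationRequested();

            // true = wait until all of the requested data is ready.
            if (this.senseManager.AcquireFrame(true).Succeeded())
            {
              try
              {
                var sample = this.senseManager.QuerySample();

                this.ForAllRenderers(r => r.ProcessSampleWorkerThread(sample));

                // Hop to the UI thread to draw, holding the frame until done.
                await this.Dispatcher.InvokeAsync(
                  () =>
                  {
                    this.ForAllRenderers(r => r.RenderUI(sample));
                  }
                );
              }
              finally
              {
                // Always hand the frame back, even if a renderer throws.
                this.senseManager.ReleaseFrame();
              }
            }
          }
        });
      }
      catch (OperationCanceledException)
      {
        // Expected when the window closes and the token is cancelled.
      }
    }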
    void InitialiseRenderers()
    {
      this.renderers = this.BuildRenderers();

      foreach (var renderer in this.renderers)
      {
        renderer.Initialise(this.senseManager);
      }
    }
    void ForAllRenderers(Action<ISampleRenderer> action)
    {
      foreach (var renderer in this.renderers)
      {
        action(renderer);
      }
    }
    List<ISampleRenderer> BuildRenderers()
    {
      List<ISampleRenderer> list = new List<ISampleRenderer>();

      foreach (var control in this.parentGrid.Children)
      {
        ISampleRenderer renderer = control as ISampleRenderer;
        if (renderer != null)
        {
          list.Add(renderer);
        }
      }
      return (list);
    }
    CancellationTokenSource cancelTokenSource;
    IEnumerable<ISampleRenderer> renderers;
    PXCMSenseManager senseManager;
  }
}

and if you were following along very closely you might spot that every implementation of ISampleRenderer has changed a little to remove the module identifier but, other than that, it’s pretty much the same code.

Rendering Video

The control that I have to render the video (called ColorVideoControl in the XAML above) is the same as it was in this post so I won’t repeat it here.

Rendering Face Location

I wrote a modified control to display the position of a face. I called that FaceControl and it’s just a Canvas;

<UserControl x:Class="WpfApplication2.Controls.FaceControl"
             xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
             xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
             xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
             xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
             mc:Ignorable="d"
             d:DesignHeight="300"
             d:DesignWidth="300">
  <Grid>
    <Canvas x:Name="canvas" />
  </Grid>
</UserControl>

and it has some code which, having asked (at the Initialise phase) for face detection to be configured for 1 face, grabs the location information from the SDK and draws a rectangle around the face;

namespace WpfApplication2.Controls
{
  using System;
  using System.Linq;
  using System.Text;
  using System.Windows.Controls;
  using System.Windows.Media;
  using System.Windows.Shapes;
  using WpfApplication2.Utility;

  public partial class FaceControl : UserControl, ISampleRenderer
  {
    public FaceControl()
    {
      InitializeComponent();
      this.uiRectangle = new Rectangle()
      {
        StrokeThickness = 2,
        Stroke = Brushes.White
      };
    }
    public void Initialise(PXCMSenseManager senseManager)
    {
      this.senseManager = senseManager;

      this.senseManager.EnableFace();

      this.faceModule = this.senseManager.QueryFace();

      using (var config = faceModule.CreateActiveConfiguration())
      {
        config.detection.isEnabled = true;
        config.detection.maxTrackedFaces = 1;

        config.ApplyChanges().ThrowOnFail();
      }
      this.faceData = this.faceModule.CreateOutput();
    }
    public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
    {
      this.faceRectangle = null;

      if (this.faceData.Update().Succeeded())
      {
        this.CaptureFaceLocation();
      }
    }
    public void RenderUI(PXCMCapture.Sample sample)
    {
      RenderFaceRectangle();
    }
    void CaptureFaceLocation()
    {
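      var faces = this.faceData.QueryFaces();
      // FirstOrDefault rather than SingleOrDefault so that we don't throw
      // if the SDK ever hands back more than one face.
      var first = faces.FirstOrDefault();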

      if (first != null)
      {
        var detection = first.QueryDetection();
        if (detection != null)
        {
          PXCMRectI32 localRectangle;

          if (detection.QueryBoundingRect(out localRectangle))
          {
            this.faceRectangle = localRectangle;
          }
        }
      }
    }
    void RenderFaceRectangle()
    {
      if (this.faceRectangle != null)
      {
        this.uiRectangle.Width = this.faceRectangle.Value.w;
        this.uiRectangle.Height = this.faceRectangle.Value.h;
        Canvas.SetLeft(this.uiRectangle, this.faceRectangle.Value.x);
        Canvas.SetTop(this.uiRectangle, this.faceRectangle.Value.y);

        if (!this.canvas.Children.Contains(this.uiRectangle))
        {
          this.canvas.Children.Add(this.uiRectangle);
        }
      }
      else if (this.canvas.Children.Contains(this.uiRectangle))
      {
        this.canvas.Children.Remove(this.uiRectangle);
      }
    }
    Rectangle uiRectangle;
    PXCMRectI32? faceRectangle;
    PXCMFaceData faceData;
    PXCMFaceModule faceModule;
    PXCMSenseManager senseManager;
  }
}

This isn’t radically different from what I did in the earlier post, where I was also asking for facial landmarks and plotting them on the face, but I’ve held back from adding that to the code this time around because I wanted to do…

Facial Recognition

The last control in use here is one that I called RecognitionControl and it’s just a Canvas and a couple of TextBlocks;

<UserControl x:Class="WpfApplication2.Controls.RecognitionControl"
             xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
             xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
             xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
             xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
             mc:Ignorable="d"
             d:DesignHeight="300"
             d:DesignWidth="300">
  <Grid>
    <TextBlock x:Name="txtRecognitionStatus"
               Foreground="White"
               FontSize="48"
               HorizontalAlignment="Left"
               VerticalAlignment="Bottom" />
    <TextBlock x:Name="txtDbStatus"
               Foreground="White"
               FontSize="24"
               HorizontalAlignment="Right"
               VerticalAlignment="Bottom" />
  </Grid>
</UserControl>

and the code behind this is a little more involved but not that involved given what it’s doing for me;

namespace WpfApplication2.Controls
{
  using System.Windows.Controls;
  using System.Linq;
  using System.IO;
  using WpfApplication2.Utility;
  using System;

  public partial class RecognitionControl : UserControl, ISampleRenderer
  {
    public RecognitionControl()
    {
      InitializeComponent();
      this.currentUserId = -1;
    }
    public void Initialise(PXCMSenseManager senseManager)
    {
      this.senseManager = senseManager;

      this.senseManager.EnableFace();

      using (var faceModule = this.senseManager.QueryFace())
      {
        using (var faceModuleConfig = faceModule.CreateActiveConfiguration())
        {
          faceModuleConfig.SetTrackingMode(PXCMFaceConfiguration.TrackingModeType.FACE_MODE_COLOR_PLUS_DEPTH);

          var recognitionConfig = faceModuleConfig.QueryRecognition();
          recognitionConfig.Enable();

          faceModuleConfig.EnableAllAlerts();

          faceModuleConfig.SubscribeAlert(alert =>
            {
              switch (alert.label)
              {
                case PXCMFaceData.AlertData.AlertType.ALERT_NEW_FACE_DETECTED:
                  this.userHasBeenRegistered = false;
                  this.currentUserId = -1;
                  this.frameCount = 0;
                  break;
                default:
                  break;
              }
            }
          );

          PXCMFaceConfiguration.RecognitionConfiguration.RecognitionStorageDesc storageDesc =
            new PXCMFaceConfiguration.RecognitionConfiguration.RecognitionStorageDesc();

          storageDesc.isPersistent = true;
          storageDesc.maxUsers = 50;

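          // Massive note - this function call returns 'UNSUPPORTED' and yet it seems
          // to work... Also note that storageDesc is passed as an 'out' parameter,
          // so the values assigned above are replaced by whatever the call returns.
          recognitionConfig.CreateStorage(DATA_STORE, out storageDesc);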

          // So does this one...
          recognitionConfig.UseStorage(DATA_STORE);

          // Load database if persisted
          txtDbStatus.Text =
            recognitionConfig.ReadDatabaseFromFile(DATA_STORE_FILE) ? "loaded DB" : "new DB";

          recognitionConfig.SetRegistrationMode(
            PXCMFaceConfiguration.RecognitionConfiguration.RecognitionRegistrationMode.REGISTRATION_MODE_ON_DEMAND);

          faceModuleConfig.ApplyChanges().ThrowOnFail();

          this.faceData = faceModule.CreateOutput();
        }
      }
    }
    public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
    {
      this.visibleFaces = false;

      if (this.faceData.Update().Succeeded())
      {
        var count = this.faceData.QueryNumberOfDetectedFaces();

        if (count > 0)
        {
          this.visibleFaces = true;

          var face = this.faceData.QueryFaceByIndex(0);

          var recognition = face.QueryRecognition();

          // I find that if a user hasn't been registered then I can end up
          // calling RegisterUser() repeatedly for them so this check is
          // intended to say "don't register the user until we've had a
          // decent (300 frame) look at them and don't register them
          // more than once".
          if (!recognition.IsRegistered() && !this.userHasBeenRegistered && 
            (++this.frameCount > USER_STABILISATION_FRAME_COUNT))
          {
            recognition.RegisterUser();

            this.userHasBeenRegistered = true;

            var recogModule = this.faceData.QueryRecognitionModule();
            recogModule.WriteDatabaseToFile(DATA_STORE_FILE);
          }
          else
          {
            this.currentUserId = recognition.QueryUserID();
          }
        }
        else
        {
          this.frameCount = 0;
        }
      }
    }
    public void RenderUI(PXCMCapture.Sample sample)
    {
      if (!this.visibleFaces)
      {
        this.txtRecognitionStatus.Text = "no faces";
      }
      else
      {
        this.txtRecognitionStatus.Text =
          string.Format("user id [{0}]",
          this.currentUserId == -1 ? "unidentified" : this.currentUserId.ToString()); 
      }
    }
    static readonly string DATA_STORE_FILE = "TestStore.bin";
    static readonly string DATA_STORE = "TestStore";
    const int USER_STABILISATION_FRAME_COUNT = 300;
    PXCMFaceData faceData;
    PXCMSenseManager senseManager;
    int currentUserId;
    bool userHasBeenRegistered;
    bool visibleFaces;
    int frameCount;
  }
}

with the general structure here being;

  • at Initialise time we make sure that we have enabled the face module and the recognition module and also all alerts (see post on alerts).
  • also at Initialise time we use the PXCMFaceConfiguration.RecognitionConfiguration type in order to set up a little data store for up to 50 users’ worth of recognition data. That’s via the CreateStorage and UseStorage methods.
  • finally at Initialise time we replace the contents of that data store from a file named TestStore.bin in the file system if it happens to exist from a previous session of the app so as to provide a persistent store of users across invocations of the app.
  • as frames arrive from the SDK we then
    • update our UI based on whether we see any faces in there or not
    • when we do have a face, we query the recognition data to see if we have a face that has been recognised and either;
      • already exists in our registration database
      • needs adding to our registration database in which case we’ll add it and save the database out to disk for the next time the app runs.
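
The listing above doesn’t show ReadDatabaseFromFile and WriteDatabaseToFile; presumably they live in the WpfApplication2.Utility namespace alongside ThrowOnFail. Purely as a rough sketch of the file handling, with the SDK calls that get and set the database bytes replaced by a plain delegate and a byte array (DatabaseFile, TryLoad and Save are made-up names, not SDK or app code), the load-if-present and save steps might look something like;

```csharp
using System;
using System.IO;

// Hypothetical helper, not taken from the app or the RealSense SDK.
// The SDK side is reduced to "some bytes" so that this compiles on its own.
public static class DatabaseFile
{
  // Returns true if the file existed and its bytes were handed to 'apply',
  // false if there was no file (i.e. we are starting with a new database).
  public static bool TryLoad(string path, Action<byte[]> apply)
  {
    if (!File.Exists(path))
    {
      return false;
    }
    apply(File.ReadAllBytes(path));
    return true;
  }

  // Overwrites any previous copy of the database with the latest bytes.
  public static void Save(string path, byte[] buffer)
  {
    File.WriteAllBytes(path, buffer);
  }
}
```

The bool result is what would drive the "loaded DB" / "new DB" text, and Save would be called straight after a successful registration.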

In the registration step I found that I could get into a situation where a face ‘arrives’ at the camera, isn’t immediately recognised, and so needs registering into the database.

Without care, it seemed that I could end up being ‘overly keen’ to register what appeared to be a new user and register them many tens of times. The unique identifiers seem to start at 100 and tick upwards, so I could find that the same user became identified as 100, 101, 102… until things stabilised.

Consequently, I have a little bit of logic in the code that tries to make sure that;

  • when a new face arrives, we only attempt to register it once
  • when a new face arrives, we do not attempt to register it until it’s had “a little time” (300 frames of data) in front of the sensor just in case the SDK is going to change its mind about whether it has/hasn’t seen that face before.

This seemed to work out reasonably well for me, but it comes more from experimentation than from anything else, so apply a pinch of salt.
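
To make the two registration rules easier to see in isolation, here is a self-contained sketch of the same gating logic pulled out into a small class. It is not code from the app: RegistrationGate and its member names are invented for illustration, and the SDK’s answer to "is this face already registered?" is reduced to a single bool passed in on each frame;

```csharp
using System;

// Hypothetical helper (not part of the app or the RealSense SDK) that
// decides, frame by frame, whether now is the moment to register a face.
public sealed class RegistrationGate
{
  readonly int stabilisationFrames;
  int frameCount;
  bool registered;

  public RegistrationGate(int stabilisationFrames)
  {
    this.stabilisationFrames = stabilisationFrames;
  }

  // Mirrors the 'new face detected' alert: start again from scratch.
  public void OnNewFace()
  {
    this.frameCount = 0;
    this.registered = false;
  }

  // Mirrors the 'no faces in this frame' branch: restart the count only.
  public void OnNoFace()
  {
    this.frameCount = 0;
  }

  // Call once per frame in which a face is visible. Returns true exactly
  // once per face: on the first frame beyond the stabilisation period on
  // which the face is still unrecognised.
  public bool ShouldRegister(bool isAlreadyRegistered)
  {
    if (isAlreadyRegistered || this.registered)
    {
      return false;
    }
    if (++this.frameCount <= this.stabilisationFrames)
    {
      return false;
    }
    this.registered = true;
    return true;
  }
}
```

With a stabilisation count of 300, as in the control above, ShouldRegister would first return true on the 301st consecutive frame in which the face is visible but unrecognised, and would then stay false until OnNewFace is called.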

Here’s a quick video of me sitting in front of the camera and trying it out. You’ll see that in the first instance we have a brand new database and there’s a little delay (300 frames) while the code recognises me and registers me. You’ll then see that when I run the app for the second time the framework quite quickly realises that I’m already in its loaded database and recognises me.

If you want the code, it’s here for download.