Adding to this set of posts, I thought that I’d see if I could do something with the facial capabilities of the F200 camera, although I’m not sure that I’m ready yet to explore the recognition aspects of that – I’ll leave those to a later post.
In order to do this, I reworked a WPF UI so that it became a ‘container’ for a number of controls. That UI is as below;
<Window x:Class="WpfApplication2.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:controls="clr-namespace:WpfApplication2.Controls"
Title="MainWindow"
Height="350"
Width="525">
<Grid x:Name="parentGrid">
<Grid.ColumnDefinitions>
<ColumnDefinition />
<ColumnDefinition Width="Auto" />
<ColumnDefinition />
</Grid.ColumnDefinitions>
<Grid.RowDefinitions>
<RowDefinition />
<RowDefinition Height="Auto"/>
<RowDefinition />
</Grid.RowDefinitions>
<controls:ColorVideoControl Grid.Row="1" Grid.Column="1"/>
<controls:EmotionControl Grid.Row="1" Grid.Column="1"/>
<controls:FaceControl Grid.Row="1"
Grid.Column="1" />
</Grid>
</Window>
I’ve taken an admittedly simple and not too performant approach here of ‘layering’ data from the RealSense camera such that I have 3 controls that pick up and display that data without much awareness of each other (hence the possibly poor performance);
- ColorVideoControl
- EmotionControl
- FaceControl
and those are just simple UserControls in WPF that implement an interface;
namespace WpfApplication2
{
interface ISampleRenderer
{
void Initialise(PXCMSenseManager senseManager);
void ProcessSampleWorkerThread(PXCMCapture.Sample sample);
void RenderUI(PXCMCapture.Sample sample);
int ModuleId { get; }
}
}
with the idea being that the grid in my MainWindow can contain any number of these controls and can talk to them via this interface in order to feed them data from a single PXCMSenseManager instance and get them to display various aspects of that data.
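Just to show the shape of that, here’s a minimal, hypothetical renderer – the FrameCounterControl below is purely made up for this sketch and just counts frames – which would get picked up and fed data simply by dropping an instance into the parentGrid;
namespace WpfApplication2.Controls
{
    using System.Windows.Controls;

    // Illustrative only - counts frames and displays the count as its content.
    // Returning -1 from ModuleId means 'give me samples from every module',
    // which is what ColorVideoControl does below.
    public class FrameCounterControl : UserControl, ISampleRenderer
    {
        public int ModuleId
        {
            get { return (-1); }
        }
        public void Initialise(PXCMSenseManager senseManager)
        {
        }
        public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
        {
            this.frameCount++;
        }
        public void RenderUI(PXCMCapture.Sample sample)
        {
            this.Content = this.frameCount.ToString();
        }
        int frameCount;
    }
}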
The code behind the main window then becomes quite generic;
namespace WpfApplication2
{
using System;
using System.Collections.Generic;
using System.Windows;
using System.Linq;
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
this.Loaded += OnLoaded;
}
void OnLoaded(object sender, RoutedEventArgs e)
{
this.senseManager = PXCMSenseManager.CreateInstance();
this.senseManager.captureManager.SetRealtime(false);
this.senseManager.captureManager.FilterByStreamProfiles(
PXCMCapture.StreamType.STREAM_TYPE_COLOR, 1280, 720, 0);
this.InitialiseRenderers();
// this will fail unless we have at least one control
// in the renderer list which switches on some kind
// of modular data.
this.senseManager.Init(
new PXCMSenseManager.Handler()
{
onModuleProcessedFrame = this.OnModuleProcessedFrame
}).ThrowOnFail();
this.senseManager.StreamFrames(false);
}
void InitialiseRenderers()
{
this.renderers = this.BuildRenderers();
foreach (var renderer in this.renderers)
{
renderer.Initialise(this.senseManager);
}
}
void ForAllRenderers(int moduleId, Action<ISampleRenderer> action)
{
foreach (var renderer in this.renderers.Where(
r => (r.ModuleId == -1) || (r.ModuleId == moduleId)))
{
action(renderer);
}
}
List<ISampleRenderer> BuildRenderers()
{
List<ISampleRenderer> list = new List<ISampleRenderer>();
foreach (var control in this.parentGrid.Children)
{
ISampleRenderer renderer = control as ISampleRenderer;
if (renderer != null)
{
list.Add(renderer);
}
}
return (list);
}
pxcmStatus OnModuleProcessedFrame(int mid, PXCMBase module, PXCMCapture.Sample sample)
{
ForAllRenderers(
mid,
r => r.ProcessSampleWorkerThread(sample));
Dispatcher.InvokeAsync(() =>
{
ForAllRenderers(mid, r => r.RenderUI(sample));
}
);
return (pxcmStatus.PXCM_STATUS_NO_ERROR);
}
IEnumerable<ISampleRenderer> renderers;
PXCMSenseManager senseManager;
}
}
and so this code follows a fairly simple pattern;
- Initialise PXCMSenseManager
- Ensure that the color stream is 1280×720.
- Take an event based approach to the data by handling the OnModuleProcessedFrame ‘event’.
- If you’ve looked at any of my previous posts, you’d know that this one is new to me and comes from using modules to process the data rather than just gathering the raw streams which arrive via the OnNewSample event. What I liked about this approach is that (it seems) the module data also carries the color data, so these frames are sync’d.
- Build a list of the controls that are parented by my parentGrid Grid that implement my ISampleRenderer interface and ask them to initialise themselves.
- As data arrives into my OnModuleProcessedFrame method, find any renderers whose ModuleId matches the passed moduleId (or is -1) and pass the data to them in two ways;
- Once on the calling thread by using ProcessSampleWorkerThread
- Once on the UI thread by using RenderUI
I’d have to say that, at the time of writing, I’m not at all sure that I have the ‘re-entrancy/threading’ aspects of this 2-phase approach inside my OnModuleProcessedFrame method right. It’s a work in progress because, clearly, I’m taking a ‘fire and forget’ approach to the call to RenderUI and it’s more than possible that a 2nd frame arrives while I’m still processing the first one, so I more than likely need to revisit that code. There’s also the question of the various modules delivering frames at different frequencies, so this all needs more work.
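One way I might firm that up (this is just a sketch on my part, nothing RealSense-specific – it assumes an extra renderPending field and a using directive for System.Threading) would be to simply drop a frame’s UI update whenever the previous one hasn’t finished yet;
pxcmStatus OnModuleProcessedFrame(int mid, PXCMBase module, PXCMCapture.Sample sample)
{
    ForAllRenderers(mid, r => r.ProcessSampleWorkerThread(sample));

    // Only queue a UI update if the previous one has completed - otherwise
    // skip rendering this frame rather than letting work pile up behind
    // the dispatcher.
    if (Interlocked.CompareExchange(ref this.renderPending, 1, 0) == 0)
    {
        Dispatcher.InvokeAsync(() =>
        {
            try
            {
                ForAllRenderers(mid, r => r.RenderUI(sample));
            }
            finally
            {
                Interlocked.Exchange(ref this.renderPending, 0);
            }
        });
    }
    return (pxcmStatus.PXCM_STATUS_NO_ERROR);
}
int renderPending;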
Layered on top of this I have my ColorVideoControl which simply contains an Image;
<UserControl x:Class="WpfApplication2.Controls.ColorVideoControl"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
mc:Ignorable="d"
d:DesignHeight="300"
d:DesignWidth="300">
<Grid>
<Image x:Name="displayImage" />
</Grid>
</UserControl>
and its implementation of ISampleRenderer which is really just a re-working of code I’ve used in the previous posts;
namespace WpfApplication2.Controls
{
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Media.Imaging;
public partial class ColorVideoControl : UserControl, ISampleRenderer
{
public ColorVideoControl()
{
InitializeComponent();
}
public int ModuleId
{
get
{
return (-1);
}
}
public void Initialise(PXCMSenseManager senseManager)
{
}
public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
{
this.currentColorImage = null;
PXCMImage.ImageData colorImage;
if (sample.color.AcquireAccess(PXCMImage.Access.ACCESS_READ,
PXCMImage.PixelFormat.PIXEL_FORMAT_RGB32, out colorImage).Succeeded())
{
this.InitialiseImageDimensions(sample.color);
this.currentColorImage = colorImage;
}
}
public void RenderUI(PXCMCapture.Sample sample)
{
if (this.currentColorImage != null)
{
this.InitialiseImage();
this.writeableBitmap.WritePixels(
this.imageDimensions,
this.currentColorImage.planes[0],
this.imageDimensions.Width * this.imageDimensions.Height * 4,
this.imageDimensions.Width * 4);
sample.color.ReleaseAccess(this.currentColorImage);
this.currentColorImage = null;
}
}
void InitialiseImageDimensions(PXCMImage image)
{
if (!this.imageDimensions.HasArea)
{
this.imageDimensions.Width = image.info.width;
this.imageDimensions.Height = image.info.height;
}
}
void InitialiseImage()
{
if (this.writeableBitmap == null)
{
this.writeableBitmap = new WriteableBitmap(
this.imageDimensions.Width,
this.imageDimensions.Height,
96,
96,
PixelFormats.Bgra32,
null);
this.displayImage.Source = this.writeableBitmap;
}
}
PXCMImage.ImageData currentColorImage;
Int32Rect imageDimensions;
WriteableBitmap writeableBitmap;
}
}
so there’s nothing new in this post that wasn’t in the previous posts except that I’m now receiving this data as part of an OnModuleProcessedFrame handler rather than an OnNewSample handler.
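For comparison, the raw-stream wiring from those earlier posts hangs off the onNewSample member of the same Handler class and would look something like this sketch;
this.senseManager.Init(
    new PXCMSenseManager.Handler()
    {
        // raw streams - fires as captured samples arrive rather than per module
        onNewSample = (mid, sample) =>
        {
            // work with sample.color, sample.depth and so on here
            return (pxcmStatus.PXCM_STATUS_NO_ERROR);
        }
    }).ThrowOnFail();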
Capturing Emotion
What is new is the addition of what my EmotionControl does for me. This one is just a big TextBlock;
<UserControl x:Class="WpfApplication2.Controls.EmotionControl"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
mc:Ignorable="d"
d:DesignHeight="300"
d:DesignWidth="300">
<Grid>
<TextBlock Foreground="White"
HorizontalAlignment="Left"
VerticalAlignment="Top"
FontSize="48"
x:Name="txtEmotion"
FontFamily="Segoe UI"
Margin="10"/>
<Canvas x:Name="canvas" />
</Grid>
</UserControl>
and then the implementation of ISampleRenderer that lives with that ‘UI’;
namespace WpfApplication2.Controls
{
using System.Collections.Generic;
using System.Windows.Controls;
using System.Linq;
using System.Windows.Shapes;
using System.Windows.Media;
using WpfApplication2.Utility;
public partial class EmotionControl : UserControl, ISampleRenderer
{
public EmotionControl()
{
InitializeComponent();
}
public int ModuleId
{
get
{
return (PXCMEmotion.CUID);
}
}
public void Initialise(PXCMSenseManager senseManager)
{
this.senseManager = senseManager;
this.senseManager.EnableEmotion();
}
public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
{
this.emotions = new Dictionary<int, PXCMEmotion.EmotionData>();
using (var emotion = this.senseManager.QueryEmotion())
{
if (emotion != null)
{
var faceCount = emotion.QueryNumFaces();
for (int face = 0; face < faceCount; face++)
{
PXCMEmotion.EmotionData[] emotionData;
if (emotion.QueryAllEmotionData(face, out emotionData).Succeeded())
{
var candidate =
emotionData
.Where(
e => ((e.eid <= PXCMEmotion.Emotion.EMOTION_PRIMARY_SURPRISE) &&
(e.evidence > MIN_EVIDENCE_VALUE))
)
.OrderByDescending(e => e.evidence) // strongest evidence first
.FirstOrDefault();
if (candidate != null)
{
this.emotions[face] = candidate;
}
}
}
}
}
}
public void RenderUI(PXCMCapture.Sample sample)
{
List<string> displayItems = new List<string>();
this.txtEmotion.Text = string.Empty;
// Only going to do something with the first face for the moment.
if (this.emotions.ContainsKey(0))
{
var emotion = this.emotions[0];
this.txtEmotion.Text = string.Format("{0}, intensity ({1:N2})",
emotion.GetName(),
emotion.intensity);
Rectangle rectangle;
if (this.canvas.Children.Count > 0)
{
rectangle = (Rectangle)this.canvas.Children[0];
}
else
{
rectangle = new Rectangle()
{
StrokeThickness = 1,
Stroke = Brushes.White
};
this.canvas.Children.Add(rectangle);
}
rectangle.Width = emotion.rectangle.w;
rectangle.Height = emotion.rectangle.h;
Canvas.SetLeft(rectangle, emotion.rectangle.x);
Canvas.SetTop(rectangle, emotion.rectangle.y);
}
else
{
this.canvas.Children.Clear();
}
}
const int MIN_EVIDENCE_VALUE = 0;
PXCMSenseManager senseManager;
Dictionary<int, PXCMEmotion.EmotionData> emotions;
}
}
and so this code queries the PXCMEmotion module from the PXCMSenseManager and then calls QueryAllEmotionData() which returns a set of PXCMEmotion.EmotionData;
public class EmotionData
{
public PXCMEmotion.Emotion eid;
public PXCMEmotion.Emotion emotion;
public int evidence;
public int fid;
public float intensity;
public PXCMRectI32 rectangle;
public long timeStamp;
public EmotionData();
}
and the PXCMEmotion.Emotion is a bit mask;
public enum Emotion
{
EMOTION_PRIMARY_ANGER = 1,
EMOTION_PRIMARY_CONTEMPT = 2,
EMOTION_PRIMARY_DISGUST = 4,
EMOTION_PRIMARY_FEAR = 8,
EMOTION_PRIMARY_JOY = 16,
EMOTION_PRIMARY_SADNESS = 32,
EMOTION_PRIMARY_SURPRISE = 64,
EMOTION_SENTIMENT_POSITIVE = 65536,
EMOTION_SENTIMENT_NEGATIVE = 131072,
EMOTION_SENTIMENT_NEUTRAL = 262144,
}
where I think that (as the names suggest) the values up to SURPRISE are the primary emotions and then the top 3 values can be combined with those in order to give some sense of ‘sentiment’. I’m not entirely sure how you have NEUTRAL/FEAR and so on but there you go.
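If a single value did combine a primary emotion with a sentiment in that way then, as a sketch (the masks here are my own guess based on the values above), the two halves could be pulled apart with some simple bit twiddling;
// Sketch only - assumes primary emotions sit in the low 16 bits and
// sentiments in the high bits, as the values above suggest.
PXCMEmotion.Emotion combined =
    PXCMEmotion.Emotion.EMOTION_PRIMARY_JOY |
    PXCMEmotion.Emotion.EMOTION_SENTIMENT_POSITIVE;

var primary = combined & (PXCMEmotion.Emotion)0x0000FFFF;    // EMOTION_PRIMARY_JOY
var sentiment = combined & (PXCMEmotion.Emotion)~0x0000FFFF; // EMOTION_SENTIMENT_POSITIVE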
The RenderUI function simply chooses the emotions associated with the first face and displays them in a text block. It also uses the rectangle that’s provided (which seems to be in 2D co-ords to match the image) to draw a white rectangle around the face.
So, a tiny bit of code with the emotion module and I can tell whether someone is expressing ‘disgust’ at my software, and I can find their face in the video frame in order to direct my remote-controlled robot arm to squirt them with a water pistol.
Capturing Face
I also wrote this little FaceControl user control which introduces yet another XAML Canvas to draw on (hey, why use one Canvas when you can use many?).
Here’s the UI portion;
<UserControl x:Class="WpfApplication2.Controls.FaceControl"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
mc:Ignorable="d"
d:DesignHeight="300"
d:DesignWidth="300">
<Grid>
<Canvas x:Name="canvas" />
<TextBlock x:Name="txtPulse"
FontSize="48"
HorizontalAlignment="Left"
VerticalAlignment="Bottom"
Foreground="Red"
FontFamily="Segoe UI"
Margin="10"/>
<TextBlock x:Name="txtExpressions"
FontSize="24"
HorizontalAlignment="Right"
VerticalAlignment="Top"
Foreground="Red"
FontFamily="Segoe UI"
Margin="10"
TextAlignment="Right"/>
</Grid>
</UserControl>
and so it’s just a Canvas and a bunch of text blocks and here’s the code that deals with the data;
namespace WpfApplication2.Controls
{
using System;
using System.Linq;
using System.Text;
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Shapes;
using WpfApplication2.Utility;
public partial class FaceControl : UserControl, ISampleRenderer
{
public FaceControl()
{
InitializeComponent();
}
public int ModuleId
{
get { return (PXCMFaceModule.CUID); }
}
public void Initialise(PXCMSenseManager senseManager)
{
this.senseManager = senseManager;
// TODO: I'm not so sure about which objects I'm meant to keep around here
// From experimentation it seemed that I needed to keep a handle on the
// PXCMFaceData otherwise I don't seem to get any data when I query later
// on (i.e. I can't keep creating/disposing it). I'm unsure about the
// other objects but I found that I seem to be able to dispose the
// config once I'm done with it.
this.senseManager.EnableFace();
this.faceModule = this.senseManager.QueryFace();
using (var config = faceModule.CreateActiveConfiguration())
{
config.detection.isEnabled = true;
config.detection.maxTrackedFaces = 1;
config.landmarks.isEnabled = true;
config.landmarks.maxTrackedFaces = 1;
var pulseConfig = config.QueryPulse();
pulseConfig.properties.maxTrackedFaces = 1;
pulseConfig.Enable();
var expressionConfig = config.QueryExpressions();
expressionConfig.EnableAllExpressions();
expressionConfig.Enable();
config.ApplyChanges().ThrowOnFail();
}
this.faceData = this.faceModule.CreateOutput();
}
public void ProcessSampleWorkerThread(PXCMCapture.Sample sample)
{
this.heartRate = 0.0f;
this.landmarks = null;
this.expressionDescription = string.Empty;
if (this.faceData.Update().Succeeded())
{
var faces = this.faceData.QueryFaces();
var first = faces.FirstOrDefault();
if (first != null)
{
// pulse
this.heartRate = first.QueryPulse().QueryHeartRate();
// facial landmarks
PXCMFaceData.LandmarkPoint[] localLandmarks;
var landmarks = first.QueryLandmarks();
if ((landmarks != null) && landmarks.QueryPoints(out localLandmarks))
{
this.landmarks = localLandmarks;
}
// facial expressions
var expressions = first.QueryExpressions();
if (expressions != null)
{
PXCMFaceData.ExpressionsData.FaceExpressionResult result;
StringBuilder builder = new StringBuilder();
foreach (PXCMFaceData.ExpressionsData.FaceExpression value in
Enum.GetValues(typeof(PXCMFaceData.ExpressionsData.FaceExpression)))
{
if (expressions.QueryExpression ( value, out result ) &&
(result.intensity > MIN_INTENSITY))
{
builder.AppendFormat(
"{0}{1} ({1:G2})",
builder.Length == 0 ? string.Empty : Environment.NewLine,
value.GetName(),
result.intensity);
}
}
this.expressionDescription = builder.ToString();
}
}
}
}
public void RenderUI(PXCMCapture.Sample sample)
{
this.canvas.Children.Clear();
this.txtPulse.Text =
string.Format("{0} bpm", this.heartRate > 0 ? this.heartRate.ToString() : "N/A");
if (this.landmarks != null)
{
foreach (var landmark in this.landmarks.Where(l => l.confidenceImage > MIN_CONFIDENCE))
{
this.canvas.Children.Add(this.MakeEllipseAtImagePoint(landmark.image));
}
}
this.txtExpressions.Text = this.expressionDescription;
}
Ellipse MakeEllipseAtImagePoint(PXCMPointF32 point)
{
Ellipse ellipse = new Ellipse()
{
Width = LANDMARK_ELLIPSE_WIDTH,
Height = LANDMARK_ELLIPSE_WIDTH,
Fill = Brushes.Red
};
Canvas.SetLeft(ellipse, point.x);
Canvas.SetTop(ellipse, point.y);
return (ellipse);
}
const int MIN_INTENSITY = 80;
const int MIN_CONFIDENCE = 50;
const int LANDMARK_ELLIPSE_WIDTH = 3;
string expressionDescription;
PXCMFaceData.LandmarkPoint[] landmarks;
float heartRate;
PXCMFaceData faceData;
PXCMFaceModule faceModule;
PXCMSenseManager senseManager;
}
}
I felt a lot more shaky on the object model here because it seems that I have to deal with the PXCMFaceModule but then there’s also some PXCMFaceConfiguration which I attempt to set up to;
- ask for face detection of 1 face
- ask for facial ‘landmarks’ (i.e. interesting facial points) to be captured
- ask for a pulse/heartRate estimate to be captured
- ask for all facial expressions to be captured
Once that is set up, it seems that I need to get hold of a PXCMFaceData instance by calling the CreateOutput() member on the PXCMFaceModule and then, as frames arrive, the order of the day is to call an Update() method on that object.
This wasn’t very intuitive to me and I’m not at all sure that I have it right beyond “it seems to work”. It didn’t seem quite in step with the way I’ve gone about getting data so far but, nonetheless, you can see that in my ProcessSampleWorkerThread method I essentially;
- Use the PXCMFaceData.QueryPulse method to see if I can get an estimate of the pulse rate (of the first face)
- Use QueryLandmarks to return what seem to be ~80 points of interest on the face
- Use QueryExpressions to return which of the following expressions are visible and with what intensity;
public enum FaceExpression
{
EXPRESSION_BROW_RAISER_LEFT = 0,
EXPRESSION_BROW_RAISER_RIGHT = 1,
EXPRESSION_BROW_LOWERER_LEFT = 2,
EXPRESSION_BROW_LOWERER_RIGHT = 3,
EXPRESSION_SMILE = 4,
EXPRESSION_KISS = 5,
EXPRESSION_MOUTH_OPEN = 6,
EXPRESSION_EYES_CLOSED_LEFT = 7,
EXPRESSION_EYES_CLOSED_RIGHT = 8,
EXPRESSION_HEAD_TURN_LEFT = 9,
EXPRESSION_HEAD_TURN_RIGHT = 10,
EXPRESSION_HEAD_UP = 11,
EXPRESSION_HEAD_DOWN = 12,
EXPRESSION_HEAD_TILT_LEFT = 13,
EXPRESSION_HEAD_TILT_RIGHT = 14,
EXPRESSION_EYES_TURN_LEFT = 15,
EXPRESSION_EYES_TURN_RIGHT = 16,
EXPRESSION_EYES_UP = 17,
EXPRESSION_EYES_DOWN = 18,
EXPRESSION_TONGUE_OUT = 19,
EXPRESSION_PUFF_RIGHT = 20,
EXPRESSION_PUFF_LEFT = 21,
}
The RenderUI function then draws ellipses to the screen for each of the facial landmarks and updates some text blocks with the details of the captured pulse rate and expressions.
Bringing that Together
Pulling together that sketchy code then gives me a screen that displays colour video, emotional data, expression data, a bounding rectangle for a single recognised face along with ~80 landmarks around that face.
It looks like this, where the white items are coming from the EmotionControl and the red items are coming from the FaceControl – this was my attempt to do a ‘FEAR face’ which seemed to turn into more of a ‘STUPID face’.

and you’ll spot that the SDK thinks that I have my left and right brows raised and so on.
The code here is pretty rough and ready, but I’m impressed that I can get quite a lot of data from a pretty small amount of code and, naturally, it’d be possible to tidy it up.
In the meantime, the code’s here for download.