Note – these posts are put together after a short time with Silverlight 4 as a way of providing pointers to some of the new features that Silverlight 4 has to offer. I’m posting these from the PDC as Silverlight 4 is announced for the first time so please bear that in mind when working through these posts.
One of the things that was announced early on about Silverlight 4 was that it would have support for capture from webcams and microphones.
If you right-click on a Silverlight 4 control you’ll find a new tab;
where you can specify which video and audio sources on your machine are the default for Silverlight to use ( naturally, this will look different on OS X ).
There are then new classes in System.Windows.Media – I put them onto a diagram as a way of trying to understand how they fit together and it helped me a bit so I’ve reproduced that here;
If I want to get a picture of what devices I’ve got that support capture then I can go ahead and build a UI something like;
<UserControl x:Class="SilverlightApplication35.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:dg="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls.Data"
    mc:Ignorable="d"
    d:DesignHeight="300"
    d:DesignWidth="400">
  <Grid x:Name="LayoutRoot" Background="White">
    <Grid.RowDefinitions>
      <RowDefinition Height="Auto"/>
      <RowDefinition />
    </Grid.RowDefinitions>
    <ComboBox x:Name="comboDeviceType"
              ItemsSource="{Binding}"
              DisplayMemberPath="Name"
              Margin="5" />
    <dg:DataGrid Grid.Row="1"
                 ItemsSource="{Binding ElementName=comboDeviceType,Path=SelectedValue.CaptureDevices}"
                 AutoGenerateColumns="false">
      <dg:DataGrid.Columns>
        <dg:DataGridTextColumn Header="Friendly Name" Binding="{Binding FriendlyName}" />
        <dg:DataGridTemplateColumn>
          <dg:DataGridTemplateColumn.CellTemplate>
            <DataTemplate>
              <dg:DataGrid ItemsSource="{Binding SupportedFormats}" />
            </DataTemplate>
          </dg:DataGridTemplateColumn.CellTemplate>
        </dg:DataGridTemplateColumn>
        <dg:DataGridTextColumn Header="Desired Format" Binding="{Binding DesiredFormat}" />
        <dg:DataGridTextColumn Header="Audio Frame Size" Binding="{Binding AudioFrameSize}" />
        <dg:DataGridCheckBoxColumn Header="Default Device" Binding="{Binding IsDefaultDevice}" />
      </dg:DataGrid.Columns>
    </dg:DataGrid>
  </Grid>
</UserControl>
with a little code behind it;
using System;
using System.Collections;
using System.Windows.Controls;
using System.Windows.Media;

namespace SilverlightApplication35
{
    public partial class MainPage : UserControl
    {
        public MainPage()
        {
            InitializeComponent();

            this.Loaded += (s, e) =>
            {
                this.DataContext = new DeviceType[]
                {
                    new DeviceType()
                    {
                        Name = "Audio Devices",
                        CaptureDevices = (IEnumerable)CaptureDeviceConfiguration.GetAvailableAudioCaptureDevices()
                    },
                    new DeviceType()
                    {
                        Name = "Video Devices",
                        CaptureDevices = (IEnumerable)CaptureDeviceConfiguration.GetAvailableVideoCaptureDevices()
                    }
                };
            };
        }
    }

    public class DeviceType
    {
        public string Name { get; set; }
        public IEnumerable CaptureDevices { get; set; }
    }
}
Note that “Audio Frame Size” doesn’t apply to video sources, but I left the column in out of laziness. That gives me a pretty ugly UI that enumerates the audio/video devices on my machine for me;
So, it’s easy to figure out what audio/video devices you’ve got and what their capabilities are, and you can use the DesiredFormat to pick one of the SupportedFormats for capture ( if, unlike me, you’ve got enough of a clue about audio/video formats to make an informed choice 🙂 ).
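To make the “pick one of the SupportedFormats” idea concrete, here’s a small sketch of that selection logic. The `FormatInfo` type below is my own stand-in for the Silverlight format classes ( it just carries the same three properties ), and the “prefer highest sample rate, then bit depth” policy is simply one plausible choice, not anything the runtime mandates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the properties that a Silverlight audio format exposes.
class FormatInfo
{
    public int SamplesPerSecond { get; set; }
    public int BitsPerSample { get; set; }
    public int Channels { get; set; }
}

class Program
{
    // One plausible policy: prefer the highest sample rate, then bit depth.
    static FormatInfo PickFormat(IEnumerable<FormatInfo> supported)
    {
        return supported
            .OrderByDescending(f => f.SamplesPerSecond)
            .ThenByDescending(f => f.BitsPerSample)
            .First();
    }

    static void Main()
    {
        var formats = new List<FormatInfo>
        {
            new FormatInfo { SamplesPerSecond = 22050, BitsPerSample = 16, Channels = 1 },
            new FormatInfo { SamplesPerSecond = 44100, BitsPerSample = 16, Channels = 2 },
            new FormatInfo { SamplesPerSecond = 44100, BitsPerSample = 8,  Channels = 2 },
        };

        FormatInfo best = PickFormat(formats);
        Console.WriteLine("{0} Hz, {1}-bit", best.SamplesPerSecond, best.BitsPerSample);
    }
}
```

With the real API you’d assign the chosen format to the device’s DesiredFormat before starting capture.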
How to go about getting some input from one of them? First, I need to ask if it’s going to be ok;
if (!CaptureDeviceConfiguration.AllowedDeviceAccess)
{
    bool ok = CaptureDeviceConfiguration.RequestDeviceAccess();

    if (ok)
    {
        // Access granted - safe to start capturing.
    }
}
So… if we haven’t already been told that it’s ok to access these devices, then we ask whether it is ok to do so and the user gets a consent dialog, currently;
and the code gets the return value from the Yes/No option as a boolean.
Having been granted access, we need to do something with the device. One of the easiest things to do is to snap a photo. With a little UI like this;
<UserControl x:Class="SilverlightApplication35.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:dg="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls.Data"
    mc:Ignorable="d"
    d:DesignHeight="300"
    d:DesignWidth="400">
  <Grid x:Name="LayoutRoot" Background="White">
    <Grid.RowDefinitions>
      <RowDefinition />
      <RowDefinition Height="Auto"/>
    </Grid.RowDefinitions>
    <Image x:Name="snapImage" Margin="10" Stretch="Fill"/>
    <Button Grid.Row="1" Content="Snap" Margin="10" Click="OnSnap" />
  </Grid>
</UserControl>
and with a little code behind which asks for permission to use the devices and then sets up a new CaptureSource to use the default video device and then calls AsyncCaptureImage on it to grab an image;
public partial class MainPage : UserControl
{
    public MainPage()
    {
        InitializeComponent();
    }

    private void OnSnap(object sender, RoutedEventArgs e)
    {
        bool ok = CaptureDeviceConfiguration.AllowedDeviceAccess;

        if (!ok)
        {
            ok = CaptureDeviceConfiguration.RequestDeviceAccess();
        }

        if (ok)
        {
            CaptureSource cs = new CaptureSource()
            {
                VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice()
            };
            cs.Start();

            cs.AsyncCaptureImage((bitmap) =>
            {
                Dispatcher.BeginInvoke(() =>
                {
                    cs.Stop();
                    snapImage.Source = bitmap;
                });
            });
        }
    }
}
and I took a nice picture of my phone that way;
Or if I want to build a quick UI that displays what’s currently coming from the web-cam then that’s pretty easy. Changing the UI slightly;
<UserControl x:Class="SilverlightApplication35.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:dg="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls.Data"
    mc:Ignorable="d"
    d:DesignHeight="300"
    d:DesignWidth="400">
  <Grid x:Name="LayoutRoot" Background="White">
    <Grid.RowDefinitions>
      <RowDefinition />
      <RowDefinition Height="Auto" />
    </Grid.RowDefinitions>
    <Rectangle x:Name="rectVideo"
               Stroke="Gray"
               StrokeThickness="2"
               HorizontalAlignment="Stretch"
               VerticalAlignment="Stretch"
               RadiusX="5"
               RadiusY="5"
               Margin="10" />
    <StackPanel Orientation="Horizontal" Grid.Row="1">
      <Button Content="Start" Margin="10" Click="OnStart" />
      <Button Content="Stop" Margin="10" Click="OnStop" />
    </StackPanel>
  </Grid>
</UserControl>
and the code behind so that we use a CaptureSource again but this time we set its VideoCaptureDevice and then we feed the CaptureSource into a VideoBrush and paint a Rectangle with that Brush;
public partial class MainPage : UserControl
{
    public MainPage()
    {
        InitializeComponent();
    }

    private void OnStart(object sender, RoutedEventArgs e)
    {
        bool ok = CaptureDeviceConfiguration.AllowedDeviceAccess;

        if (!ok)
        {
            ok = CaptureDeviceConfiguration.RequestDeviceAccess();
        }

        if (ok)
        {
            if (source == null)
            {
                source = new CaptureSource()
                {
                    VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice()
                };
                VideoBrush brush = new VideoBrush();
                brush.SetSource(source);
                rectVideo.Fill = brush;
            }
            source.Start();
        }
    }

    void OnStop(object sender, RoutedEventArgs args)
    {
        source.Stop();
    }

    CaptureSource source;
}
giving me a live view from my webcam;
( I know, it looks like the previous screenshot but this one was video, honest 🙂 ).
But I think that if I want to get to the actual captured audio or video then I need to look to an AudioSink or VideoSink implementation to grab the sampled bits and do something with them ( e.g. stream them over a network ).
The way that this looks to work is that you derive from AudioSink and do something with the captured bytes as they come along. What I decided to try first was to push the bytes into a MemoryStream, as in;
/// <summary>
/// This class is going to eat a tonne of memory...
/// </summary>
public class MemoryStreamAudioSink : AudioSink
{
    public AudioFormat AudioFormat
    {
        get { return (audioFormat); }
    }

    public MemoryStream AudioData
    {
        get { return (stream); }
    }

    protected override void OnCaptureStarted()
    {
        stream = new MemoryStream();
    }

    protected override void OnCaptureStopped()
    {
    }

    protected override void OnFormatChange(AudioFormat audioFormat)
    {
        if (this.audioFormat == null)
        {
            this.audioFormat = audioFormat;
        }
        else
        {
            throw new InvalidOperationException();
        }
    }

    protected override void OnSamples(long sampleTime, long sampleDuration, byte[] sampleData)
    {
        stream.Write(sampleData, 0, sampleData.Length);
    }

    MemoryStream stream;
    AudioFormat audioFormat;
}
so my particular AudioSink just takes all the audio and shoves it into a MemoryStream. It then exposes ( bad idea ) that MemoryStream as a property for a caller to grab hold of. It also makes a note of the AudioFormat that comes into OnFormatChange, exposes that as a property as well, and doesn’t allow it to change ( as that’d involve more complexity for me ).
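To put a number on that “eat a tonne of memory” warning: raw PCM grows at samplesPerSecond × channels × bitsPerSample / 8 bytes per second. A quick back-of-envelope calculation ( the 44.1 kHz stereo 16-bit figures are just a typical example, not necessarily what your device will report );

```csharp
using System;

class Program
{
    // Raw PCM data rate: bytes/second = samplesPerSecond * channels * bitsPerSample / 8.
    static long BytesPerSecond(int samplesPerSecond, int channels, int bitsPerSample)
    {
        return (long)samplesPerSecond * channels * bitsPerSample / 8;
    }

    static void Main()
    {
        // A typical capture format: 44.1 kHz, stereo, 16-bit PCM.
        long perSecond = BytesPerSecond(44100, 2, 16);
        long perMinute = perSecond * 60;

        Console.WriteLine(perSecond);  // bytes per second (~172 KB/s)
        Console.WriteLine(perMinute);  // bytes per minute (~10 MB/min)
    }
}
```

So the MemoryStream grows by roughly 10 MB for every minute of stereo CD-quality audio captured, which is why you’d really want to stream the samples somewhere rather than buffer them.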
With that in place, I can keep my existing UI and hijack the Start/Stop buttons that I’ve already got on screen in order to make use of my new MemoryStreamAudioSink class;
public partial class MainPage : UserControl
{
    public MainPage()
    {
        InitializeComponent();
    }

    private void OnStart(object sender, RoutedEventArgs e)
    {
        bool ok = CaptureDeviceConfiguration.AllowedDeviceAccess;

        if (!ok)
        {
            ok = CaptureDeviceConfiguration.RequestDeviceAccess();
        }

        if (ok)
        {
            if (audioSink == null)
            {
                CaptureSource source = new CaptureSource()
                {
                    AudioCaptureDevice = CaptureDeviceConfiguration.GetDefaultAudioCaptureDevice()
                };
                audioSink = new MemoryStreamAudioSink();
                audioSink.CaptureSource = source;
            }
            audioSink.CaptureSource.Start();
        }
    }

    void OnStop(object sender, RoutedEventArgs args)
    {
        audioSink.CaptureSource.Stop();

        using (FileStream stream = File.OpenWrite(
            Environment.GetFolderPath(Environment.SpecialFolder.MyMusic) + "\\test.wav"))
        {
            byte[] wavFileHeader = WavFileHelper.GetWavFileHeader(
                audioSink.AudioData.Length, audioSink.AudioFormat);

            stream.Write(wavFileHeader, 0, wavFileHeader.Length);

            // Now write the rest of the data...
            byte[] buffer = new byte[4096];
            int read = 0;

            audioSink.AudioData.Seek(0, SeekOrigin.Begin);

            while ((read = audioSink.AudioData.Read(buffer, 0, buffer.Length)) > 0)
            {
                stream.Write(buffer, 0, read);
            }
            stream.Flush();
            stream.Close();
        }
    }

    MemoryStreamAudioSink audioSink;
}
so, now, when we click the Start button we go ahead and grab the default audio capture device, set that as the CaptureSource of my new audioSink, and then tell that CaptureSource to Start(), which looks to give me a call to OnFormatChange on my MemoryStreamAudioSink class and then subsequent calls to OnSamples as audio comes in.
In the Stop button handler we tell the MemoryStreamAudioSink.CaptureSource to Stop() and then I use the MemoryStream ( available in the AudioData property ) and the AudioFormat ( available in the AudioFormat property ) on my MemoryStreamAudioSink class in order to write the data out to a file in the MyMusic folder.
Note – in order to have access to that file, this application would have to run out-of-browser and elevated which has not been necessary up until now and is only for the file access.
In order to get the data out in a format that can be played, I tried to write it out as a WAV file and after a little reading around what those files look like I added the WavFileHelper class here to put together the right file header. That’s included below;
// Caveat - I know nothing about WAV files, I just read the header spec
// on the web
public static class WavFileHelper
{
    public static byte[] GetWavFileHeader(long audioLength, AudioFormat audioFormat)
    {
        // This code could use some constants...
        MemoryStream stream = new MemoryStream(44);

        // "RIFF"
        stream.Write(new byte[] { 0x52, 0x49, 0x46, 0x46 }, 0, 4);

        // Data length + 44 byte header length - 8 bytes occupied by first 2 fields
        stream.Write(BitConverter.GetBytes((UInt32)(audioLength + 44 - 8)), 0, 4);

        // "WAVE"
        stream.Write(new byte[] { 0x57, 0x41, 0x56, 0x45 }, 0, 4);

        // "fmt "
        stream.Write(new byte[] { 0x66, 0x6D, 0x74, 0x20 }, 0, 4);

        // Size of the "fmt " chunk that follows - always 16 for PCM
        stream.Write(BitConverter.GetBytes((UInt32)16), 0, 4);

        // 1 == Uncompressed PCM
        stream.Write(BitConverter.GetBytes((UInt16)1), 0, 2);

        // Channel count
        stream.Write(BitConverter.GetBytes((UInt16)audioFormat.Channels), 0, 2);

        // Sample rate
        stream.Write(BitConverter.GetBytes((UInt32)audioFormat.SamplesPerSecond), 0, 4);

        // Byte rate
        stream.Write(BitConverter.GetBytes((UInt32)
            ((audioFormat.SamplesPerSecond * audioFormat.Channels * audioFormat.BitsPerSample) / 8)), 0, 4);

        // Block alignment
        stream.Write(BitConverter.GetBytes((UInt16)
            ((audioFormat.Channels * audioFormat.BitsPerSample) / 8)), 0, 2);

        // Bits per sample
        stream.Write(BitConverter.GetBytes((UInt16)audioFormat.BitsPerSample), 0, 2);

        // "data"
        stream.Write(new byte[] { 0x64, 0x61, 0x74, 0x61 }, 0, 4);

        // Length of the rest of the file
        stream.Write(BitConverter.GetBytes((UInt32)audioLength), 0, 4);

        // ToArray() rather than GetBuffer() so we return exactly the bytes
        // written, not the stream's underlying buffer
        return (stream.ToArray());
    }
}
and so with that in place, I can get audio captured from the microphone and store it into a WAV file. Naturally, I could just as easily send it over the network to some server for some other kind of processing or relay.
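As a quick check on the header that WavFileHelper builds: a canonical PCM WAV header is exactly 44 bytes, made up of the thirteen fields the helper writes. The little program below sums the field sizes and then works out the total file size for an example capture ( the 10-second mono 16-bit 22.05 kHz figures are just made-up numbers for illustration );

```csharp
using System;

class Program
{
    static void Main()
    {
        // Field sizes, in the order WavFileHelper writes them.
        int[] fields =
        {
            4,  // "RIFF"
            4,  // RIFF chunk size = dataLength + 44 - 8
            4,  // "WAVE"
            4,  // "fmt "
            4,  // fmt chunk size (16 for PCM)
            2,  // audio format (1 = uncompressed PCM)
            2,  // channel count
            4,  // sample rate
            4,  // byte rate
            2,  // block alignment
            2,  // bits per sample
            4,  // "data"
            4   // data length
        };

        int headerLength = 0;
        foreach (int f in fields)
        {
            headerLength += f;
        }
        Console.WriteLine(headerLength); // the canonical 44-byte header

        // Example: a 10-second mono, 16-bit, 22.05 kHz capture.
        long dataLength = 10L * 22050 * 1 * 16 / 8;
        Console.WriteLine(dataLength);                // raw PCM bytes
        Console.WriteLine(dataLength + headerLength); // total .wav file size
    }
}
```

Which is why the RIFF chunk-size field is written as audioLength + 44 − 8: it counts everything after the first two fields of that 44-byte header.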
Video works in a similar way in that I can write a VideoSink and use that to capture the video as it comes in. A dummy VideoSink that puts everything into a MemoryStream ( even less practical for video ) might be;
public class MemoryStreamVideoSink : VideoSink
{
    public VideoFormat CapturedFormat { get; private set; }

    public MemoryStream CapturedVideo { get; private set; }

    protected override void OnCaptureStarted()
    {
        CapturedVideo = new MemoryStream();
    }

    protected override void OnCaptureStopped()
    {
    }

    protected override void OnFormatChange(VideoFormat videoFormat)
    {
        if (CapturedFormat != null)
        {
            throw new InvalidOperationException("Can't cope with change!");
        }
        CapturedFormat = videoFormat;
    }

    protected override void OnSample(long sampleTime, long frameDuration, byte[] sampleData)
    {
        CapturedVideo.Write(sampleData, 0, sampleData.Length);
    }
}
so it’s basically the same class derived from a different base class, and then I could re-purpose my existing UI code-behind once again to do something like;
public partial class MainPage : UserControl
{
    public MainPage()
    {
        InitializeComponent();
    }

    private void OnStart(object sender, RoutedEventArgs e)
    {
        bool ok = CaptureDeviceConfiguration.AllowedDeviceAccess;

        if (!ok)
        {
            ok = CaptureDeviceConfiguration.RequestDeviceAccess();
        }

        if (ok)
        {
            if (videoSink == null)
            {
                captureSource = new CaptureSource()
                {
                    VideoCaptureDevice = CaptureDeviceConfiguration.GetDefaultVideoCaptureDevice()
                };
                videoSink = new MemoryStreamVideoSink();
                videoSink.CaptureSource = captureSource;
            }
            videoSink.CaptureSource.Start();
        }
    }

    void OnStop(object sender, RoutedEventArgs args)
    {
        captureSource.Stop();

        // Do something with the captured bytes in videoSink.CapturedVideo...
    }

    CaptureSource captureSource;
    MemoryStreamVideoSink videoSink;
}
Now, that all seems to work fine for me – note that there’s one difference here from the audio code. On the call to captureSource.Stop() I found that I got “Capture Source Is Not Stopped” exceptions, and I wondered whether that was because I was reaching into the VideoSink whilst it was still running in order to grab the CaptureSource and stop it. So, I kept a separate reference to the CaptureSource and that seemed to resolve it – I’m not at all sure about that one at the time of writing.
Beyond that, it looks like the audio example except that I haven’t written anything which attempts to store this data into a playable file. On my machine the VideoFormat that I get by default was Format32bppArgb running at 640×480 and 30 frames per second, but I don’t know ( and haven’t looked at ) how I’d write that into a file that I can play with Media Player – I suspect that’d be more work, so I’m leaving it for now, as I guess that the primary use for this functionality will not be to capture to local files but, instead, to capture video and send it asynchronously over the network somewhere…
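For a sense of why buffering raw video in a MemoryStream is even less practical than audio, the arithmetic for the format I saw is straightforward: 32bpp ARGB means 4 bytes per pixel, so;

```csharp
using System;

class Program
{
    static void Main()
    {
        // Format32bppArgb at 640x480: 4 bytes per pixel.
        long bytesPerFrame = 640L * 480 * 4;

        // At 30 frames per second.
        long bytesPerSecond = bytesPerFrame * 30;

        Console.WriteLine(bytesPerFrame);  // bytes in a single raw frame (~1.2 MB)
        Console.WriteLine(bytesPerSecond); // bytes per second of capture (~35 MB/s)
    }
}
```

At roughly 35 MB of raw pixels per second, you’d want to compress and/or stream the frames rather than accumulate them in memory, which fits with the guess that the real use case is relaying captured video over the network.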