Yeah, I know – another cheesy title for a blog post
I downloaded the CTP of asynchronous support in C# along with the corresponding specification as described by Anders in his recent PDC 2010 session.
This has already been explained elsewhere, so this post is mainly a write-up of my own notes from tinkering with it. I find that I need to work through these sorts of language features for myself in order to understand them.
The thing that initially threw me is that the various bits (including the assemblies to reference) are present in;
- My Documents\Microsoft Visual Studio Async CTP
which was probably flashed up on the screen somewhere during the installation but I must have missed it and so I had to go and hunt for them later on.
I read through the spec first and the things that stood out to me were;
- a common pattern for making asynchronous work all look the same based around a notion of wrapping it up behind some method calls;
  - waiter = t.GetAwaiter()
  - bool needsCompletion = waiter.BeginAwait( someAction )
  - waiter.EndAwait();
- async methods return void, Task or Task<T>
- new keywords are async and await
and, of all of those, the one that really stuck out for me was the first one – this notion of a common pattern for expressing asynchrony.
I then went on to read Stephen Toub’s whitepaper on the Task-based Asynchronous Pattern (in the download) which is a really good read. I need to print it out and read it more thoroughly as I can only read about 40% of a long document on a computer screen.
Using the full power of hindsight I guess it’s fair to say that .NET needed a common pattern for asynchronous work from the start.
It did have one in the Begin/End and IAsyncResult model that later became known as the Asynchronous Programming Model (or APM) but most developers (including me) find that a bit tedious to program against and so various other patterns have crept in over time;
- An event based pattern usually involving some DoXyzAsync() method call which fires some XyzCompleted event. This seems to have become known as the Event-based Asynchronous Pattern (or EAP).
- A futures style pattern that came along with the Task Parallel Library that looks to be known as the Task-based Asynchronous Pattern (or TAP).
and so we’re left with 3 or more ways to handle similar scenarios – APM, EAP and TAP.
In experimenting around this, I thought I’d start simple. Really simple. Let’s say that I have this method;
    static int Divide(int nominator, int denominator)
    {
        return (nominator / denominator);
    }
and to fake this operation doing some real work, let’s just make it take a long time;
    static int Divide(int nominator, int denominator)
    {
        Thread.Sleep(10000);
        return (nominator / denominator);
    }
Now it takes a long time and so would (e.g.) block my UI thread if I was calling it from one and consequently I might want to make the invocation of this method “asynchronous” which for this type of operation means moving it to another thread so that it becomes asynchronous with respect to the calling thread even though it’s really synchronous with respect to the lucky thread that actually gets to sleep for 10 seconds.
I could use something like the ThreadPool for this;
    static int Divide(int nominator, int denominator)
    {
        ThreadPool.QueueUserWorkItem(cb =>
        {
            Thread.Sleep(10000);

            // TODO: erm?
            return (nominator / denominator);
        });

        // TODO: erm?
        return (0);
    }
but the ThreadPool leaves me with an immediate problem.
I’m stuck for how to transport the value back from the ThreadPool thread to the calling thread and I’m also stuck with what to return to the caller on the UI thread in the time between when the ThreadPool work begins and when it completes.
There’s also the small matter of exceptions on the ThreadPool thread to deal with and a stretch goal might involve cancellation.
Existing Models
The Asynchronous Programming Model (APM)
I could attempt to fix this via the IAsyncResult means. There’s a high level of tax to pay but I might end up with something like;
    static void Main(string[] args)
    {
        IAsyncResult result = BeginDivide(10, 2, iar =>
        {
            try
            {
                int value = EndDivide(iar);
                Console.WriteLine("Value is {0}", value);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Failed {0}", ex.Message);
            }
            finally
            {
                if (iar is IDisposable)
                {
                    ((IDisposable)iar).Dispose();
                }
            }
        }, null);

        Console.ReadLine();
    }

    static int EndDivide(IAsyncResult result)
    {
        MyAsyncResult internalResult = (MyAsyncResult)result;

        if (internalResult.Exception != null)
        {
            throw internalResult.Exception;
        }
        return (internalResult.Result);
    }

    static IAsyncResult BeginDivide(int nominator, int denominator,
        AsyncCallback callback, object state)
    {
        MyAsyncResult result = new MyAsyncResult(callback, state);

        ThreadPool.QueueUserWorkItem(cb =>
        {
            Thread.Sleep(10000);

            try
            {
                result.Result = nominator / denominator;
            }
            catch (Exception ex)
            {
                result.Exception = ex;
            }
            result.Complete();
        });
        return (result);
    }
and there’s a little too much of “catch Exception” going on in there and it also relies on me implementing IAsyncResult as well;
    class MyAsyncResult : IAsyncResult, IDisposable
    {
        public MyAsyncResult(AsyncCallback callback, object state)
        {
            this._callback = callback;
            this._asyncState = state;
            this._waitHandle = new Lazy<ManualResetEventSlim>(true);
        }
        ~MyAsyncResult()
        {
            Dispose(false);
        }
        internal int Result { get; set; }
        internal Exception Exception { get; set; }

        public object AsyncState
        {
            get { return (_asyncState); }
        }
        public WaitHandle AsyncWaitHandle
        {
            get { return (this._waitHandle.Value.WaitHandle); }
        }
        public bool CompletedSynchronously
        {
            get { return (false); }
        }
        public bool IsCompleted
        {
            get { return (this._isCompleted); }
            internal set { this._isCompleted = value; }
        }
        internal void Complete()
        {
            this._isCompleted = true;

            if (this._waitHandle.IsValueCreated)
            {
                this._waitHandle.Value.Set();
            }
            if (this._callback != null)
            {
                this._callback(this);
            }
        }
        void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (this._waitHandle.IsValueCreated)
                {
                    this._waitHandle.Value.Dispose();
                }
            }
        }
        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }
        object _asyncState;
        AsyncCallback _callback;
        volatile Exception _exception;
        volatile bool _isCompleted;
        Lazy<ManualResetEventSlim> _waitHandle;
    }
which I probably got wrong.
So, it’s not cheap to go via this route. I end up with the Begin method, the End method and the IAsyncResult that maintains the context between them.
The main problem though is that if I want to compose my Begin/End divide function into some other functions or classes then I have to keep paying this IAsyncResult tax all over the place and the signal/noise ratio gets way out of whack.
The Event Based Asynchronous Pattern (EAP)
Maybe I should have gone with the event style model? Something a bit more like;
    static event EventHandler<DivideCompletedEventArgs> DivideCompleted;

    static void DivideAsync(int nominator, int denominator, object state = null)
    {
        ThreadPool.QueueUserWorkItem(cb =>
        {
            Thread.Sleep(10000);

            int? result = null;
            Exception exception = null;

            try
            {
                result = nominator / denominator;
            }
            catch (Exception ex)
            {
                exception = ex;
            }

            DivideCompletedEventArgs args =
                new DivideCompletedEventArgs(exception, false, state)
                {
                    Result = result ?? 0
                };

            var handler = DivideCompleted;

            if (handler != null)
            {
                handler(null, args);
            }
        });
    }

    static void Main(string[] args)
    {
        DivideCompleted += (s, e) =>
        {
            if (e.Error != null)
            {
                Console.WriteLine("Exception {0}", e.Error.Message);
            }
            else
            {
                Console.WriteLine("Result {0}", e.Result);
            }
        };

        DivideAsync(10, 2);

        Console.ReadLine();
    }
with a little class derived from AsyncCompletedEventArgs;
    class DivideCompletedEventArgs : AsyncCompletedEventArgs
    {
        public DivideCompletedEventArgs(Exception error, bool cancelled, object state)
            : base(error, cancelled, state)
        {
        }
        public int Result { get; internal set; }
    }
That seemed a whole lot easier, although I bet I messed it up again.
Usually, these methods and events would be instance methods/events rather than static methods/events as I have here.
Once again, if I now want to put this single, simple asynchronous operation into some other code then (assuming that calling code isn’t going to block) I’ll have to define yet more event argument types and events and so on in the classes that make use of this code.
The tax isn’t as bad as the APM version but it’s still not as good as it could be.
The Task-based Asynchronous Pattern (TAP)
There’s still quite a lot of ceremony involved in the whole thing so maybe I could base my support around Tasks from the TPL;
    static Task<int> TaskDivideAsync(int nominator, int denominator)
    {
        Task<int> task = new Task<int>(() =>
        {
            Thread.Sleep(10000);
            return (nominator / denominator);
        });
        task.Start();
        return (task);
    }

    static void Main(string[] args)
    {
        Task<int> asyncCall = TaskDivideAsync(10, 0);

        asyncCall.ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                Console.WriteLine("Errored {0}",
                    t.Exception.Flatten().InnerExceptions[0].Message);
            }
            else
            {
                Console.WriteLine("Result {0}", t.Result);
            }
        });

        Task<int> syncCall = TaskDivideAsync(10, 2);
        Console.WriteLine("Sync call {0}", syncCall.Result);

        Console.ReadLine();
    }
and that all feels a lot better to me, so as long as my caller was prepared to accept a Task<T> I’d be happy because there’s just a method that I have to write and no events or anything like that.
It means that I can write another method like this TaskAddAsync;
    static Task<int> TaskAddAsync(int x, int y)
    {
        Task<int> task = new Task<int>(() =>
        {
            Thread.Sleep(10000);
            return (x + y);
        });
        task.Start();
        return (task);
    }
then combining that with the TaskDivideAsync isn’t so much of a Herculean task in that I can write the new and amazing TaskAddDivideAsync;
    static Task<int> TaskAddDivideAsync(int x, int y, int denominator)
    {
        return (TaskAddAsync(x, y).ContinueWith(
            ti => TaskDivideAsync(ti.Result, denominator).Result));
    }
and Task<T> is doing all the heavy lifting for me and signal is now very definitely louder than noise.
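One thing worth noticing about TaskAddDivideAsync is that its continuation calls .Result on the inner divide task, so a pool thread sits blocked until the division finishes. A minimal sketch of an alternative, using only the .NET 4 TPL, is to return the inner task from the continuation and flatten the resulting Task<Task<int>> with Unwrap(). The 100ms delays and the UnwrapSketch class name are assumptions made purely to keep the example short.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class UnwrapSketch
{
    // Same shape as TaskAddAsync / TaskDivideAsync in the post, with a
    // shorter (assumed) delay so the sketch finishes quickly.
    static Task<int> TaskAddAsync(int x, int y)
    {
        return Task.Factory.StartNew(() => { Thread.Sleep(100); return x + y; });
    }

    static Task<int> TaskDivideAsync(int nominator, int denominator)
    {
        return Task.Factory.StartNew(() => { Thread.Sleep(100); return nominator / denominator; });
    }

    // The continuation hands back the inner Task<int> rather than blocking on
    // its Result, so ContinueWith produces a Task<Task<int>>. Unwrap() turns
    // that into a Task<int> that completes when the inner task completes.
    public static Task<int> TaskAddDivideAsync(int x, int y, int denominator)
    {
        return TaskAddAsync(x, y)
            .ContinueWith(ti => TaskDivideAsync(ti.Result, denominator))
            .Unwrap();
    }

    static void Main()
    {
        // (6 + 4) / 2
        Console.WriteLine("Result {0}", TaskAddDivideAsync(6, 4, 2).Result);
    }
}
```

Reading ti.Result inside the continuation does not block, because the continuation only runs once the add task has finished. If the add task faulted, that read throws and the failure flows through to the unwrapped task.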
It’s worth saying that I’m not dealing with any notions of cancellation here although Task<T> is more than capable of it and I’m also not dealing with reporting progress although, again, Task<T> can do it.
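For what it's worth, here is a rough sketch of what cancellation might look like for the divide example, using the CancellationToken support that ships with the .NET 4 TPL. The extra token parameter, the slicing of the sleep into 100ms chunks and the CancellationSketch class name are all assumptions for illustration rather than anything from the original experiment.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationSketch
{
    // Hypothetical cancellable variant of TaskDivideAsync. The long sleep is
    // split into short slices so the token can be checked between them.
    public static Task<int> TaskDivideAsync(
        int nominator, int denominator, CancellationToken token)
    {
        return Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 100; i++)
            {
                token.ThrowIfCancellationRequested();
                Thread.Sleep(100);
            }
            return nominator / denominator;
        }, token);
    }

    static void Main()
    {
        CancellationTokenSource source = new CancellationTokenSource();
        Task<int> task = TaskDivideAsync(10, 2, source.Token);

        source.Cancel();

        try
        {
            Console.WriteLine("Result {0}", task.Result);
        }
        catch (AggregateException ex)
        {
            // Cancellation surfaces as a TaskCanceledException wrapped in
            // the AggregateException thrown by Result.
            Console.WriteLine("Cancelled {0}", ex.InnerException is TaskCanceledException);
        }
        Console.WriteLine("IsCanceled {0}", task.IsCanceled);
    }
}
```

Passing the same token to StartNew matters here: it lets the TPL treat the OperationCanceledException thrown inside the delegate as cancellation, so the task ends up Canceled rather than Faulted.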
It seems fair to say then that the Task-based approach is the clear winner and (as far as I can tell) it’s Task that becomes the model for the new asynchronous language support in C# 5.0.
The Language is not Wired to Task
As has happened before in the C# language, the new keywords introduced into the language are not being wired directly to the Task/Task<T> classes.
From the spec;
The expression t of an await-expression await t is called the task of the await expression. The task t is required to be awaitable, which means that all of the following must hold:
· (t).GetAwaiter() is a valid expression of type A.
· Given an expression a of type A and an expression r of type System.Action, (a).BeginAwait(r) is a valid boolean-expression.
· Given an expression a of type A, (a).EndAwait() is a valid expression.
and so the language is tied to anything that is awaitable and it looks like there’s a class TaskAwaiter in System.Runtime.CompilerServices that makes a Task into an awaitable and hence Task can be wired in by a level of indirection rather than tying the language directly to it.
The return value of BeginAwait is meant to be false if the awaitable has already completed and true if it has not.
John has a great post on this over here.
Building a Model around Task or Task<T>
It’s not too hard to see how the compiler can do some lifting work around Task or Task<T>. Let’s say that I add a reference to the new AsyncCtpLibrary.dll and rewrite my existing method;
    async static Task<int> TaskDivideAsync(int nominator, int denominator)
    {
        await TaskEx.Run(() => Thread.Sleep(10000));
        return (nominator / denominator);
    }
and the compiler sees that and writes quite a lot of extra code for me. Being a compiler, the code that it generates isn’t particularly readable as it goes off and makes a state machine, which wouldn’t have been my first thought as a human.
From Reflector, here’s my method;
and here’s the Lambda I wrote;
Clearly, the replacement body of my method is hiding away the relevant state in an instance of a generated class. That class is a state machine and its MoveNext method is;
The first time we hit this, the <>1_state value will be 0 and so we will;
- call TaskEx.Run( myLambda ).GetAwaiter() to get an awaiter for the Task
- set the <>1_state value to 1 before calling waiter.BeginAwait with this function itself as the callback routine.
- if this returns true then we have waiting to do and will get called back, so we return to the caller right now and wait for the callback to occur.
- we either get to the last block of code before the catch handler because
  - we fell straight through the first time around because BeginAwait returned false (the task had already completed)
  - we have been called back and the <>1_state value is set to 1 which causes us to skip the if clause
- either way, we call the EndAwait function because we’ve finished and we then set the result to our complex calculation
all the while, exceptions are being caught here.
It seems like the value of 0 is used for “need to execute”, 1 for “executing” and –1 for “broken/finished” or similar.
I suspect this is a relatively simple case of the compiler’s generation and introducing clauses like try/catch/finally would greatly expand what’s happening here.
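To make the walkthrough a little more concrete, here is a hand-written approximation of that kind of state machine. It is a sketch under assumptions and not the compiler's actual output: SketchAwaiter is a hypothetical stand-in for the CTP's TaskAwaiter that follows the BeginAwait/EndAwait shape from the spec, DivideStateMachine and its field names are invented, a TaskCompletionSource<int> plays the part of whatever the generated code uses to produce the returned task, and the delay is cut to one second.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an awaiter over a plain .NET 4 Task.
class SketchAwaiter
{
    readonly Task _task;
    public SketchAwaiter(Task task) { _task = task; }

    // Returns false if the task has already finished (nothing to wait for),
    // otherwise registers the continuation and returns true.
    public bool BeginAwait(Action continuation)
    {
        if (_task.IsCompleted) { return false; }
        _task.ContinueWith(t => continuation());
        return true;
    }

    // Rethrows any failure from the awaited task.
    public void EndAwait() { _task.Wait(); }
}

class DivideStateMachine
{
    int _state;  // 0 = need to execute, 1 = executing, -1 = finished
    SketchAwaiter _awaiter;
    readonly int _nominator;
    readonly int _denominator;
    readonly TaskCompletionSource<int> _completion = new TaskCompletionSource<int>();

    public DivideStateMachine(int nominator, int denominator)
    {
        _nominator = nominator;
        _denominator = denominator;
    }

    public Task<int> Completion { get { return _completion.Task; } }

    public void MoveNext()
    {
        try
        {
            if (_state == 0)
            {
                _awaiter = new SketchAwaiter(
                    Task.Factory.StartNew(() => Thread.Sleep(1000)));
                _state = 1;
                if (_awaiter.BeginAwait(MoveNext))
                {
                    return; // still running, MoveNext is called back later
                }
            }
            // Reached by falling through (already complete) or via the callback.
            _awaiter.EndAwait();
            _state = -1;
            _completion.SetResult(_nominator / _denominator);
        }
        catch (Exception ex)
        {
            _state = -1;
            _completion.SetException(ex);
        }
    }
}

static class StateMachineSketch
{
    // Roughly what the rewritten method body does: create the state machine,
    // kick it off and hand back the task that it will eventually complete.
    public static Task<int> TaskDivideAsync(int nominator, int denominator)
    {
        DivideStateMachine machine = new DivideStateMachine(nominator, denominator);
        machine.MoveNext();
        return machine.Completion;
    }

    static void Main()
    {
        Console.WriteLine("Result {0}", TaskDivideAsync(10, 2).Result);
        try
        {
            TaskDivideAsync(10, 0).Wait();
        }
        catch (AggregateException ex)
        {
            Console.WriteLine("Failed {0}", ex.InnerException.GetType().Name);
        }
    }
}
```

The point of the sketch is the shape rather than the detail: one method that is entered twice, a state field that decides where it resumes, and a single try/catch that routes any exception into the returned task.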
Not Everything Returns Task or Task<T>
As we noted earlier on in the post, there are many areas where the framework (and developers’ own code) uses either the APM or the EAP.
Initially, when I saw the async support I suspected that there would be some magic that would change an existing APM or EAP approach into something that could be supported via the async/await keywords and code generation techniques.
As far as I know, that’s not the case and, in a way, I’m relieved.
As Stephen points out in his whitepaper, it’s not too hard to go from an APM implementation to a Task-based one in that I could take my original BeginDivide and EndDivide and I could wrap them either in the class definition itself or using extension methods (if my original methods weren’t static). So I can take these;
    static int EndDivide(IAsyncResult result)
    {
        MyAsyncResult internalResult = (MyAsyncResult)result;

        if (internalResult.Exception != null)
        {
            throw internalResult.Exception;
        }
        return (internalResult.Result);
    }

    static IAsyncResult BeginDivide(int nominator, int denominator,
        AsyncCallback callback, object state)
    {
        MyAsyncResult result = new MyAsyncResult(callback, state);

        ThreadPool.QueueUserWorkItem(cb =>
        {
            Thread.Sleep(10000);

            try
            {
                result.Result = nominator / denominator;
            }
            catch (Exception ex)
            {
                result.Exception = ex;
            }
            result.Complete();
        });
        return (result);
    }
and augment them with this new Task<int> based version which just makes use of the existing method pair;
    static Task<int> BeginDivideTask(int nominator, int denominator)
    {
        return (Task<int>.Factory.FromAsync(
            BeginDivide(nominator, denominator, null, null), EndDivide));
    }
then I’m in business.
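FromAsync also has overloads that take the Begin and End methods themselves, plus their arguments, instead of an IAsyncResult that is already in flight. As a small self-contained sketch, here is that form applied to Stream.BeginRead/EndRead on a MemoryStream, which stands in for the BeginDivide/EndDivide pair so that the example doesn't need its own IAsyncResult implementation. ReadTaskAsync and the FromAsyncSketch class are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class FromAsyncSketch
{
    // Wraps an existing Begin/End pair in a Task<int>. FromAsync supplies the
    // AsyncCallback itself, and the final null is the APM state object.
    public static Task<int> ReadTaskAsync(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
    }

    static void Main()
    {
        byte[] data = Encoding.UTF8.GetBytes("hello");

        using (MemoryStream stream = new MemoryStream(data))
        {
            byte[] buffer = new byte[16];
            int read = ReadTaskAsync(stream, buffer).Result;
            Console.WriteLine("Read {0} bytes", read);
        }
    }
}
```

With this form the callback is wired up by FromAsync, so the resulting task should not need to wait on the IAsyncResult's wait handle in order to notice completion.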
If my existing class takes the EAP approach then the wrapping up with a Task-based approach is less automatic as it’s easy to see that the APM based approach is pretty uniform whereas the EAP approach has differently shaped events with differently shaped arguments and so on.
It’s still doable though via regular methods or extension methods.
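As a rough illustration of the EAP case, one common approach is to bridge the completion event to a task by hand with a TaskCompletionSource<T>. The sketch below assumes a hypothetical instance-based Divider class shaped like the DivideAsync/DivideCompleted example earlier in the post, and it assumes only one operation is in flight per Divider instance, since a shared event can't tell concurrent operations apart.

```csharp
using System;
using System.ComponentModel;
using System.Threading;
using System.Threading.Tasks;

class DivideCompletedEventArgs : AsyncCompletedEventArgs
{
    public DivideCompletedEventArgs(Exception error, bool cancelled, object state)
        : base(error, cancelled, state)
    {
    }
    public int Result { get; internal set; }
}

// Hypothetical EAP-style component: a method that starts the work and an
// event that fires when it is done.
class Divider
{
    public event EventHandler<DivideCompletedEventArgs> DivideCompleted;

    public void DivideAsync(int nominator, int denominator)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            int result = 0;
            Exception error = null;
            try { result = nominator / denominator; }
            catch (Exception ex) { error = ex; }

            var handler = DivideCompleted;
            if (handler != null)
            {
                handler(this, new DivideCompletedEventArgs(error, false, null) { Result = result });
            }
        });
    }
}

static class EapToTapSketch
{
    // Regular static method that adapts the event to a Task<int>.
    public static Task<int> DivideTaskAsync(Divider divider, int nominator, int denominator)
    {
        TaskCompletionSource<int> source = new TaskCompletionSource<int>();
        EventHandler<DivideCompletedEventArgs> handler = null;

        handler = (s, e) =>
        {
            divider.DivideCompleted -= handler; // one-shot subscription

            if (e.Error != null) { source.TrySetException(e.Error); }
            else if (e.Cancelled) { source.TrySetCanceled(); }
            else { source.TrySetResult(e.Result); }
        };
        divider.DivideCompleted += handler;
        divider.DivideAsync(nominator, denominator);
        return source.Task;
    }

    static void Main()
    {
        Console.WriteLine("Result {0}", DivideTaskAsync(new Divider(), 10, 2).Result);
        try
        {
            DivideTaskAsync(new Divider(), 10, 0).Wait();
        }
        catch (AggregateException ex)
        {
            Console.WriteLine("Failed {0}", ex.InnerException.GetType().Name);
        }
    }
}
```

Each differently shaped event needs its own adapter along these lines, which is why this direction is less automatic than the APM one.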
APM and EAP to TAP Extension Methods in the CTP
That sub-section heading has to win me some kind of acronym bingo?
In the CTP there’s a class called AsyncCtpExtensions with a whole slew of extension methods to types such as;
- Socket, TcpListener, TcpClient, WebClient, DataServiceQuery, SqlCommand, HttpListener, WebRequest, TextReader, UdpClient
and so what does that mean? It means that a bunch of this work in converting APM/EAP methods to TAP methods is already done for common classes that do async work.
The other day I wrote some code which attempted to read from the network and copy to a local file. This was Silverlight code but I ported it to a console application where it looked something like this;
Various supporting EventArgs classes;
    public class FileDownloadStartedEventArgs : EventArgs
    {
        public FileDownloadStartedEventArgs(string fileUrl, double kBTotalSize)
        {
            this.FileUrl = fileUrl;
            this.KBTotalSize = kBTotalSize;
        }
        public string FileUrl { get; private set; }
        public double KBTotalSize { get; private set; }
    }

    public class FileDownloadProgressedEventArgs : EventArgs
    {
        public FileDownloadProgressedEventArgs(double kbDownloaded)
        {
            this.KBDownloaded = kbDownloaded;
        }
        public double KBDownloaded { get; private set; }
    }

    public class FileDownloadCompletedEventArgs : AsyncCompletedEventArgs
    {
        public FileDownloadCompletedEventArgs(Exception error, bool cancelled)
            : base(error, cancelled, null)
        {
        }
    }
My SingleFileDownloader class;
    public class SingleFileDownloader
    {
        public event EventHandler<AsyncCompletedEventArgs> DownloadCompleted;
        public event EventHandler<FileDownloadStartedEventArgs> DownloadStarted;
        public event EventHandler<FileDownloadProgressedEventArgs> DownloadProgressed;

        public SingleFileDownloader(string source, string destination,
            int bufferSize = 1024 * 1024 * 4)
        {
            this._buffer = new byte[bufferSize];
            this._source = source;
            this._destination = destination;
        }
        public void StopDownload()
        {
            this._stopRequested = true;
        }
        public void DownloadAsync()
        {
            try
            {
                // TODO: This is ugly but an async read in Silverlight simply hangs
                // when it hits an error (documented) and so I have a timer which
                // tries to check for signs of life and cancels things if it notices
                // a hang.
                _localStream = File.OpenWrite(this._destination);

                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(_source);

                request.BeginGetResponse(iar =>
                {
                    try
                    {
                        _response = (HttpWebResponse)request.EndGetResponse(iar);
                        _bytesToRead = _response.ContentLength;
                        _remoteStream = _response.GetResponseStream();

                        if (this.DownloadStarted != null)
                        {
                            this.DownloadStarted(this,
                                new FileDownloadStartedEventArgs(
                                    this._source, _bytesToRead / 1024));
                        }
                        NextRead();
                    }
                    catch (Exception ex)
                    {
                        _exception = ex;
                        Done();
                    }
                }, null);
            }
            catch (Exception ex)
            {
                this._exception = ex;
                Done();
            }
        }
        void NextRead()
        {
            if (this._stopRequested)
            {
                Done();
            }
            else
            {
                try
                {
                    this._remoteStream.BeginRead(
                        this._buffer, 0, this._buffer.Length, iar =>
                        {
                            try
                            {
                                int bytes = this._remoteStream.EndRead(iar);
                                _bytesRead += bytes;

                                if (this.DownloadProgressed != null)
                                {
                                    this.DownloadProgressed(this,
                                        new FileDownloadProgressedEventArgs(
                                            this._bytesRead / 1024));
                                }
                                if (bytes == 0)
                                {
                                    Done();
                                }
                                else
                                {
                                    this._localStream.BeginWrite(this._buffer, 0, bytes,
                                        iarW =>
                                        {
                                            try
                                            {
                                                this._localStream.EndWrite(iarW);
                                                NextRead();
                                            }
                                            catch (Exception ex)
                                            {
                                                this._exception = ex;
                                                Done();
                                            }
                                        }, null);
                                }
                            }
                            catch (Exception ex)
                            {
                                this._exception = ex;
                                Done();
                            }
                        }, null);
                }
                catch (Exception ex)
                {
                    this._exception = ex;
                    Done();
                }
            }
        }
        void Done()
        {
            if (_localStream != null)
            {
                _localStream.Close();
            }
            if (_remoteStream != null)
            {
                _remoteStream.Close();
            }
            if (_response != null)
            {
                _response.Close();
            }
            if (this._stopRequested || (this._exception != null))
            {
                try
                {
                    File.Delete(this._destination);
                }
                catch
                {
                }
            }
            if (this.DownloadCompleted != null)
            {
                this.DownloadCompleted(this,
                    new AsyncCompletedEventArgs(this._exception, this._stopRequested, false));
            }
        }
        bool _stopRequested;
        HttpWebResponse _response;
        Stream _remoteStream;
        Stream _localStream;
        byte[] _buffer;
        long _bytesRead;
        long _bytesToRead;
        string _source;
        string _destination;
        Exception _exception;
    }
and the code that makes use of it as a simple test case;
    static void Main(string[] args)
    {
        // BTW - these slides I'm downloading here are old, I just picked *a* file.
        SingleFileDownloader downloader = new SingleFileDownloader(
            "https://mtaulty.com/downloads/WhatIsSilverlight.zip",
            @"c:\temp\whatissilverlight.zip");

        AutoResetEvent evt = new AutoResetEvent(false);

        downloader.DownloadStarted += (s, e) =>
        {
            Console.WriteLine("Started download of {0} size {1}KB",
                e.FileUrl, e.KBTotalSize);
        };
        downloader.DownloadProgressed += (s, e) =>
        {
            Console.WriteLine("Downloaded progressed to {0}", e.KBDownloaded);
        };
        downloader.DownloadCompleted += (s, e) =>
        {
            Console.WriteLine("Download done - exception [{0}]", e.Error);
            evt.Set();
        };
        downloader.DownloadAsync();

        evt.WaitOne();

        Console.WriteLine("Done");
        Console.ReadLine();
    }
I had a stab at moving that across to the new async model. This is my second attempt (the first one was worse, trust me);
    class ProgressReporter : IProgress<long>
    {
        public void Report(long value)
        {
            Console.WriteLine("{0}KB downloaded", value);
        }
    }

    public class SingleFileDownloader
    {
        public SingleFileDownloader(string source, string destination,
            int bufferSize = 1024 * 1024 * 4)
        {
            this._buffer = new byte[bufferSize];
            this._source = source;
            this._destination = destination;
        }
        async public Task DownloadAsync(
            CancellationToken token,
            IProgress<long> bytesCopiedProgress)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(_source);

            try
            {
                WebResponse response = await request.GetResponseAsync();

                using (Stream remoteStream = response.GetResponseStream())
                using (Stream localStream = File.OpenWrite(this._destination))
                {
                    int bytesRead = 0;
                    long totalBytesCopied = 0;

                    while ((bytesRead = await remoteStream.ReadAsync(
                        this._buffer, 0, this._buffer.Length)) > 0)
                    {
                        await localStream.WriteAsync(this._buffer, 0, bytesRead);

                        totalBytesCopied += bytesRead;

                        if (bytesCopiedProgress != null)
                        {
                            bytesCopiedProgress.Report(totalBytesCopied / 1024);
                        }
                        if (token.IsCancellationRequested)
                        {
                            localStream.Close();
                            DeleteDestination();
                            token.ThrowIfCancellationRequested();
                        }
                    }
                }
            }
            catch
            {
                DeleteDestination();
                throw;
            }
        }
        void DeleteDestination()
        {
            File.Delete(this._destination);
        }
        byte[] _buffer;
        string _source;
        string _destination;
    }

    class Program
    {
        static void Main(string[] args)
        {
            // BTW - these slides I'm downloading here are old, I just picked *a* file.
            SingleFileDownloader downloader = new SingleFileDownloader(
                "https://mtaulty.com/downloads/WhatIsSilverlight.zip",
                @"c:\temp\whatissilverlight.zip");

            CancellationTokenSource source = new CancellationTokenSource();

            Task t = downloader.DownloadAsync(source.Token, new ProgressReporter());

            Task w = new Task(() =>
            {
                Console.ReadLine();
                source.Cancel();
            });
            w.Start();

            Task.WaitAny(t, w);

            Console.WriteLine("Done");
        }
    }
I’m not at all sure that I got it even nearly right at this point but I got rid of a lot of code.
My first impression is that the code looks so much simpler and yet I find it harder to reason about this new structure of the code at this stage.
I wonder if it’s because the model is new to me or whether it’s because the gap between what I’m writing and what is actually executing is now getting wide enough that I lose track of what it is that I’m actually doing? I write a method and the compiler generates a lot of code and instantiates a lot of objects behind my back. Hmm.
As an example, my first attempt at that function closed the streams I was reading/writing to way too early and long before the reading/writing was complete but it looked perfectly natural as a piece of code – it felt right but running it in the debugger showed me straight away that the underlying state machine was doing something other than what I’d imagined.
Conclusion
I think my main conclusion for now is that this feels like it will be a major step forward but, also, that it’s going to need some digging around in order to be able to feel entirely comfortable with this Task-based way of working and the way that the compiler moves code around to support it.
I suspect that the main areas that need thought are around control flow and specifically thinking about how exceptions work, as I don’t have that clear coming out of my first attempt at using it. No doubt, it gets easier.
So, some serious reading and experimentation are required…one thing’s for sure – knowing Task and friends like the back of your hand will make life much easier when C# 5.0 comes along.