Mike Taulty's Blog
Bits and Bytes from Microsoft UK
Deconstructing LINQ to SQL (Part 1)



I had the dubious pleasure :-) of driving Daniel back from the MSDN Roadshow in Harrogate to his hotel in Manchester. In the car we were chatting about LINQ, and one of the things he said was that he "wanted to know how LINQ to SQL works". What he meant was that he wanted to know which piece of code actually generates the SQL that goes to the database, and whether it uses SqlCommand and so on or does something else.

(As an aside, I hope Daniel found his hotel as I last saw him in the middle of Manchester and I've no real evidence yet that he survived :-)).

I've written quite a bit about Deconstructing LINQ in the past but I've not spent much time on IQueryable<T> before so I want to start off with that.

I've read this post quite a few times in the past and I think that it has all of the detail in it but I must admit that I don't find it very easy to understand without an awful lot of extra thinking.

To me, the primary difference between IQueryable and IEnumerable with respect to LINQ is that I view IQueryable as offering the potential for "capturing the whole query and executing it in one go" whereas I view IEnumerable as "executing a set of functions in sequence on lists in order to produce more lists". The latter is great for querying objects in memory but it would not work for (e.g.) a query that needs to go to SQL Server, whereas the former would. That is, you need to have the whole query captured and translated into T-SQL rather than having each piece of it executed separately against the SQL Server (which would result, for example, in entire tables being sent from the server back to the "client").
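
To make the "capture" idea concrete, here is a minimal sketch using only the public expression tree types. The class and member names (DelegateVersusTree, AsDelegate, AsTree, Describe) are made up for illustration. The same lambda is assigned once to a delegate, which can only be run, and once to an Expression<Func<int, bool>>, which can be picked apart as data.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical names, for illustration only.
public static class DelegateVersusTree
{
    // Compiled code: all we can do with this is invoke it.
    public static readonly Func<int, bool> AsDelegate = i => i > 10;

    // Data: the compiler builds a tree describing the same lambda.
    public static readonly Expression<Func<int, bool>> AsTree = i => i > 10;

    // Reads the operator and the constant back out of the tree.
    public static string Describe()
    {
        var body = (BinaryExpression)AsTree.Body;
        var right = (ConstantExpression)body.Right;
        return body.NodeType + " against constant " + right.Value;
    }
}
```

Calling DelegateVersusTree.Describe() returns "GreaterThan against constant 10". It is this ability to read the operator and operand back out that a provider needs if it is going to translate a query into something else.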

If we have a query like this;

   int[] numbers = { 10, 20, 30, 40, 50 };
   IQueryable<int> q = numbers.AsQueryable();

   IQueryable t = q.Where(i => i > 10).Where(i => i < 50);


then this offers us the potential to behave differently from querying the numbers array directly, because q here is IQueryable<int> whereas numbers is only IEnumerable<int>.

Note that I say "potential" here because in this particular case (i.e. for integers in memory) I don't think there's any difference between IQueryable and IEnumerable other than the way in which we end up calling the implementation which ultimately comes from a class called Enumerable. If we queried numbers directly above then we'd just call Enumerable.Where straight-out whereas because we use IQueryable we go via a very different route to end up at the same place.
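
As a quick check of that claim, the sketch below runs the same two filters down both routes. The TwoRoutes class and its method names are hypothetical. The first method binds to Enumerable.Where because the source is an array, and the second binds to Queryable.Where because of the AsQueryable call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, for illustration only.
public static class TwoRoutes
{
    public static readonly int[] Numbers = { 10, 20, 30, 40, 50 };

    // The lambdas here are compiled to delegates and passed to Enumerable.Where.
    public static IEnumerable<int> ViaEnumerable()
    {
        return Numbers.Where(i => i > 10).Where(i => i < 50);
    }

    // The lambdas here are captured as expression trees and passed to Queryable.Where.
    public static IQueryable<int> ViaQueryable()
    {
        return Numbers.AsQueryable().Where(i => i > 10).Where(i => i < 50);
    }
}
```

Enumerating either result yields 20, 30 and 40. For in-memory data only the route taken differs.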

As far as I can work out, what happens here is as below;

  1. We have an extension method Queryable.AsQueryable which we use directly above.
  2. This method, Queryable.AsQueryable, returns a new instance of the class SequenceQuery<T> which is itself IQueryable. In this case, the instance of SequenceQuery<T> will have an Expression property of type ConstantExpression and that will point at the int[] that we have above.
  3. So, we have a SequenceQuery<int> where its Expression property is set to a ConstantExpression and the value of that expression is our integer array.
  4. The next thing that happens is that we call Where(i => i > 10). This is calling into SequenceQuery<T>.Where which does a most odd thing;
    1. Creates a MethodCallExpression passing the existing expression (int[]) and the predicate Lambda (i => i > 10).
    2. The MethodCallExpression needs a method to call :-) This code wires this method to point at itself !!!! That is, it wires the MethodCallExpression to point to SequenceQuery<T>.Where. Initially, this seems like an odd thing to do but we'll come back to it.
    3. Calls SequenceQuery<T>.CreateQuery passing that MethodCallExpression.
  5. So, now we have another SequenceQuery<int> which has its Expression property set to a MethodCallExpression with 2 arguments. The first is the first ConstantExpression pointing to our int[] and the second is a Lambda (i => i > 10).
  6. Now, we call Where(i => i < 50) and a similar thing happens in that we end up with another SequenceQuery<int> which has its Expression property set to a MethodCallExpression with 2 arguments. The first is the MethodCallExpression we got out of step 5 and the second is a Lambda (i => i < 50).


So, we have captured the intent of the original code. We have called Where a couple of times on SequenceQuery<T>, created 3 SequenceQuery<T> instances along the way, and the last instance has an Expression property which is set to a MethodCallExpression as in;

MethodCallExpression (
  Method=SequenceQuery<T>.Where, Arguments={ MethodCallExpression, i => i < 50 })

where the first argument looks like;

MethodCallExpression (
  Method=SequenceQuery<T>.Where, Arguments={ int[], i => i > 10 })
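
The nesting can be inspected directly through the public Expression property. The sketch below is written against the released framework rather than the bits I was looking at in Reflector, so expect differences from the description above: as far as I know, in released versions the in-memory class is called EnumerableQuery<T> and the Where method recorded in the tree is the one declared on Queryable. TreeInspector and its members are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical names, for illustration only.
public static class TreeInspector
{
    public static IQueryable<int> BuildQuery()
    {
        int[] numbers = { 10, 20, 30, 40, 50 };
        return numbers.AsQueryable().Where(i => i > 10).Where(i => i < 50);
    }

    // Follows the first argument of each method call down to the root,
    // recording the declaring type and method name at each level.
    public static string Describe(Expression e)
    {
        var call = e as MethodCallExpression;
        if (call == null)
        {
            return e.NodeType.ToString();
        }
        return call.Method.DeclaringType.Name + "." + call.Method.Name
             + " <- " + Describe(call.Arguments[0]);
    }
}
```

TreeInspector.Describe(TreeInspector.BuildQuery().Expression) gives "Queryable.Where <- Queryable.Where <- Constant": two nested method calls sitting on top of a constant, which is the same shape as the structure shown above.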

What's really quite weird/radical/clever is that those MethodCallExpressions look like they will go and execute the very same methods on SequenceQuery<T> that created them in the first place (i.e. Where() in this case).

However, this isn't what happens. These MethodCallExpressions are really "stubs" in the sense that they'll get replaced at a later point.

So...having captured the definition of the query this way, we then get to enumerating it which is where "further magic" happens :-)


foreach (int i in t)


Here we go off and call GetEnumerator on the SequenceQuery<T> that we've got. What does this do?


private IEnumerator<T> GetEnumerator()
{
    if (this.sequence == null)
    {
        Expression body = new SequenceRewriter().Visit(this.expression);
        ExpressionCompiler compiler = new ExpressionCompiler();
        Expression<Func<IEnumerable<T>>> lambda = Expression.Lambda<Func<IEnumerable<T>>>(body, (IEnumerable<ParameterExpression>) null);
        this.sequence = compiler.Compile<Func<IEnumerable<T>>>(lambda)();
    }
    return this.sequence.GetEnumerator();
}
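
The shape of that method (wrap an expression body in a zero-parameter lambda, compile it, run it once and cache the result) can be reproduced with public APIs. This is only a sketch: LazyCompiledSequence<T> and CompileCount are invented names, the rewriting step is left out, and the public Compile method on the lambda stands in for the internal ExpressionCompiler.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical class, for illustration only.
public sealed class LazyCompiledSequence<T> : IEnumerable<T>
{
    private readonly Expression body;
    private IEnumerable<T> sequence;

    // Counts compilations so the caching is observable.
    public int CompileCount { get; private set; }

    // body must be an expression whose type is assignable to IEnumerable<T>.
    public LazyCompiledSequence(Expression body)
    {
        this.body = body;
    }

    public IEnumerator<T> GetEnumerator()
    {
        if (this.sequence == null)
        {
            // Nothing is compiled or run until the first enumeration.
            var lambda = Expression.Lambda<Func<IEnumerable<T>>>(this.body);
            this.sequence = lambda.Compile()();
            this.CompileCount++;
        }
        return this.sequence.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
```

One way to try it is to take the Body of an Expression<Func<IEnumerable<int>>> such as () => numbers.Where(i => i > 10) and pass it to the constructor. Enumerating the object twice should leave CompileCount at 1.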


The interesting thing here is the SequenceRewriter. What does this do?

The SequenceRewriter goes and visits the expression tree and it does a lot of work. One of the things that it seems to do is to find any MethodCallExpressions and "move" them to point at methods different from the "stubs" that they originally point to.

This gets done in a method called VisitMethodCall which, essentially, looks at the type that is actually being queried and searches it for methods that match the signature of (in this case) our Where method. However, if the original type was Queryable as it was for us then it short-circuits the whole process;


    if (m.Method.DeclaringType == typeof(Queryable))
    {
        MethodInfo mi = FindSequenceMethod(m.Method.Name, source, typeArgs);
        source = this.FixupQuotedArgs(mi, source);
        return Expression.Call(instance, mi, source);
    }

so if we started off this process with a Queryable then the method calls will get re-routed back to methods on the Enumerable class.
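
To show the re-routing idea in isolation, here is a deliberately tiny rewriter built on the public ExpressionVisitor class. It is not the SequenceRewriter code, and it assumes a lot: it only handles the single-parameter overload of Where, and WhereRewriter, RewriteDemo and StripQuotes are names I've made up. Each call to Queryable.Where is swapped for a call to Enumerable.Where with the quote around the lambda removed, and the result is compiled and run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sketch: re-routes Queryable.Where calls to Enumerable.Where.
public sealed class WhereRewriter : ExpressionVisitor
{
    // Enumerable.Where<T>(IEnumerable<T>, Func<T, bool>): the overload whose
    // delegate parameter has two generic arguments.
    private static readonly MethodInfo EnumerableWhere = typeof(Enumerable)
        .GetMethods(BindingFlags.Public | BindingFlags.Static)
        .Single(mi => mi.Name == "Where"
            && mi.GetParameters()[1].ParameterType.GetGenericArguments().Length == 2);

    protected override Expression VisitMethodCall(MethodCallExpression m)
    {
        if (m.Method.DeclaringType == typeof(Queryable) && m.Method.Name == "Where")
        {
            var predicate = StripQuotes(m.Arguments[1]) as LambdaExpression;
            if (predicate != null && predicate.Parameters.Count == 1)
            {
                // Rewrite the source first so nested calls are re-routed too.
                Expression source = Visit(m.Arguments[0]);
                Type elementType = m.Method.GetGenericArguments()[0];
                MethodInfo target = EnumerableWhere.MakeGenericMethod(elementType);
                return Expression.Call(target, source, predicate);
            }
        }
        return base.VisitMethodCall(m);
    }

    // Queryable methods wrap lambda arguments in a Quote node; unwrap it.
    private static Expression StripQuotes(Expression e)
    {
        while (e.NodeType == ExpressionType.Quote)
        {
            e = ((UnaryExpression)e).Operand;
        }
        return e;
    }
}

public static class RewriteDemo
{
    public static Expression Rewrite(IQueryable<int> query)
    {
        return new WhereRewriter().Visit(query.Expression);
    }

    // Rewrites the tree, wraps it in a zero-parameter lambda, compiles and runs it.
    public static IEnumerable<int> Execute(IQueryable<int> query)
    {
        var lambda = Expression.Lambda<Func<IEnumerable<int>>>(Rewrite(query));
        return lambda.Compile()();
    }
}
```

Passing numbers.AsQueryable().Where(i => i > 10).Where(i => i < 50) to RewriteDemo.Execute yields 20, 30 and 40, with every method call in the rewritten tree now declared on Enumerable rather than Queryable.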

So...if I have something like;


   int[] numbers = { 10, 20, 30, 40, 50 };

   var query = 
     numbers.AsQueryable().Where(
      i => i > 10).Where(
      i => i < 50);

   IEnumerator<int> ie = query.GetEnumerator();

then the variable ie is of type System.Linq.Enumerable.WhereIterator<int>.

With all that said, I'll write some more in a second part to this post that deals more in terms of SQL than in terms of objects in memory and, hopefully, shows why this redirection to the type that is actually being queried is really powerful.

(All mistakes mine - got this by ferreting around with Reflector so it might not be quite right).

Posted Fri, Mar 16 2007 2:56 AM by mtaulty

