TechEd Europe – Follow Up 1. In LINQ, "Where" == "AndWhere"

Someone at TechEd yesterday asked me a great question about LINQ which is something that I’ve puzzled about before (and I was pretty sure that I’d written it up on this website but I can’t find it so hopefully I’m not duplicating myself).

Let’s say we have a query like this;

var entries = new[] {
        new { Name = "Albert", Age = 10 },
        new { Name = "Brian", Age = 20 },
        new { Name = "Charlie", Age = 30 }
      };

      var query = from e in entries
                  where e.Age < 20
                  select e;

      query = from q in query
              where q.Age > 20
              select q;

      foreach (var item in query)
      {
        Console.WriteLine(item);
      }

 

What does this print out? Nothing. Why? Because the Where clause in LINQ is really a filter.

Now, I’m using an array of anonymous types as the source of my query here which means we have IEnumerable<ANON_TYPE> and so this is probably what you would expect. The first Where clause acts as a filter on the array and that produces one element where (Age = 10). We then take that filtered list and we apply a second filter to it looking for (Age > 20) and we therefore end up with an empty list.

Maybe it’s more obvious if it’s restated as this;

   static void Main(string[] args)
    {
      var entries = new[] {
        new { Name = "Albert", Age = 10 },
        new { Name = "Brian", Age = 20 },
        new { Name = "Charlie", Age = 30 }
      };

      var query = 
        entries.Where(e => e.Age < 20).Where(e => e.Age > 20).Select(e => e);

      foreach (var item in query)
      {
        Console.WriteLine(item);
      }
    }

 

As it “looks” more obvious that here what happens is that we execute the first Where and then we execute the second Where.

Now, if we re-formulate this in the world of LINQ to SQL then (imho) this doesn’t necessarily have to be the case although it is the case. Imagine (using LINQ to SQL) I have;

 static void Main(string[] args)
    {
      using (NorthwindDataContext ctx = new NorthwindDataContext())
      {
        var query = from c in ctx.Customers
                    where c.Country == "Germany"
                    select c;

        query = from q in query
                where q.Country == "Spain"
                select q;

        foreach (var item in query)
        {
          Console.WriteLine(item.CustomerID);
        }
      }
    }

Which produces no results either.

Now, a customer asked me yesterday;

How come this (effectively) produces the following

SELECT * FROM CUSTOMERS WHERE COUNTRY=’Germany’ AND COUNTRY=’Spain’

rather than

SELECT * FROM CUSTOMERS WHERE COUNTRY=’Germany’ OR COUNTRY=’Spain’

and (to me) it’s because, whilst it would be possible to insert an OR clause in the case of an IQueryable implementation of LINQ such as LINQ to SQL it would not be possible to insert an OR clause in an IEnumerable implementation of LINQ and so you would lose consistency and things become messy.

Now, in order to preserve that consistency the query expression pattern doesn’t do this but the LINQ to SQL API does. So, one way of doing it with LINQ to SQL is by doing something like;

 static void Main(string[] args)
    {
      using (NorthwindDataContext ctx = new NorthwindDataContext())
      {
        var query = from c in ctx.Customers
                    where c.Country == "Germany"
                    select c;

        query = from q in
                  query.Union(
                    from c in ctx.Customers
                    where c.Country == "Spain"
                    select c)
                select q;

        foreach (var item in query)
        {
          Console.WriteLine(item.CustomerID);
        }
      }
    }

and that gets you to where you want to be, albeit in a longer way around 🙂