Grouping in LINQ is weird (IGrouping<K,T> is your friend)

Ok, it’s not weird really – I was just trying for an “attention grabbing headline” 🙂

However, it does seem to throw people when they play with it so I thought I’d break it down a little here.

Say I’ve got some data such as;

  class Fruit
  {
    public string Name { get; set; }
    public string Type { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
  }

  Fruit[] data = new Fruit[]
  {
    new Fruit { Name="Gala", Type="Apple", Price=0.75m, Quantity=10 },
    new Fruit { Name="Granny Smith", Type="Apple", Price=0.80m, Quantity=7 },
    new Fruit { Name="Tasty", Type="Strawberries", Price=1.90m, Quantity=20 }
  };

and I want to group up by Type. This is easy enough to do;

  var grouped = from fruit in data
                group fruit by fruit.Type;

but what the heck is grouped? Well, it’s (in my case) a GroupedEnumerable but that’s not a public type so it doesn’t help much. In many ways, it’s much easier to figure out what’s going on here when you look at this kind of query “long hand”, for example;

var grouped = data.GroupBy(record => record.Type);

Now, what’s the return type of this? It should look like this;

   IEnumerable<IGrouping<string, Fruit>> grouped = 
        data.GroupBy(record => record.Type);
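
A quick way to convince yourself of this is to let the compiler check the assignment and then ask the runtime what it actually handed back. This is only a sketch: the class name GroupTypeCheck and the helper Build are made up for illustration, the sample data is trimmed to the two properties that matter here, and the concrete class name printed is an implementation detail that may differ between .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Fruit
{
    public string Name { get; set; }
    public string Type { get; set; }
}

static class GroupTypeCheck
{
    // Hypothetical helper: builds the query-syntax grouping over a small sample.
    public static IEnumerable<IGrouping<string, Fruit>> Build()
    {
        Fruit[] data =
        {
            new Fruit { Name = "Gala", Type = "Apple" },
            new Fruit { Name = "Granny Smith", Type = "Apple" },
            new Fruit { Name = "Tasty", Type = "Strawberries" }
        };

        // Returning the query as the interface type compiles,
        // because that is the static type of a "group ... by" query.
        return from fruit in data
               group fruit by fruit.Type;
    }

    static void Main()
    {
        var grouped = Build();

        // The concrete (non-public) class behind the interface; varies by runtime.
        Console.WriteLine(grouped.GetType().Name);

        // One entry per distinct Type in the sample.
        Console.WriteLine(grouped.Count()); // 2
    }
}
```
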

And then things become a little more obvious. We get back an enumeration of IGrouping<K,T>. Now, IGrouping<K,T> looks like this;

 

[image: the IGrouping<K,T> interface definition]
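
In case the picture doesn't come through, here is roughly the same thing in code. The interface declaration in the comment is how System.Linq declares it in current .NET versions (the real type parameter names are TKey and TElement); the class GroupingShape and the helper FirstGroup are hypothetical names used only for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Fruit
{
    public string Name { get; set; }
    public string Type { get; set; }
}

static class GroupingShape
{
    // For reference, System.Linq declares the interface as:
    //   public interface IGrouping<out TKey, out TElement> : IEnumerable<TElement>
    //   {
    //       TKey Key { get; }
    //   }

    // Hypothetical helper: returns the first group from a small sample.
    public static IGrouping<string, Fruit> FirstGroup()
    {
        Fruit[] data =
        {
            new Fruit { Name = "Gala", Type = "Apple" },
            new Fruit { Name = "Granny Smith", Type = "Apple" },
            new Fruit { Name = "Tasty", Type = "Strawberries" }
        };

        // GroupBy yields groups in the order their keys first appear.
        return data.GroupBy(fruit => fruit.Type).First();
    }

    static void Main()
    {
        IGrouping<string, Fruit> apples = FirstGroup();

        // The one member the interface adds on top of IEnumerable<T>.
        string key = apples.Key;

        // No cast needed: a grouping already is a sequence of its elements.
        IEnumerable<Fruit> members = apples;

        Console.WriteLine("{0}: {1} item(s)", key, members.Count()); // Apple: 2 item(s)
    }
}
```
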

So an IGrouping<K,T> essentially has a Key property of type K and it also implements IEnumerable<T>. That makes it pretty obvious what we’re now doing with this: we get back a sequence of entries (one for each group), and each of those entries gives us access to the Key and also lets us enumerate the values grouped under that Key. So, we can do;

      IEnumerable<IGrouping<string, Fruit>> grouped =
        data.GroupBy(record => record.Type);

      foreach (IGrouping<string,Fruit> group in grouped)
      {
        Console.WriteLine("Key is {0}", group.Key);

        foreach (Fruit fruit in group)
        {
          Console.WriteLine("\t{0}", fruit.Name);
        }
      }

and, of course, you can make all this look a lot smaller taking a lot of the explicit typing away;

    static void Main(string[] args)
    {
      var data = new []
      {
        new { Name="Gala", Type="Apple", Price=0.75m, Quantity=10 },
        new { Name="Granny Smith", Type="Apple", Price=0.80m, Quantity=7 },
        new { Name="Tasty", Type="Strawberries", Price=1.90m, Quantity=20 }
      };

      var query = from fruit in data
                  group fruit by fruit.Type;

      foreach (var group in query)
      {
        Console.WriteLine("Key {0}", group.Key);

        foreach (var item in group)
        {
          Console.WriteLine("\t{0}", item.Name);
        }
      }
    }

and even go ahead and group up different properties if the whole “object” isn’t of interest;

      var data = new []
      {
        new { Name="Gala", Type="Apple", Price=0.75m, Quantity=10 },
        new { Name="Granny Smith", Type="Apple", Price=0.80m, Quantity=7 },
        new { Name="Tasty", Type="Strawberries", Price=1.90m, Quantity=20 }
      };

      var query =
        from fruit in data
        group fruit.Price * fruit.Quantity by fruit.Type
          into grouped
          select new
          {
            Name = grouped.Key,
            Total = grouped.Sum(),
            Entries = grouped.Count()
          };

      foreach (var group in query)
      {
        Console.WriteLine(group);
      }
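
For what it's worth, that last query can also be written “long hand”, which I find makes the shape easier to see. The sketch below uses the GroupBy overload that takes an element selector as its second argument, so each group is an IGrouping<string, decimal> rather than a group of whole fruit. The class GroupTotals and the helper TotalFor are hypothetical names for this example, and the sample data drops the Name property because the query never touches it.

```csharp
using System;
using System.Linq;

class Fruit
{
    public string Type { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

static class GroupTotals
{
    static readonly Fruit[] Data =
    {
        new Fruit { Type = "Apple", Price = 0.75m, Quantity = 10 },
        new Fruit { Type = "Apple", Price = 0.80m, Quantity = 7 },
        new Fruit { Type = "Strawberries", Price = 1.90m, Quantity = 20 }
    };

    // Hypothetical helper: runs the grouping and returns the total for one type.
    public static decimal TotalFor(string type)
    {
        var query = Data
            // First lambda picks the key; second picks what goes into each group.
            .GroupBy(fruit => fruit.Type, fruit => fruit.Price * fruit.Quantity)
            // Each "grouped" is an IGrouping<string, decimal>, so Sum() needs no selector.
            .Select(grouped => new
            {
                Name = grouped.Key,
                Total = grouped.Sum(),
                Entries = grouped.Count()
            });

        return query.Single(entry => entry.Name == type).Total;
    }

    static void Main()
    {
        Console.WriteLine(TotalFor("Apple") == 13.10m);        // True
        Console.WriteLine(TotalFor("Strawberries") == 38.00m); // True
    }
}
```
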

 

But I think it all becomes a lot clearer if you’ve had a look at GroupBy and IGrouping<K,T>.