Published Thursday, February 17, 2005 7:48 AM by mtaulty

On Garbage Collection, Scope and Object Lifetimes

I posted this the other day and then Marcus only went and read it and spotted the glaring error I'd left in the middle of it so this is a re-post... (thanks Marcus!!)

For no obvious reason I've sat through a few mentions, or even full explanations, of .NET Garbage Collection in the past few weeks. In quite a few of those explanations I've heard people link the idea of object lifetimes to scope and (although this is explained elsewhere) I wanted to write something down about that stuff.

Here we go...

In plain old C++, I found scope to be a lovely thing, not only because of its possibilities for encapsulation but mostly because of its possibilities for offering deterministic finalisation.

As an example, I might have something like;

    #include "stdafx.h"
    #include <iostream>

    using namespace std;

    class InScopeCounter
    {
    public:
        InScopeCounter()
        {
            ++_count;
        }
        ~InScopeCounter()
        {
            --_count;
        }
        static int GetCount()
        {
            return(_count);
        }

    private:
        static int _count;
    };

    int InScopeCounter::_count = 0;

    int _tmain(int argc, _TCHAR* argv[])
    {
        InScopeCounter one;

        cout << "In scope " << InScopeCounter::GetCount() << endl;

        {
            InScopeCounter two;

            cout << "In scope " << InScopeCounter::GetCount() << endl;

            {
                InScopeCounter three;

                cout << "In scope " << InScopeCounter::GetCount() << endl;
            }
            cout << "In scope " << InScopeCounter::GetCount() << endl;
        }
    }

and in doing that I scope my three variables to the whole function, the outer nested block and the inner nested block respectively. I know that each destructor runs at the end of its scope so, naturally, I'm getting deterministic finalisation because I know exactly when my destructor is meant to run.

The code prints:

    In scope 1
    In scope 2
    In scope 3
    In scope 2

Now, everyone pretty much knows that in .NET this isn't the case and so if I write some comparable C#;

    using System;

    namespace ConsoleApplication6
    {
        public class InScopeCounter
        {
            public InScopeCounter()
            {
                ++_count;
            }
            ~InScopeCounter()
            {
                --_count;
            }
            public static int Count
            {
                get
                {
                    return(_count);
                }
            }
            private static int _count = 0;
        }
        class Class1
        {
            [STAThread]
            static void Main(string[] args)
            {
                InScopeCounter one = new InScopeCounter();

                Console.WriteLine(InScopeCounter.Count);
                {
                    InScopeCounter two = new InScopeCounter();

                    Console.WriteLine(InScopeCounter.Count);

                    {
                        InScopeCounter three = new InScopeCounter();

                        Console.WriteLine(InScopeCounter.Count);
                    }
                }
            }
        }
    }

 

Then I've no real idea as to when my finalizers are going to run, and so the example becomes a bit meaningless even though my scope blocks are still nicely encapsulating the variables from the point of view of only allowing me to access them within their respective scopes.

 

The code prints;

 

    1
    2
    3
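
If I do want that deterministic behaviour back in C#, the usual route is IDisposable and a using block rather than a finalizer. Here's a minimal sketch of the counter reworked that way. DisposableCounter, UsingDemo and the variable names are ones I've made up for illustration; they aren't part of the listing above.

```csharp
using System;

// Hypothetical variant of InScopeCounter: the decrement moves out of the
// finalizer and into Dispose, so it runs at a known point in the code.
public sealed class DisposableCounter : IDisposable
{
    private static int _count = 0;
    private bool _disposed = false;

    public DisposableCounter()
    {
        ++_count;
    }

    public static int Count
    {
        get { return _count; }
    }

    public void Dispose()
    {
        // Guard against double disposal so the count is only decremented once.
        if (!_disposed)
        {
            _disposed = true;
            --_count;
        }
    }
}

public static class UsingDemo
{
    public static void Main()
    {
        // Each using block calls Dispose when control leaves the block.
        using (DisposableCounter first = new DisposableCounter())
        {
            Console.WriteLine(DisposableCounter.Count);         // 1
            using (DisposableCounter second = new DisposableCounter())
            {
                Console.WriteLine(DisposableCounter.Count);     // 2
                using (DisposableCounter third = new DisposableCounter())
                {
                    Console.WriteLine(DisposableCounter.Count); // 3
                }
                Console.WriteLine(DisposableCounter.Count);     // 2
            }
            Console.WriteLine(DisposableCounter.Count);         // 1
        }
        Console.WriteLine(DisposableCounter.Count);             // 0
    }
}
```

Note that this only makes the clean-up deterministic. The memory behind each object is still reclaimed whenever the GC gets around to it.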

 

I think pretty much everybody out there is happy with that idea. What seems to confuse people more is that the lifetime of the object referenced by the variable two in the above fragment is not necessarily bound to the scope block that contains it (the outer of the two nested blocks in Main).

 

I've seen a lot of people talk about how two in the above example is not eligible for GC until the end of that scope block, and that's not right. As far as I know, with debug code you'll find that is the case, but you can't assume it because with release code it can be different.

 

Let's see if we can illustrate this with a smaller piece of code;

 

 

    static void Main(string[] args)
    {
        object o = new object();

        WeakReference r = new WeakReference(o);

        GC.Collect();

        Console.WriteLine("Still alive? {0}",
            r.IsAlive ? "yes" : "no");
    }

 

 

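WeakReference is a little-used class that holds a reference to an object without keeping that object alive. If the GC collects the object, the WeakReference surrenders the reference it was holding on my behalf and IsAlive returns false.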
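
As an aside, when you actually want to use the object behind a WeakReference (rather than just ask whether it's alive, as I do here), the safer pattern is to copy Target into a strong reference first and test that for null. Here's a minimal sketch; WeakReferenceDemo and TryUse are names I've invented for illustration.

```csharp
using System;

public static class WeakReferenceDemo
{
    // Returns true if the weakly referenced object could still be reached,
    // false if the GC has already collected it.
    public static bool TryUse(WeakReference r)
    {
        // Take a strong reference first. Checking IsAlive and then reading
        // Target would leave a gap in which the GC could collect the object.
        object target = r.Target;
        if (target == null)
        {
            return false;
        }
        Console.WriteLine("Got {0}", target);
        return true;
    }

    public static void Main()
    {
        object o = new object();
        WeakReference r = new WeakReference(o);

        GC.Collect();
        bool usable = TryUse(r);

        // Counts as a use of o after the collection, so o stays reachable
        // for the whole of this demo.
        GC.KeepAlive(o);
        Console.WriteLine("Usable? {0}", usable ? "yes" : "no");
    }
}
```

Because o is deliberately kept reachable until the end of Main, TryUse finds the object here. Without a strong reference elsewhere, Target can come back as null at any time.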

 

If I compile this code up using debug mode in Visual Studio .NET then it prints out;

 

    Still alive? yes

 

whereas if I compile it up in release mode it prints out;

 

    Still alive? no

 

Which means that the GC must have collected my object before the end of the scope. That is,

 

 

    static void Main(string[] args)
    {
        object o = new object();

        WeakReference r = new WeakReference(o);   // last use of o

        GC.Collect();

        Console.WriteLine("Still alive? {0}",
            r.IsAlive ? "yes" : "no");
    }

 

 

The compiler is smart enough to work out that beyond the line that constructs the WeakReference the object o is never "touched" and, therefore, it's eligible for GC within the method and before the scope ends.

 

Naturally, if we tweak the code so we make use of o after the collection like this;

 

    static void Main(string[] args)
    {
        object o = new object();

        WeakReference r = new WeakReference(o);

        GC.Collect();

        Console.WriteLine("Still alive? {0} {1}",
            r.IsAlive ? "yes" : "no", o);
    }

 

Then the code will always print "yes" because o survives as we're using it after the GC call.
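
If the only reason for touching o after the collection is to keep it alive, the framework has GC.KeepAlive for exactly that job. Here's a minimal sketch of the same test using it; KeepAliveDemo and SurvivesCollection are names I've made up for illustration.

```csharp
using System;

public static class KeepAliveDemo
{
    // Returns whether the object behind the weak reference was still alive
    // straight after a forced collection.
    public static bool SurvivesCollection()
    {
        object o = new object();
        WeakReference r = new WeakReference(o);

        GC.Collect();
        bool alive = r.IsAlive;

        // GC.KeepAlive does no work of its own. It simply counts as a use of o,
        // so o is ineligible for collection from the start of the method up to
        // this call, in debug and release builds alike.
        GC.KeepAlive(o);
        return alive;
    }

    public static void Main()
    {
        Console.WriteLine("Still alive? {0}",
            SurvivesCollection() ? "yes" : "no");
    }
}
```

The position of the call matters: it has to come after the last point at which the object needs to be alive, which is why it sits after the IsAlive check rather than before GC.Collect.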

 

So, in the previous example how did the compiler work with the GC to determine that o was no longer needed once the WeakReference had been constructed? Is the C# compiler generating different CIL based on debug/release mode?

 

The CIL for the Main method is almost identical between the debug version and the release version of the code, so it would seem that the C# compiler isn't doing this for us, other than perhaps taking note of the basic settings as to whether the code is meant to be debugged and/or optimised or not.

 

So, if it's not the CIL then it has to be the JITter doing different work depending on those settings. How can we see what's happening?

 

The "easiest" way I've found of demonstrating this is to round-trip the program through CIL so that we can debug it at the CIL level. For the code that we had previously, you can compile it up with something like;

 

    csc /t:exe /debug:full mycode.cs

 

Then take the executable that you've got (mycode.exe) and disassemble it to IL;

 

    ildasm /adv /text mycode.exe > mycode.il

 

Then reassemble it;

 

    ilasm /debug mycode.il

 

Now we've got some CIL source code that we can debug. Loading it up with cordbg (the command-line debugger from the .NET Framework SDK) we can see more about what's going on. Here's a session (in which the executable happens to be named app.exe);

 

    Microsoft (R) Common Language Runtime Test Debugger Shell Version 1.1.4322.573
    Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.

    (cordbg) run app.exe
    Process 6012/0x177c created.
    Warning: couldn't load symbols for c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
    [thread 0x15d4] Thread created.

    116:       IL_0000:  newobj     instance void [mscorlib]System.Object::.ctor()

    (cordbg) JIT's will produce debuggable (non-optimized) code

    (cordbg) Still alive? yes
    [thread 0x304] Thread created.
    [thread 0x15d4] Thread exited.
    Process exited.
    (cordbg)   }

 

and here's a second session where we've switched "mode JITOptimizations 1" to get the JIT to optimise the code for us;

 

    Microsoft (R) Common Language Runtime Test Debugger Shell Version 1.1.4322.573
    Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.

    (cordbg) run app.exe
    Process 4100/0x1004 created.
    Warning: couldn't load symbols for c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
    Failed to enable JIT Optimizations for c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
    [thread 0x1020] Thread created.

    116:       IL_0000:  newobj     instance void [mscorlib]System.Object::.ctor()

    (cordbg) JIT's will produce optimized code

    (cordbg) Terminating current process...
    Process exited.
    Process 5480/0x1568 created.
    Warning: couldn't load symbols for c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
    Failed to enable JIT Optimizations for c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
    [thread 0x146c] Thread created.

    116:       IL_0000:  newobj     instance void [mscorlib]System.Object::.ctor()

    (cordbg) Still alive? no
    [thread 0x1484] Thread created.
    [thread 0x146c] Thread exited.
    Process exited.
    (cordbg)

 

So we can see that JIT optimizations change the point at which these variable references are released during the lifetime of the method. You can imagine that, when the optimizations are switched on, the JIT might choose to use registers for certain variables rather than put them in explicit slots on the stack.

 

That's all I wanted to write here: variable lifetimes are not tied to scope, and we can see that illustrated with these examples. If you want to know more then there's a performance webcast from Gregor Noriskin on the Microsoft web site, and one of the points that Gregor makes is that;

 

    "JIT tracks the lifetime of a set number of locals and formal arguments (64 for V1 of the CLR)".

 

    Where possible local variables are kept in registers rather than putting them onto the stack which requires a store and load. Only 64 local variables or arguments get tracked.

 

So, we can imagine that in our case the JIT is tracking our variable lifetimes when it's doing optimizations. It's putting things into registers and probably overwriting the values in there once it works out that our method never touches a particular variable again. Hence, the GC picks up an object whose variable is still in scope but no longer in use.

 

 

 

# RE: On Garbage Collection, Scope and Object Lifetimes @ Friday, February 18, 2005 3:47 AM

... but it was still an interesting post! :)

I tried this on Windows Mobile and got some interesting results - need to do a bit more investigation and then I will post about what I found.

Marcus
