Wednesday, May 12, 2010

Micromanaging Memory Consumption

by Eduardo Rodrigues

As we all know, especially since Java 5.0, the JVM guys have been doing a good job and have significantly improved a lot of key aspects, especially performance and memory management, which basically translates into our good old friend, the Garbage Collector (a.k.a. GC).

In almost all articles I’ve read on the memory subject, including those from Sun itself, a particular message was always present. Briefly, it says:

The JVM loves small-and-short-living objects.

Don’t “nullify” variables (myObject = null;) when you decide they aren’t needed anymore as a way of hinting the GC that the objects once referenced by those variables are OK to be disposed.

I guess, after reading this “message” so many times, I finally internalized it in the form of a programming style, if I may. It’s actually very simple and takes advantage of a very basic structure, which is extremely common and kind of taken for granted. I’m talking about the very well-known code block. Yes, I’m talking about those code snippets squeezed and indented between a pair of curly braces like { <my lines of code go here> }.

In general, most programmers use these structures just because they have to, as they are mandatory in so many portions of the Java language’s syntax. You need them when declaring classes, methods, try-catch-finally blocks, multi-statement for loops, multi-statement if-else blocks, etc. But the detail many programmers seem to forget is that these code blocks may actually be defined anywhere in a method body, unattached to any particular keyword or command. Even more, code blocks can be nested as well.
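As a minimal sketch of what that looks like (the class and method names here are hypothetical, chosen only for illustration), the method below uses a bare block, a block nested inside it, and a sibling block, none of them attached to a keyword:

```java
final class BlockDemo {

    // Adds 1, 2 and 3 to a running total using bare blocks only;
    // none of the blocks belongs to an if, for, try or similar construct.
    static int sum() {
        int total = 0;

        { // standalone block
            int a = 1;
            total += a;

            { // block nested inside the standalone block
                int b = 2;
                total += b;
            }
        }

        { // a second, sibling block
            int c = 3;
            total += c;
        }

        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum()); // prints 6
    }
}
```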

Besides being syntactically mandatory, the use of code blocks demarcated by opening and closing curly braces also implies a very important feature of the language. Code blocks also define variable scopes! I’ll explain…

Any variable that happens to be declared inside a code block demarcated by a pair of curly braces will “exist” only within the context of that particular code block (or scope). Such variables are said to be “local variables”. Actually, if we try to use a local variable outside of its scope, we’ll get a compilation error, because that variable literally doesn’t exist outside the code block (or scope) where it was declared. And right there lies the very core of this best-practice tip.
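The scoping rule can be seen in a small sketch like the following, where ScopeDemo and describe are hypothetical names used only for illustration. The commented-out line is the kind of statement the compiler rejects, and the second block shows that the same variable name can be declared again once the first one is out of scope:

```java
final class ScopeDemo {

    static String describe() {
        String result;

        {
            StringBuilder buffer = new StringBuilder("first");
            result = buffer.toString();
        }
        // buffer.append("x"); // would not compile: buffer does not exist out here

        {
            // The name buffer is free again, because the earlier variable
            // was local to the previous block.
            StringBuilder buffer = new StringBuilder(result).append("+second");
            result = buffer.toString();
        }

        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints first+second
    }
}
```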

Specifying well-defined scopes for all your local variables is actually the best way of hinting the GC about which strong references are or are not in use when it kicks in. Simply put, any strong reference held by a variable declared inside a scope that is no longer being executed can clearly be considered as not in use by the GC, thus increasing the chances of proper and prompt disposal of the referenced object (if no other strong references to it exist, of course).
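If you want to see how a particular JVM behaves, a rough probe along these lines may help. It is only a sketch under assumptions: GcScopeDemo and observe are hypothetical names, System.gc() is merely a request, and whether the array is reclaimed right after the block depends on the JVM and its JIT compiler, so no particular result for the second flag is promised.

```java
import java.lang.ref.WeakReference;

final class GcScopeDemo {

    private static final int SIZE = 1024 * 1024;

    // Returns two flags:
    //   [0] the array was reachable while its block was still executing
    //   [1] the weak reference had been cleared after the block and a GC request
    static boolean[] observe() {
        WeakReference<byte[]> weak;
        boolean reachableInside;

        {
            byte[] big = new byte[SIZE];
            weak = new WeakReference<>(big);
            // big is read after weak.get(), so it is still strongly reachable here
            reachableInside = weak.get() != null && big.length == SIZE;
        }

        System.gc(); // a hint only; the JVM is free to ignore it
        boolean clearedAfter = weak.get() == null;

        return new boolean[] { reachableInside, clearedAfter };
    }

    public static void main(String[] args) {
        boolean[] flags = observe();
        System.out.println("reachable inside the block: " + flags[0]);
        System.out.println("cleared after the block:    " + flags[1]);
    }
}
```

Only the first flag is deterministic. The second one is exactly what such a probe is meant to measure on your own setup.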

So, in order to better illustrate my case, here is a simple example. First let’s consider this very innocent piece of code:

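import java.io.File;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class Foo
{
    public static void main(final String[] args)
    {
        try
        {
            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = builderFactory.newDocumentBuilder();
            // main is static, so there is no "this"; use the class literal instead
            URL cfgUrl = Foo.class.getClassLoader().getResource("config.xml");
            File cfgFile = new File(cfgUrl.toURI());

            Document cfg = builderFactory.newDocumentBuilder().parse(cfgFile);
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node cfgNode = (Node) xpath.evaluate("//*[local-name()='config']", cfg, XPathConstants.NODE);

            // (...)

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}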

If we consider that, in the very first part, variables builderFactory, builder, cfgUrl and cfgFile are not really needed once cfg has been parsed, rewriting that part like this would be preferred:

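import java.io.File;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class Foo
{
    public static void main(final String[] args)
    {
        try
        {
            Document cfg;

            {
                // Everything declared in this block goes out of scope at its closing brace
                DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
                DocumentBuilder builder = builderFactory.newDocumentBuilder();
                // main is static, so there is no "this"; use the class literal instead
                URL cfgUrl = Foo.class.getClassLoader().getResource("config.xml");
                File cfgFile = new File(cfgUrl.toURI());
                cfg = builder.parse(cfgFile);
            }

            XPath xpath = XPathFactory.newInstance().newXPath();
            Node cfgNode = (Node) xpath.evaluate("//*[local-name()='config']", cfg, XPathConstants.NODE);

            // (...)

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}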

With that, once execution has passed the inner code block (highlighted in red in the original post), all local variables declared only in that context will cease to exist for all practical purposes. This simple example is a mere illustration and certainly doesn’t represent any major benefit but, believe me, in real-life code, using this approach of well-defined scopes for local variables may have a significant and positive impact on your application’s memory consumption profile.
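A variation raised in the comments on this post is to move the inner block into a method of its own, which gives the temporaries the same narrow scope. The sketch below is one way that could look. ConfigLoader, loadConfig and rootName are hypothetical names, and the XML is parsed from an in-memory string rather than from config.xml purely to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class ConfigLoader {

    // Hypothetical helper: the factory and the builder are locals of this
    // method, so they are out of scope as soon as it returns.
    static Document loadConfig(InputStream in) throws Exception {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        return builder.parse(in);
    }

    // Parses the given XML text and returns the name of its "config" element.
    static String rootName(String xml) throws Exception {
        Document cfg;
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            cfg = loadConfig(in);
        }

        XPath xpath = XPathFactory.newInstance().newXPath();
        Node cfgNode = (Node) xpath.evaluate("//*[local-name()='config']", cfg, XPathConstants.NODE);
        return cfgNode.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootName("<config><item/></config>")); // prints config
    }
}
```

Compared with a bare block, a helper method also gives the step a name and makes it reusable, at the cost of moving the code away from where it is used.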

As you can see, this is indeed a very simple Java best-practice tip. It’s easy to adopt, has no side effects whatsoever and can prove to be very powerful. So, why not use it?

Enjoy!

6 comments:

ahmet ertan said...

That could be very effective. Thanks for sharing.

manish4u@gmail.com said...

Hi,

Nice post, and very well explained.

Just for the correctness of the post, you may want to replace the stmt:
Document cfg = builderFactory.newDocumentBuilder().parse(cfgFile);

with stmt:
Document cfg = builder.parse(cfgFile);

Thanks,
Manish.

Eduardo Ribeiro Rodrigues said...

Thanks for pointing that out.

Du bleu et du marron-vert said...

Hi,

I've been coding this way for a few years and I can confirm that it also helps with thinking about how to organize the code. But as logical as it seems, I'm not sure it really helps the JVM. Sometimes, trying to help something very sophisticated does not help it at all.

+1 anyway :)
Thx,
Pierre
PS : and by the way, you can put a final in front of Document cfg;

Eduardo Ribeiro Rodrigues said...

Thanks for your comments Pierre!

Although you're absolutely right about the "final", this was not the focus of the post. Furthermore, as said, this was just a very simple example only intended to illustrate the use of code blocks as pure scope delimiters, nothing else.

Cheers,
Eduardo.

Metanet said...

I think you can refactor this code by moving the red-colored code block into a separate method. Then it is still in its own scope, and clearer.