
Sunday, September 2, 2007

JavaOne 2007 - Performance Tips 2 - Finish the finalizers!

by Eduardo Rodrigues

Continuing from my last post about lessons learned at JavaOne'07 on Java performance since JDK 1.5, here's something we usually don't pay much attention to but which can get us into trouble: object finalizers.

Every time we override the protected void finalize() throws Throwable method, we are implicitly creating a postmortem hook to be called by the Garbage Collector after it finds that the object is unreachable and before it actually reclaims the object's memory space. In general, we override finalize() with the best of intentions: to ensure that all necessary disposal of system resources and any other cleanup is performed before the object is permanently discarded. So why is that an issue?
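
Just to illustrate the kind of hook we're talking about, here's a minimal sketch of a typical finalizer override (the class and the resource it holds are made up for illustration):

    class LogFile {
       private java.io.FileOutputStream out; // hypothetical I/O resource held by this object

       LogFile(String path) throws java.io.IOException {
          out = new java.io.FileOutputStream(path);
       }

       // overriding finalize() creates the postmortem hook described above
       protected void finalize() throws Throwable {
          try {
             if (out != null) {
                out.close(); // last-chance cleanup of the underlying file handle
             }
          } finally {
             super.finalize(); // always chain to the superclass finalizer
          }
       }
    }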

Well, we all should know that finalize() is an empty method declared in the java.lang.Object class and therefore inherited by every Java class. So, when it's overridden, the JVM can no longer assume the default trivial finalization for the object, which means that "fast allocation" won't happen here. In fact, "finalizable" objects have much slower allocation simply because the VM must keep track of all finalize() hooks. Besides, those objects also give the GC much more work. It takes at least 2 GC cycles (which are also slower) to reclaim a "finalizable" object. The first is the usual one, when the GC identifies the object as garbage; the difference is that now it also has to enqueue the object on the finalization queue. Only during a subsequent cycle will the GC dequeue the object and call its finalize() method and, if we're lucky, discard the object and reclaim its space; otherwise, it may take yet another cycle to finally get rid of it.

If we look closer, we'll notice that putting more pressure on the GC and slowing down both allocation and finalization are not the only problems here. Let's take a quick look at the J2SE 5.0 API Javadoc for the Object.finalize() method:

"(...) After the finalize method has been invoked for an object, no further action is taken until the Java virtual machine has again determined that there is no longer any means by which this object can be accessed by any thread that has not yet died, including possible actions by other objects or classes which are ready to be finalized, at which point the object may be discarded. The finalize method is never invoked more than once by a Java virtual machine for any given object. Any exception thrown by the finalize method causes the finalization of this object to be halted (...)"

It is quite clear to me that there's a potential temporary (or even permanent) "memory leak" hidden in that piece of Javadoc. Since the JVM is obliged to execute the finalize() method before discarding any object that overrides it, due to the additional GC cycles described above, not only will that specific object be retained longer in the heap, but so will any other objects still reachable from it. On the other hand, even after executing finalize(), the VM will not reclaim an object's space if, by any means, it may still be accessed by any object or class in any living thread, even if those are also ready to be finalized. As if that weren't enough, if any uncaught exception is thrown during finalize() execution, the finalization of the object is halted and there's a good chance that, in this case, the object will be retained forever as garbage.

Finally, the fact that the finalize() method must never be invoked more than once for any given object certainly implies the use of synchronization, which is one more performance-threatening element.

So, next time you consider writing a finalizer in a class, please take a second look at it. And if you really have to do it, be really careful with the code you write and try to follow these tips:
  • Use finalizers only as a last resort!

  • Even if you do not explicitly override the finalize() method, library classes you extend may have done it. Look at the example below:

    class MyFrame extends JFrame {
       private byte[] buffer = new byte[16*1024*1024];
       (...)
    }

    In JDK 1.5 and earlier, the 16MB buffer will survive at least 2 GC cycles before any MyFrame instance is discarded. That's because the JFrame library class declares a finalizer. So, try to split objects in cases like this:

    class MyFrame {
       private JFrame frame;
       private byte[] buffer = new byte[16*1024*1024];
       (...)
    }

  • Even if you're considering using a finalizer to dispose of expensive and scarce resources, keep in mind that, being scarce, they will very likely be exhausted before memory is (memory is usually plentiful). So, in these cases, prefer to pool scarce resources and release them explicitly instead (a sketch of the explicit-cleanup alternative follows this list).
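
As an alternative to the finalizer approach, the usual pattern is to expose an explicit cleanup method and call it from a try/finally block, so the release of the resource never depends on the GC. Here's a minimal sketch of that idea (the class, resource and method names are made up):

    class ImageScanner {
       private java.io.InputStream in; // hypothetical scarce resource

       ImageScanner(java.io.InputStream in) {
          this.in = in;
       }

       // explicit disposal method the caller is responsible for invoking
       public void close() throws java.io.IOException {
          if (in != null) {
             in.close();
             in = null;
          }
       }

       // typical usage: cleanup is deterministic and never depends on the GC
       static void scanFile(String path) throws java.io.IOException {
          ImageScanner scanner = new ImageScanner(new java.io.FileInputStream(path));
          try {
             // ... use the scanner ...
          } finally {
             scanner.close();
          }
       }
    }
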
To be continued...

Sunday, June 24, 2007

JavaOne 2007 - Performance tips

by Eduardo Rodrigues
Hello everybody!

I know I've promised more posts with my impressions on JavaOne 2007. So, here it goes...

Some of the most interesting technical sessions I attended were on J2SE performance and monitoring. In fact, I would highlight TS-2906: "Garbage Collection-Friendly Programming" by John Coomes, Peter Kessler and Tony Printezis from the Java SE Garbage Collection Group at Sun Microsystems. They certainly gave me a new perspective on the newest GCs available.

And what does GC-friendly programming have to do with performance? Well, if you manage to write code that doesn't needlessly create work for the GC, you'll be implicitly avoiding major performance impacts on your application.

Today there are different kinds of GCs and a variety of approaches to them too. We have generational GCs, which keep young and old objects separately in the heap and use specific algorithms for each generation. We also have the incremental GC, which tries to minimize GC disruption by working in parallel with the application. There's also the possibility of mixing both, using a generational GC with the incremental approach applied only to the old generation space. Besides, we have compacting and non-compacting GCs; copying, mark-sweep and mark-compact algorithms; linear and free-list allocation and so on. Yeah... I know... another alphabet soup. If you want to know more about them, here are some interesting resources:


The first and basic question should be "how do I create work for the GC?" and the most common answers would be: allocating new memory (a higher allocation rate implies more frequent GCs), "live data" size (more work to determine what's live) and reference field updates (more overhead for the application and more work for the GC, especially if it's generational or incremental). With that in mind, here are some helpful tips for writing GC-friendly code:
  • Object Allocation

    In recent JVMs, object allocation is usually very cheap. It takes only 10 native instructions in the fast common cases. As a matter of fact, if you think that C/C++ has faster allocation, you're wrong. Reclaiming new objects is very cheap too (especially for young generation spaces in generational GCs). So, do not be afraid to allocate small objects for intermediate results and remember the following:

  • GCs, in general, love small immutable objects and generational GCs love small and short-lived ones;

  • Always prefer short-lived immutable objects to long-lived mutable ones;

  • Avoid needless allocation, but prefer clearer and simpler code with more allocations over more obscure code with fewer allocations.

  • As a simple and great example of how the tiniest details may jeopardize performance, take a look at the code below:

    public void printVector(Vector v) {
       for (int i=0; v != null && i < v.size(); i++) {
          String s = (String) v.elementAt(i);
          System.out.println(s.trim());
       }
    }


    This may look like very innocent code, but almost every part of it can be optimized for performance. Let's see... First of all, using the expression "v != null && i < v.size()" as the loop condition generates totally unnecessary overhead. Also, declaring the String s inside the loop implies needless allocation and, last but not least, using System.out.println is always an efficient way of making your code really slow (and here it's inside the loop!). So, we could rewrite the code like this:

    public void printVector(Vector v) {
       if (v != null) {
          StringBuffer sb = new StringBuffer();
          int size = v.size();

          for (int i=0; i < size; i++) {
             sb.append(((String)v.elementAt(i)).trim());
             sb.append("\n");
          }

          System.out.print(sb);
       }
    }


    And if we're using J2SE 1.5, we could do even better:

    public void printVector(Vector<String> v) {
    //using Generics to define the vector's content type

       if (v != null) {
          StringBuilder sb = new StringBuilder();
          //faster than StringBuffer since
          //it's not synchronized and thread-safety
          //is not a concern here

          for (String s : v) { //enhanced for loop
             sb.append( s.trim() );
             //we're using Generics, so
             //there's no need for casting
             sb.append( "\n" );
          }

          System.out.print(sb);
       }
    }


  • Large Objects

    Very large objects are obviously more expensive to allocate and to initialize (zeroing). Also, large objects of different sizes can cause memory fragmentation (especially if you're using a non-compacting GC). So, the message here is: always try to avoid large objects if you can.


  • Reference Field Nulling

    Contrary to what many may think, nulling references rarely helps the GC. The exception is when you're implementing array-based data structures (see the sketch below).
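
    As a rough sketch of that exception, consider an array-backed stack (this class is made up for illustration); the popped slot must be nulled, or the array keeps the object reachable:

    class ObjectStack {
       private Object[] elements;
       private int size = 0;

       public ObjectStack(int capacity) {
          elements = new Object[capacity];
       }

       public void push(Object e) {
          elements[size++] = e; // capacity checks omitted for brevity
       }

       public Object pop() {
          Object result = elements[--size];
          elements[size] = null; // without this, the slot still references the popped object
          return result;
       }
    }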


  • Local Variable Nulling

    This is totally unnecessary since the JIT (Just-In-Time compiler) is able to do liveness analysis by itself. For example:

    void foo() {
       int[] array = new int[1024];
       populate(array);
       print(array);
       //last use of array in method foo()
       array = null;
       //unnecessary! array is no
       //longer considered live by the GC
       ...
    }


  • Explicit GCs

    Avoid them at all costs! Applications do not have all the information needed to decide when a garbage collection should take place; besides, a call to System.gc() at the wrong time can hurt performance with no benefit. That's because, at least in HotSpot™, System.gc() does a "stop-the-world" full GC. A good way of preventing this is to start the JVM with the -XX:+DisableExplicitGC option, which makes it ignore System.gc() calls.

    Libraries can also make explicit System.gc() calls. An easy way to find out is to run FindBugs to check for them.

    If you're using Java RMI, keep in mind that it uses System.gc() for its distributed GC algorithm, so try to decrease its frequency and use the -XX:+ExplicitGCInvokesConcurrent option when starting the JVM (see the example below).
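
    For reference, both options mentioned above are simply passed on the java command line when starting the JVM; something along these lines (the jar name is just a placeholder, and -XX:+ExplicitGCInvokesConcurrent is typically combined with the concurrent collector):

    java -XX:+DisableExplicitGC -jar myapp.jar
    java -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent -jar myapp.jar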


  • Data Structure Sizing

    Avoid frequent resizing and try to size data structures as realistically as possible. For example, the code below will allocate the backing array twice:

    ArrayList list = new ArrayList();
    list.ensureCapacity(1024);


    So, the correct way would be:

    ArrayList list = new ArrayList(1024);


  • And remember... array copying operations, even when using direct memory copying methods (like System.arraycopy() or Arrays.copyOf() in J2SE 6), should always be used carefully (see the sketch below).
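
    Here's a rough sketch of what "carefully" tends to mean in practice: grow an array geometrically so you pay for one copy per doubling instead of one copy per element added (the class and sizes are illustrative):

    class IntBuffer {
       private int[] data = new int[16];
       private int used = 0;

       void add(int value) {
          if (used == data.length) {
             // one array copy per doubling, not one per add()
             data = java.util.Arrays.copyOf(data, data.length * 2); // J2SE 6; System.arraycopy works in 5.0
          }
          data[used++] = value;
       }
    }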

  • Object Pooling

    This is another old paradigm that must be broken since it brings terrible allocation performance. As you may remember from the first item above, the GC loves short-lived immutable objects, not long-lived and highly mutable ones. Unused objects in pools are like a bad tax: they are alive, so the GC must process them, yet they provide no benefit because the application is not using them.

    If pools are too small, you have allocations anyway. If they are too large, you have too much footprint overhead and more pressure on the GC.

    Because any object pool must be thread-safe by default, the use of synchronized methods and/or blocks is implicit, and that defeats the JVM's fast allocation mechanism (the sketch at the end of this item makes the point concrete).

    Of course, there are some exceptions, like pools of objects that are expensive to allocate and/or initialize, or that represent scarce resources like threads and database connections. But even in these cases, always prefer to use existing, well-known libraries.
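
    Just to make the synchronization point above concrete, even the simplest home-made pool ends up looking something like this sketch (illustrative only), where every acquire/release goes through a lock that plain allocation would never need:

    class SimplePool {
       private final java.util.List<Object> free = new java.util.ArrayList<Object>();

       public synchronized Object acquire() {
          if (free.isEmpty()) {
             return new Object(); // falls back to plain allocation anyway
          }
          return free.remove(free.size() - 1);
       }

       public synchronized void release(Object o) {
          free.add(o); // the pooled object stays alive, so the GC still has to process it
       }
    }
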
to be continued...

Monday, May 21, 2007

JavaOne 2007 - Web 2.0

by Eduardo Rodrigues
As promised, here goes my first summary on JavaOne 2007. The first topic will be Web 2.0.

I was greatly impressed with the quality we can now achieve in user interfaces for web-based systems. Building real-world applications with extremely rich interfaces like GMail or Yahoo! seems not to be as difficult as one could imagine. At first, it may seem a bit scary to face the challenge of widely adopting AJAX in our projects, but my feeling (which was certainly confirmed during J1) is that we've already reached the point of no return. Web developers should already be preparing themselves to deliver web applications with a much more modern, interactive and richer user interface. Those who choose to ignore this fact are likely to be left behind. Offering only the minimum is about to become unacceptable.

Several libraries, plug-ins for the most widely used IDEs (unfortunately JDeveloper is not on that list, but that's another matter) and other tools are emerging so fast that we must pay attention to them now, or it might become more and more difficult to catch up later.

I'm quite sure this is not going to be a very smooth transition. All those concepts and approaches that have been ruling our user interface design and implementation (such as very well defined life cycles and the old, comfortable synchrony) are about to be brought to the ground by absolute asynchrony and an avalanche of timers and messages triggering events. This new approach is the foundation of the freedom and interactivity that are taking the web to the next level.

Well, the good news for those of us working with Oracle JDeveloper is that Oracle is certainly aware of this process. A brand new set of ADF Faces components is coming along with JDeveloper 11g and promises to enable this great new technology in a very easy way. I'm talking about ADF Faces Rich Client which, just like ADF Faces itself, has just been donated by Oracle to the Apache foundation. For those interested in getting a taste of it, here are some links from OTN:


Another thing that caught my attention was jMaki. It's a framework for Web 2.0 development that seems to make the task much faster and easier. It comes with most of the main widget libraries, like Google, Yahoo! and Dojo, out of the box. jMaki was created under the Java.net GlassFish community seal and, for now, there's only a NetBeans plug-in available; however, it may be used with any J2EE IDE (running JDK 5 or greater). A very interesting characteristic of this framework is its solution for communication between widgets. It's called GLUE and uses JavaScript functions to provide a publish/subscribe message bus, which makes it possible to fully decouple the various components (or widgets), since there's no need for argument passing between them at all. It's really worth looking at:


Well... I think that's enough for now. Next subject will be "JVM performance and monitoring".

Best regards for all!

Sunday, May 20, 2007

JavaOne 2007... I was there

by Eduardo Rodrigues
One week late but... never too late.

Yes! I went to JavaOne 2007. And it was great! People (lots of) from all over the world were there. The most important players, the men behind the curtains, they were all there.

There were too many sessions for a single human being to attend, so I had to filter them, hoping to choose the best ones. Of course, my filter wasn't very accurate all the time.

I focused my interest on the following subjects: Web 2.0, JVM performance and monitoring, mobile and SOA. Of all the sessions I attended (an average of 4 per day), those that enriched me the most were on Web 2.0, performance and monitoring, and mobile. I wasn't so lucky with the SOA sessions I chose: one was too commercial and the other was too boring. A pity, because that's a subject in which I have great interest. So let's skip the bad parts and stick to the good ones...