Java 2 Go!: June 2007

Sunday, June 24, 2007

JavaOne 2007 - Performance tips

by Eduardo Rodrigues

Hello everybody!

I know I promised more posts with my impressions of JavaOne 2007. So, here goes...

Some of the most interesting technical sessions I attended were on J2SE performance and monitoring. In fact, I would highlight TS-2906: "Garbage Collection-Friendly Programming" by John Coomes, Peter Kessler and Tony Printezis from the Java SE Garbage Collection Group at Sun Microsystems. They certainly gave me a new perspective on the newest GCs available.

And what does GC-friendly programming have to do with performance? Well, if you manage to write code that doesn't needlessly create work for the GC, you'll be implicitly avoiding major performance impacts on your application.

Today there are different kinds of GCs and a variety of approaches to them. We have generational GCs, which keep young and old objects separately in the heap and use specific algorithms for each generation. We also have incremental GCs, which try to minimize GC disruption by working concurrently with the application. There's also the possibility of mixing both, using a generational GC with the incremental approach applied only to the old generation space. Besides, we have compacting and non-compacting GCs; copying, mark-sweep and mark-compact algorithms; linear and free-list allocation; and so on. Yeah... I know... another alphabet soup. If you want to know more about them, here are some interesting resources:

"Tuning Garbage Collection with the 5.0 Java Virtual Machine"
"Java HotSpot™ VM Options"
and for the most curious
"Memory Management in the Java HotSpot™ Virtual Machine"

The first and most basic question should be "how do I create work for the GC?" and the most common answers would be: allocating new memory (a higher allocation rate implies more frequent GCs), "live data" size (more work to determine what's live) and reference field updates (more overhead for the application and more work for the GC, especially for generational or incremental collectors). With that in mind, there are some helpful tips for writing GC-friendly code:

Object Allocation

In recent JVMs, object allocation is usually very cheap. It takes only about 10 native instructions in the common fast path. As a matter of fact, if you think that C/C++ has faster allocation, you're wrong. Reclaiming new objects is very cheap too (especially for young generation spaces in generational GCs). So, do not be afraid to allocate small objects for intermediate results and remember the following:

GCs, in general, love small immutable objects, and generational GCs love small and short-lived ones;

Always prefer short-lived immutable objects to long-lived mutable ones;

Avoid needless allocation, but prefer clearer and simpler code with more allocations to more obscure code with fewer allocations.

As a simple and great example of how the tiniest details may jeopardize performance, take a look at the code below:
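To make the "small, short-lived, immutable" advice more concrete, here is a minimal sketch. The Point and PointDemo classes are hypothetical (they are not from the session); the idea is simply that every update returns a fresh small object instead of mutating a long-lived one:

```java
// Hypothetical value class used only for illustration:
// small, immutable and cheap to allocate.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }

    // Returns a new short-lived object instead of mutating this one.
    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }
}

class PointDemo {
    public static void main(String[] args) {
        Point p = new Point(0, 0);
        // Each iteration leaves the previous Point unreachable,
        // so it can die young in the young generation.
        for (int i = 0; i < 1000; i++) {
            p = p.translate(1, 2);
        }
        System.out.println(p.getX() + "," + p.getY()); // prints 1000,2000
    }
}
```

Whether this beats a mutable version in a particular application is something only a measurement can tell; the point is that this style is usually not something to be afraid of.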

public void printVector(Vector v) {
   for (int i=0; v != null && i < v.size(); i++) {
      String s = (String) v.elementAt(i);
      System.out.println(s.trim());
   }
}

This may look like very innocent code, but almost every part of it can be optimized for performance. Let's see... First of all, using the expression "v != null && i < v.size()" as the loop condition generates totally unnecessary overhead, since the null check and the (synchronized) size() call are repeated on every iteration. Also, the temporary String s inside the loop is unnecessary (a local reference by itself allocates no object, but it adds nothing here) and, last but not least, using System.out.println is always an efficient way of making your code really slow (and that's inside the loop!). So, we could rewrite the code like this:

public void printVector(Vector v) {
   if (v != null) {
      StringBuffer sb = new StringBuffer();
      int size = v.size();
      for (int i=0; i < size; i++) {
         sb.append(((String)v.elementAt(i)).trim());
         sb.append("\n");
      }
      System.out.print(sb);
   }
}

And if we're using J2SE 1.5, we could do even better:

public void printVector(Vector<String> v) { //using Generics to define the vector's content type
   if (v != null) {
      StringBuilder sb = new StringBuilder();
      //faster than StringBuffer since
      //it's not synchronized and thread-safety
      //is not a concern here
      for (String s : v) { //enhanced for loop
         sb.append( s.trim() );
         //we're using Generics, so
         //there's no need for casting
         sb.append( "\n" );
      }
      System.out.print(sb);
   }
}

Large Objects

Very large objects are obviously more expensive to allocate and to initialize (zeroing). Also, large objects of different sizes can cause memory fragmentation (especially if you're using a non-compacting GC). So, the message here is: always try to avoid large objects if you can.

Reference Field Nulling

Contrary to what many may think, nulling references rarely helps the GC. The exception is when you're implementing array-based data structures.
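The array-based exception is easier to see in code. Below is a minimal sketch of a hypothetical ArrayStack class (the names are mine, for illustration only): when an element is popped, the array slot would still reference it unless we clear that slot explicitly, so this is one case where nulling really does help the GC.

```java
import java.util.EmptyStackException;

// Hypothetical array-backed stack, used only to illustrate slot nulling.
class ArrayStack {
    private Object[] elements = new Object[16];
    private int size = 0;

    void push(Object o) {
        if (size == elements.length) {
            // Grow the backing array when it is full.
            Object[] bigger = new Object[size * 2];
            System.arraycopy(elements, 0, bigger, 0, size);
            elements = bigger;
        }
        elements[size++] = o;
    }

    Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        // Without this line the array would keep the popped object
        // reachable until the slot happened to be overwritten.
        elements[size] = null;
        return result;
    }

    int size() {
        return size;
    }
}

class ArrayStackDemo {
    public static void main(String[] args) {
        ArrayStack stack = new ArrayStack();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop()); // prints b
        System.out.println(stack.size()); // prints 1
    }
}
```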

Local Variable Nulling

This is totally unnecessary since the JIT (Just-In-Time) compiler is able to do liveness analysis by itself. For example:

void foo() {
   int[] array = new int[1024];
   populate(array);
   print(array); //last use of array in method foo()
   array = null; //unnecessary! array is no
                 //longer considered live by the GC
   ...
}

Explicit GCs

Avoid them at all costs! Applications do not have all the information needed to decide when a garbage collection should take place. Besides, a call to System.gc() at the wrong time can hurt performance with no benefit. That's because, at least in HotSpot™, System.gc() does a "stop-the-world" full GC. A good way of preventing this is to use the -XX:+DisableExplicitGC option when starting the JVM, so that System.gc() calls are ignored.

Libraries can also make explicit System.gc() calls. An easy way to find out is to run FindBugs to check on them.

If you're using Java RMI, keep in mind that it uses System.gc() for its distributed GC algorithm, so try to decrease the frequency of those calls and use the -XX:+ExplicitGCInvokesConcurrent option when starting the JVM.
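If you're not sure which of these options a running JVM was actually started with, one way to check from inside the application is the standard java.lang.management API. This is only a sketch: the ExplicitGcFlagCheck class and its hasFlag helper are hypothetical names of mine, and the check is a plain string match on the startup arguments.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper that inspects the arguments the JVM was started with.
class ExplicitGcFlagCheck {

    // True if the exact flag string appears among the JVM arguments.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Returns the options passed to the JVM (not the program arguments).
        List<String> jvmArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("DisableExplicitGC set: "
            + hasFlag(jvmArgs, "-XX:+DisableExplicitGC"));
        System.out.println("ExplicitGCInvokesConcurrent set: "
            + hasFlag(jvmArgs, "-XX:+ExplicitGCInvokesConcurrent"));
    }
}
```

Running it with, say, java -XX:+DisableExplicitGC ExplicitGcFlagCheck should report the first flag as set.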

Data Structure Sizing

Avoid frequent resizing and try to size data structures as realistically as possible. For example, the code below will allocate the associated array twice:

ArrayList list = new ArrayList();
list.ensureCapacity(1024);

So, the correct form would be:

ArrayList list = new ArrayList(1024);
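The same idea applies to other growable structures. Here is a minimal sketch, assuming we expect roughly 1024 entries (the PresizedCollections class, the hashCapacityFor helper and the expected count are all hypothetical). For HashMap, the initial capacity has to account for the default load factor of 0.75, otherwise the table would still be resized before reaching the expected size.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example of sizing data structures up front.
class PresizedCollections {

    // Initial capacity that lets 'expected' entries fit without a resize,
    // assuming HashMap's default load factor of 0.75.
    static int hashCapacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        int expected = 1024; // assumed number of entries

        List<String> names = new ArrayList<String>(expected);
        Map<String, Integer> index =
            new HashMap<String, Integer>(hashCapacityFor(expected));
        // Rough guess of 8 characters per entry for the output buffer.
        StringBuilder sb = new StringBuilder(expected * 8);

        for (int i = 0; i < expected; i++) {
            String name = "item" + i;
            names.add(name);
            index.put(name, Integer.valueOf(i));
            sb.append(name).append('\n');
        }
        System.out.println(names.size() + " " + index.size()); // prints 1024 1024
    }
}
```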

Object Pooling

This is another old paradigm that must be broken since it brings terrible allocation performance. As you may remember from the first item above, GCs love short-lived immutable objects, not long-lived and highly mutable ones. Unused objects in pools are like a bad tax: they are alive, so the GC must process them, yet they provide no benefit because the application is not using them.

If pools are too small, you have allocations anyway. If they are too large, you have too much footprint overhead and more pressure on the GC.

Because any object pool must be thread-safe by default, the use of synchronized methods and/or blocks of code is implicit, and that defeats the JVM's fast allocation mechanism.

Of course, there are some exceptions, like pools of objects that are expensive to allocate and/or initialize or that represent scarce resources like threads and database connections. But even in these cases, always prefer to use existing well-known libraries.
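For threads, one such well-known library already ships with the JDK: java.util.concurrent. The sketch below is just an illustration (the ThreadPoolDemo class, the sumOfSquares method and the pool size of 4 are my own choices); it lets a fixed-size ExecutorService own the pooled threads instead of hand-rolling a pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: reuse the JDK's thread pool instead of writing one.
class ThreadPoolDemo {

    // Computes 1^2 + 2^2 + ... + n^2, one small task per term.
    static int sumOfSquares(int n, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                // The tasks are small short-lived objects;
                // only the threads themselves are pooled.
                results.add(pool.submit(new Callable<Integer>() {
                    public Integer call() {
                        return Integer.valueOf(value * value);
                    }
                }));
            }
            int sum = 0;
            for (Future<Integer> f : results) {
                sum += f.get().intValue(); // waits for each task to finish
            }
            return sum;
        } finally {
            pool.shutdown(); // always release the pooled threads
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10, 4)); // prints 385
    }
}
```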

to be continued...

Wednesday, June 6, 2007

JDeveloper Tips #2: Fine-tuning the configuration

by Eduardo Rodrigues

Yet another great tip - this one is especially directed to those using JDeveloper on Windows.

It may seem strange, but the number of programmers aware of the possibility of customizing JDev's initialization settings isn't as big as you might expect. Many don't even know about the existence of a configuration file. Well, there is a configuration file and it's located at %JDEV_HOME%\jdev\bin\jdev.conf (%JDEV_HOME% being the directory where you've installed JDeveloper). If you open this file you'll see a great number of options, properties, etc. The guys at Oracle did their job and commented on every one, so it won't be difficult to figure out their purpose.

Having said that, I'd like to share with you some lessons learned through my own experience that have certainly made my work with JDeveloper much smoother:


#
# This is optional but it's always
# interesting to keep your JDK up to date
# as long as you stay in version 1.5
#
SetJavaHome C:\Program Files\Java\jdk1.5.0_12

#
# Always a good idea to set your User Home
# appropriately. To do so, you must
# configure an environment variable in
# the operating system and set its value
# with the desired path
# (e.g. JDEV_USER_HOME=D:\myWork\myJDevProjs).
# Then you must set the option below with
# the variable's name.
#
# You'll notice that when you change
# the user home directory, JDev will ask
# you if you want to migrate from a
# previous version. That's because it
# expects to find a "system" subdirectory.
# If you don't want to lose all your config
# I recommend that you copy the "system"
# folder from its previous location
# (%JDEV_HOME%\jdev\system is the default) to
# your new JDEV_USER_HOME before restarting
# JDev.
#
SetUserHomeVariable JDEV_USER_HOME

#
# Set VFS_ENABLE to true if your
# projects contain a large number of files.
# You should use this especially if
# you're using a versioning system.
#
AddVMOption    -DVFS_ENABLE=true

#
# Try to make JDev always fit in your available
# physical memory.
# I really don't recommend setting the maximum
# heap size to less than 512M but sometimes it's
# better doing this than having to get along with
# unpleasant Windows memory swapping.
#
# Just a reminder: this option does not establish
# an upper limit for the total memory allocated
# by the JVM. It limits only the heap area.
#
AddVMOption    -Xmx512M

#
# Use the options below ONLY IF you're
# running JDeveloper on a multi-processor or
# multi-core machine.
#
# These options are designed to optimize the pause
# time for the hotspot VM.
# These options are ignored by ojvm with an
# information message.
#
AddVMOption    -XX:+UseConcMarkSweepGC
AddVMOption    -XX:+UseParNewGC
AddVMOption    -XX:+CMSIncrementalMode
AddVMOption    -XX:+CMSIncrementalPacing
AddVMOption    -XX:CMSIncrementalDutyCycleMin=0
AddVMOption    -XX:CMSIncrementalDutyCycle=10

#
# On a multi-processor or multi-core machine you
# may uncomment this option in order to
# limit CPU consumption by Oracle JVM client.
#
# AddVMOption     -Xsinglecpu

#
# This option isn't really documented but
# it's really cool!
# Use this to prevent Windows from paging JDev's memory
# when you minimize it.
# This option should have the same effect as
# the KeepResident plug-in with the advantage
# of being a built-in feature in Sun's JVM 5.
#
AddVMOption -Dsun.awt.keepWorkingSetOnMinimize=true

Tuesday, June 5, 2007

JDeveloper Tips #1: Managing your libraries

by Felippe Oliveira

Hi folks! This post is directed to Oracle JDeveloper users and was originally written by Felippe Oliveira who is a consultant for Oracle Brazil.

Do you have a hard time trying to figure out the best way of configuring your projects' libraries so they're truly portable? Well, the lack of an easy-to-use "environment variables" setting mechanism (like the one we find in Eclipse) can make it even harder. So here's a useful suggestion to address this issue.

Basically, the JDeveloper workspace consists of applications that are composed of projects which, in fact, contain packages, classes, resources and other files. This structure is normally reflected in the filesystem.

Let's say you're working on 2 ADF applications and your local work directory is c:\mywork. The directory structure should look like this:

The question is: where should you place your custom and/or external libraries? The best answer is as follows:

Step 1: create a child subdirectory of c:\mywork and put them all there, like this:

Step 2: back in JDev, select "Tools -> Manage Libraries..." in the menu, click the "Load Dir..." button and select the lib directory created before.

Note that a new "lib" folder will appear in the "Libraries" tab.

Step 3: Click the "New..." button to create each of your new libraries, referencing the corresponding JAR or ZIP files in the c:\mywork\lib directory:

The main advantage of doing this is that JDev puts one file with a ".library" extension in c:\mywork\lib for each of the libraries you've created. Plus, all paths referenced in those files will be relative to c:\mywork. Now, if you need to recreate the whole workspace in another JDev installation, all you have to do is copy c:\mywork to any other location on the destination machine and repeat step 2. This time you'll notice that all libraries are automatically listed under the "lib" folder in the "Libraries" tab and that's it. Your libraries are ready to go!

Another interesting advantage to consider is that this structure is ideal for versioning systems. Just import the entire structure under c:\mywork into the repository. Whoever checks out the same structure won't have to reconfigure all projects' libraries nor adjust them to their local directories.

That's all for now. Thanks again to Felippe. Good stuff!