Archive for the ‘java’ Category

Printing Java garbage collection time stamps

Tuesday, September 1st, 2009

I’ve been spending a bunch of time lately looking at Java garbage collection (GC) performance & tuning.  One of the old standby options for GC logging is:

-XX:+PrintGCTimeStamps

this option tells the JVM to print a time stamp in the front of each GC line that is the number of seconds since the Java process started.  You end up with lines like:

1.449: [GC 76756K->5892K(504320K), 0.0226648 secs]
1.472: [Full GC 5892K->5752K(504320K), 0.1099638 secs]
4.276: [GC 137848K->14895K(504320K), 0.0333021 secs]
5.653: [GC 146991K->63707K(504320K), 0.0858191 secs]
38.536: [GC 195803K->77393K(636416K), 0.0683676 secs]
144.875: [GC 341585K->97580K(637440K), 0.1131954 secs]

While these are better than nothing, it can be a PITA to translate these to something useful, like say wall clock time, such as when you want to correlate a GC time to an external alarm.

Starting in Sun’s JDK version 1.6u4, there is another option that can be enabled that actually prints both the wall clock time as well as the seconds since process start.  This is way more user-friendly.  I suspect there is a extremely minute performance penalty to add this extra info, but I’d be shocked if it is actually perceptable.  If anyone knows differently, please let me know.  This option is:

-XX:+PrintGCDateStamps

If you have it enabled, the output looks like:

2009-09-02T23:02:07.151-0400: 1.463: [GC 79398K->5887K(504320K), 0.0236701 secs]
2009-09-02T23:02:07.175-0400: 1.486: [Full GC 5887K->5753K(504320K), 0.1095894 secs]
2009-09-02T23:02:10.028-0400: 4.339: [GC 137849K->14987K(504320K), 0.0345331 secs]
2009-09-02T23:04:31.707-0400: 146.018: [GC 147083K->23143K(504320K), 0.0455040 secs]
2009-09-02T23:14:28.502-0400: 742.813: [GC 155239K->28406K(504320K), 0.0912850 secs]

As you can see, this is much more easily digestable to  humans.  Since you still have the ‘seconds since program start’ info handy, you can continue to use scripts/programs to easily calculate the time deltas between GCs.

Some useful links I’ve found for Java tuning in general are:

  • Resources from Matty’s presentation:

http://prefetch.net/blog/index.php/2008/02/05/java-performance-presentation/

  • Java performance forum:

http://forums.java.net/jive/forum.jspa?forumID=60

  • Jon Masamitsu’s blog: (the last post was over a year ago, starting to age, but some good material)

http://blogs.sun.com/jonthecollector/

  • Garbage Collection Tuning in the Java HotSpot Virtual Machine presentation at Java One 2009

http://developers.sun.com/learning/javaoneonline/sessions/2009/pdf/TS-4887.pdf

Java garbage collection gone horribly wrong

Thursday, August 28th, 2008

I was at a customer site and saw a java app that was chewing CPU.  Using prstat -L  we saw the busy threads were 2 through 9.  Running pstack $pid/$thread showed that the spinning threads were all running garbage collection.  Luckily verbose GC logging was turned on so we were able to look at the log to see how long the previous full GC cycles lasted:

20493.297: [Full GC [PSYoungGen: 2040K->0K(240064K)] [ParOldGen: 697548K->397070K(1077248K)] 699589K->397070K(1317312K) [PSPermGen: 243883K->243054K(262144K)], 9.3868202 secs]
23490.529: [Full GC [PSYoungGen: 6063K->0K(663616K)] [ParOldGen: 1004912K->588931K(1400832K)] 1010975K->588931K(2064448K) [PSPermGen: 247384K->245927K(262144K)], 613.1687549 secs]
25069.416: [Full GC [PSYoungGen: 660290K->0K(664704K)] [ParOldGen: 961941K->1287172K(1400832K)] 1622231K->1287172K(2065536K) [PSPermGen: 246538K->246128K(262144K)], 14492.5636839 secs]

14492 seconds =  just over 4 hours! Yikes!

DTrace

Tuesday, February 12th, 2008

I attended a Sun presentation on DTrace today. Here are links to some good resources:

Some anecdotes that were mentioned:

Buffer sizes

When DTrace was fired up on some monster sized boxes (like a 144 core E25k), it took a while for DTrace to kick in. The issue turned out to be that, by default, DTrace allocates 4MB of memory for each core. On a 144 core machine, this means you need to get ~ 600MB allocated The presenter said he decided to lower the buffer size to 256k or so per CPU and that made DTrace start up much faster.

DTrace aborts on Opteron machines

If you are using dual-core Sun x4100 M2 or X4200 M2 servers and Solaris 10u3 or earlier, you will almost definitely get errors when trying to run DTrace where it aborts almost instantly. You are hitting:

bug id 6507659 tsc differences between CPU’s give dtrace_gethrtime() serious problems

The workaround (-w option) is mentioned in the ticket, but that also disables DTrace safety features, so use caution. If you are on Solaris 10u4 (aka 2008/07) or patched to 120012-14, you should be fine and won’t need the workaround.

Which probes are safe for production?

Probe effect is almost entirely a function of how often they get called. The presenter mentioned that using the syscall, io, and profile providers are almost always fine in production. He did mention that he would not recommend using the pid provider in most cases on a busy process in production, although the the only affect will be slowing the specific traced process down.

DTrace Toolkit

I really wanted to emphasize how good I think this collection of scripts/examples are. I consider it to be almost like the SE Toolkit of the DTrace world. Even if you have no desire to wade through the DTrace docs, or don’t consider yourself much of a scripter, take a look at the DTrace Toolkit and try running some of the examples.

Java developers – stack traces aren’t helpful to end users

Tuesday, March 27th, 2007

I just tried registering a Solaris 10 11/06 machine using sconadm and here is the resulting output:

/usr/sbin/sconadm register -a -r /tmp/regprofile
sconadm is running
javax.management.remote.JMXProviderException: Connection refused
at com.sun.cacao.rmi.impl.CacaoRMIConnectorProvider.newJMXConnector(Caca oRMIConnectorProvider.java:415)
at javax.management.remote.JMXConnectorFactory.getConnectorAsService(JMX ConnectorFactory.java:415)
at javax.management.remote.JMXConnectorFactory.newJMXConnector(JMXConnec torFactory.java:307)
at com.sun.scn.client.MSCConnection.doMscConnect(MSCConnection.java:155)
at com.sun.scn.client.MSCConnection.mscConnectWithKeystorePassword(MSCCo nnection.java:67)

(about 30 lines snipped out by me)

at com.sun.cns.basicreg.BasicReg.<init>(BasicReg.java:100)
at com.sun.cns.basicreg.BasicRegCLI.main(BasicRegCLI.java:523)

How about something more helpful like:

ERROR 68: Can’t connect to the fuzzbat service on server server99.

Please check that the fuzzbat service is enabled with ‘svcadm enable fuzzbat; svcs fuzzbat’.

While stack traces are helpful to developers, they can largely obfuscate the problem to the end-user and certainly don’t provide guidance on what the likely problem is and what the next step should be. If you want to print stack traces, at least try to provide some actionable information to the end user.

(My error was apparently generated because I had renamed the machine, but hadn’t bounced some services. After I rebooted, the sconadm command worked fine.)


Copyright © 2010 williamhathaway.com. All Rights Reserved.
No computers were harmed in the 0.387 seconds it took to produce this page.

Designed/Developed by Lloyd Armbrust & hot, fresh, coffee.