blog.smart-java.nl
Ordina J-Technologies – Java Blog

Archive of the ‘performance’ category




Reflection slow? Class vs. Constructor newInstance

By: Jan-Kees van Andel, 1 May 2010

My last article triggered a lot of responses, which I will, of course, try to answer.

One question, by Tofarr, was:

You are not caching the contructor! class.newInstance() internally looks for the no arg constructor each time – I would be curious to see how doing a Constructor c = class.getConstructor() would affect the timing…

This is not 100% true, since (at least in my Java version) Class.newInstance() caches the no-arg constructor after its first lookup. But the suggestion is worth trying.

So, let’s go!

Caching and instantiating the Class

In the original article, this was the task with the cached Class object.

Caching and invoking the Constructor

So I added a simple implementation which caches the Constructor object, like this:

The code should speak for itself. We don't just cache the Class object, but also the Constructor object. Caching only the Constructor is not enough, since we still need the Class to reflectively look up methods. This wouldn't be necessary if we cached the Methods. But, for the sake of a fair comparison, neither implementation caches the methods.
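A minimal sketch of what such a task might look like. The bean class `SimpleBean`, its `name` property and the method `roundTrip` are assumptions made for illustration; this is not the benchmark's actual code.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

class CachedConstructorSketch {

    // Hypothetical bean standing in for the benchmark's JavaBean.
    public static class SimpleBean {
        private String name;
        public SimpleBean() { }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // The Class and its no-arg Constructor are looked up once and reused.
    private static final Class<?> BEAN_CLASS = SimpleBean.class;
    private static final Constructor<?> BEAN_CONSTRUCTOR;

    static {
        try {
            BEAN_CONSTRUCTOR = BEAN_CLASS.getConstructor();
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The Methods are deliberately looked up on every call (not cached).
    static String roundTrip(String value) {
        try {
            Object bean = BEAN_CONSTRUCTOR.newInstance();
            Method setter = BEAN_CLASS.getMethod("setName", String.class);
            Method getter = BEAN_CLASS.getMethod("getName");
            setter.invoke(bean, value);
            return (String) getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```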

Running the benchmark

Let's run it, using the following command:

In the previous article, I didn't show you the run command, so let's catch up.

  • -jar
    Nothing special, I just packaged the benchmark in a JAR to easily transfer it across machines
  • -server
    This option is used to enable the high performance "server" VM. Since Java SE 5, Ergonomics are used to automatically pick an appropriate VM, based on the characteristics of the hardware, OS, etc. We explicitly choose server to make the test more reliable.
  • -XX:+TraceClassLoading
    Output each loaded class to the console. Used to make sure that the test isn't disrupted by class loading, which would skew the results.
  • -XX:+PrintCompilation
    Output a line to the console when a method is compiled by the VM. Used to make sure that the test isn't disrupted by dynamic compilation, which would skew the results.
  • -XX:+PrintGCDetails
    Output a detailed message to the console with each garbage collection run. Used to check that no major collections occur during a test, which would skew the results.
  • -Xms512m -Xmx512m -XX:PermSize=256m -XX:MaxPermSize=256m
    I purposely set high values to delay any garbage collection, because that could decrease performance on the later runs.

The results

NOTE: I've changed the benchmark since the previous article and the results presented here aren't comparable with the old ones!

Hey, that's odd. Constructor.newInstance() is more expensive than Class.newInstance(). Strange... (ps. the result is the same on many runs on multiple machines)

Fixing the problem

But... Constructor.newInstance() is a varargs method. This means a new array is created when it is invoked. Let's fix this line:

into this:

We also need to put this thingie somewhere:

This way, we pass in our own array reference. It is null, but that is enough to prevent a new array from being created on every call.

Note that I didn't just pass in a null literal. This would lead to a compiler warning and makes the code less obvious.
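The change could be sketched roughly like this. The constant name `NO_ARGS` and the helper methods are hypothetical; only `Constructor.newInstance(Object...)` itself is the real API, and its documentation allows a null or empty array when the constructor takes no parameters.

```java
import java.lang.reflect.Constructor;

class NoArgsArraySketch {

    // A typed null constant (hypothetical name). Because it is passed
    // explicitly, no "new Object[0]" is generated at the call site.
    private static final Object[] NO_ARGS = null;

    static Object create(Constructor<?> constructor) {
        try {
            // A plain constructor.newInstance() would allocate an empty
            // varargs array on every call.
            return constructor.newInstance(NO_ARGS);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Small usage example with a JDK class that has a public no-arg constructor.
    static Object newStringBuilder() {
        try {
            return create(StringBuilder.class.getConstructor());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(newStringBuilder().getClass().getName());
    }
}
```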

The results:

Yeah, now we're faster, but just a little bit.

Wrapping up

So, let's combine this with the other optimizations (method caching) and see where we end up:
"Java HotSpot(TM) Server VM (build 10.0-b23, mixed mode)"

(these are the totals of a *lot* of runs)

And, for the fun of it, some OpenJDK results from my little Atom 330 Ubuntu machine: "OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)":

So, invoking Constructor.newInstance(args...) is slightly cheaper than invoking Class.newInstance().

And, Reflection has gotten pretty damn fast if you ask me. :)

Don't forget to pass in a null array to prevent a new one from being automagically created for you on every call.

Note: Class.newInstance() and Constructor.newInstance() behave differently with regard to exceptions. See the Class and Constructor documentation for details. The only thing I'm gonna say here is that Constructor is the preferred way.
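To illustrate the difference, here is a small sketch with a hypothetical class `Failing` whose constructor throws a checked exception. As documented, Class.newInstance() propagates that exception unwrapped, while Constructor.newInstance() wraps it in an InvocationTargetException. Note that Class.newInstance() has been deprecated since Java 9, hence the suppressed warning.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;

class NewInstanceExceptionsSketch {

    // Hypothetical class whose constructor throws a checked exception.
    public static class Failing {
        public Failing() throws IOException {
            throw new IOException("boom");
        }
    }

    // Class.newInstance() rethrows the constructor's exception as-is,
    // even a checked one that the call site never declared.
    @SuppressWarnings("deprecation")
    static String viaClass() {
        try {
            Failing.class.newInstance();
            return "no exception";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    // Constructor.newInstance() wraps it in an InvocationTargetException.
    static String viaConstructor() {
        try {
            Failing.class.getConstructor().newInstance();
            return "no exception";
        } catch (InvocationTargetException e) {
            return "InvocationTargetException wrapping "
                    + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(viaClass());
        System.out.println(viaConstructor());
    }
}
```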




Reflection slow? Well, it depends…

By: Jan-Kees van Andel, 25 April 2010

In this article, I’ll show you some ways to turn slow reflective code into faster code.

Reasons to use reflection

As some of you may know, many frameworks and libraries use reflection to do their thing. Reflection can be a solution for many sorts of issues, like:

  • Configuration. Many frameworks use text-based configuration, like XML. Classes are often configured in text files and need to be instantiated at some point. Also, method names (like event handlers or callbacks) might be configured in text files.
  • Frameworks in general. Whether you configure your classes in an external text file, annotations or some kind of Java config, the framework can never know the classes it invokes. The framework therefore almost always needs to invoke Class.newInstance() or something similar (Constructor.newInstance() is actually better) to instantiate the application classes. And, if you don’t want to put any restrictions on the application code, like requiring application classes to implement a predefined interface, you definitely need to use reflection to invoke methods and access fields.
  • Backwards compatibility. For example, the BeanValidator in MyFaces needs to behave in two different ways, depending on the available libraries. It uses the method ValueExpression.getValueReference(), but only if it is available (available since Unified EL 2.2). Otherwise, it uses a home-brewed mechanism. Reflection is needed here, both at compile time and at runtime. Otherwise, the code will throw a NoSuchMethodError when ValueExpression.getValueReference() is not available. Reflection is the way to determine whether a method exists on a class.
  • Tooling, or other kinds of code that need to inspect classes at runtime.
  • Simple class modifications, like making private methods accessible. OR-Mappers like Hibernate do this for field access, since fields are usually private.
  • Non-Java templating engines, like FreeMarker or Facelets. Since templates are not Java bytecode, they rely on reflection to access Java stuff.
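Two of the uses above, checking whether a method exists and reading a private field, can be sketched with the standard Reflection API as follows. The class `Person` and the helper names are hypothetical and only serve as illustration.

```java
import java.lang.reflect.Field;

class ReflectionUsesSketch {

    // Hypothetical entity with a private field, as an OR-mapper would see it.
    public static class Person {
        private String name = "initial";
    }

    // Returns true if the class has a public method with this name and
    // these parameter types, without linking against it at compile time.
    static boolean hasMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            type.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Reads a private field by making it accessible first.
    static String readPrivateName(Person person) {
        try {
            Field field = Person.class.getDeclaredField("name");
            field.setAccessible(true);
            return (String) field.get(person);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(String.class, "length"));
        System.out.println(readPrivateName(new Person()));
    }
}
```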

Reflection in JSF

As you may know, JSF also uses reflection a lot. Managed Beans, unlike, e.g., Struts Actions, don’t have a specific interface, but (mostly) follow the JavaBean convention. This requires the framework to use reflection a lot, for example while resolving EL expressions to method calls.

Unfortunately, reflection is slow. Note that I’m talking about nanoseconds per invocation here. A few reflective invocations in a single web request are nothing to be afraid of, especially compared to database and network latency. But resolving expressions is a heavily used operation in any JSF request.

Profiling also shows that a significant portion of a JSF application's processing time is spent on reflection.

I say, time to improve this. But first, we need measurements to see what the problem is.

Reflection performance, the test setup

So, reflection is slow, right? What does slow really mean and how slow is it?

Let’s start with a simple, but typical, example of how reflection is used in most frameworks.

Consider a simple JavaBean:

A simple JavaBean with a String property. This JavaBean’s methods are to be invoked by the framework.

Now, let’s first write a simple program to access it without reflection. Not a single framework invokes application code directly, since it’s not compiled against it, but this will be our baseline for comparison.

A reliable concurrent test harness requires a lot of boilerplate code, but the essence is between the two comments. It invokes the bean methods in a plain Java way.
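Stripped of the concurrent harness, the essence could be sketched as follows. The bean class `SimpleBean`, its `name` property and the checksum arithmetic are assumptions made for illustration, not the benchmark's actual code.

```java
class DirectInvocationSketch {

    // Hypothetical JavaBean with a single String property.
    public static class SimpleBean {
        private String name;
        public SimpleBean() { }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Plain Java access: instantiate the bean, set and read the property.
    // The checksum keeps the result observable so the work is not dead code.
    static int directTask(int iterations) {
        int checksum = 0;
        for (int i = 0; i < iterations; i++) {
            SimpleBean bean = new SimpleBean();
            bean.setName("value" + i);
            checksum += bean.getName().length();
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(directTask(1000));
    }
}
```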

On my laptop, with 1000 threads and 1000 iterations, it takes 68ms to complete.

Measuring reflection performance

Now, let's try it out using reflection. After all, not a single framework is compiled against the application code, so the approach above is the ideal, but impossible in practice.

We add the following task to the test harness:

This code should look pretty familiar. It loads a class, given its name, instantiates it and invokes its methods, all using the standard Reflection API.

Don't mind the arithmetic stuff. It's used to prevent the JIT from removing too much (dead) code, which would render the test case useless.
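A sketch of such a fully reflective task, under the same assumptions as before (hypothetical `SimpleBean` with a `name` property, and a made-up checksum). It instantiates through `getConstructor().newInstance()` rather than Class.newInstance(), because the latter is deprecated in current Java versions; the benchmark itself may have done this differently.

```java
import java.lang.reflect.Method;

class FullReflectionSketch {

    // Hypothetical JavaBean with a single String property.
    public static class SimpleBean {
        private String name;
        public SimpleBean() { }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Everything is looked up again on every iteration: class, constructor, methods.
    static int reflectiveTask(String className, int iterations) {
        int checksum = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                Class<?> type = Class.forName(className);
                Object bean = type.getConstructor().newInstance();
                Method setter = type.getMethod("setName", String.class);
                Method getter = type.getMethod("getName");
                setter.invoke(bean, "value" + i);
                // Arithmetic on the result keeps the JIT from discarding the calls.
                checksum += ((String) getter.invoke(bean)).length();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(reflectiveTask(SimpleBean.class.getName(), 1000));
    }
}
```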

Let's include it in the test, so the main method looks like this:

The results are obvious:

So, full blown reflection is about 50 times slower. Not good, right? Let's try to improve this.

Optimizing reflective code: Caching classes

But, of course we can do better. Let's tune the reflective approach a bit.

The first optimization is caching the Class object. Classes are thread safe and can be cached and reused safely.

Caching a Class object saves a lot of repeated overhead, like contacting the SecurityManager, ClassLoader, etc. So this should be a significant improvement.

The code looks like this:

And we of course add the task to the test.
Performance improved by about 40%.

So, we can safely conclude that in general, it's a good idea to cache Class instances if you plan to instantiate them a lot. Don't cache too much, because such loitering objects are the number one cause of Java memory leaks. But a "Map<String, Class>" with a fixed number of classes (like all your controllers) should do no harm here.
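A map-based Class cache of that kind might look like this. It is a sketch only; the class and method names are made up, and a ConcurrentHashMap is assumed so the cache can be shared between threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ClassCacheSketch {

    // Class name -> Class object; meant for a small, fixed set of classes.
    private static final Map<String, Class<?>> CLASSES = new ConcurrentHashMap<>();

    static Class<?> lookup(String className) {
        Class<?> cached = CLASSES.get(className);
        if (cached == null) {
            try {
                // Only the first lookup pays for Class.forName().
                cached = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(e);
            }
            CLASSES.put(className, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        System.out.println(lookup("java.lang.String"));
    }
}
```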

Optimizing reflective code: Caching method handles

There is more reflective code that can be optimized. The Method instances are also good candidates for caching. After all, you obtain the same Method handle every time and invoke it with an instance parameter. So, let's cache the getter and setter methods, like this:
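As a rough sketch, again with the hypothetical `SimpleBean` and checksum used purely for illustration, the cached getter and setter could be held in static final fields:

```java
import java.lang.reflect.Method;

class CachedMethodsSketch {

    // Hypothetical JavaBean with a single String property.
    public static class SimpleBean {
        private String name;
        public SimpleBean() { }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    private static final Class<?> BEAN_CLASS = SimpleBean.class;
    private static final Method SETTER;
    private static final Method GETTER;

    static {
        try {
            // Looked up once; Method objects can be reused for every instance.
            SETTER = BEAN_CLASS.getMethod("setName", String.class);
            GETTER = BEAN_CLASS.getMethod("getName");
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static int cachedMethodsTask(int iterations) {
        int checksum = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                Object bean = BEAN_CLASS.getConstructor().newInstance();
                SETTER.invoke(bean, "value" + i);
                checksum += ((String) GETTER.invoke(bean)).length();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(cachedMethodsTask(1000));
    }
}
```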

The code is getting a lot messier with every tweak, but don't worry. There are other, more concise ways to do this. I chose this approach because of testability (writing a correct concurrent test is hard).

The results, however, don't lie:

Caching the two methods improves performance by about 400%. But we have two methods, so let's conclude a 200% improvement per method.

At this point, performance has improved about ten times compared to the original reflection implementation. But even this optimized version is still about 6 to 10 times slower than the original, non-reflective code.

Unfortunately, using standard reflection, we can't make many more improvements. The JIT simply has fewer ways to optimize reflective invocations than it has with normal bytecode (invokedynamic in JDK 7 improves this).

Optimizing reflective code: Caching instances

This optimization is only applicable to stateless objects! Using it on objects with state leads to correctness problems!

Let's assume that correctness is irrelevant so we can cache instances of the bean. The code would look like this:
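A sketch of the idea, using a hypothetical stateless class `Greeter` instead of the bean so that sharing one instance is actually safe. All names here are assumptions, not the benchmark's code.

```java
import java.lang.reflect.Method;

class CachedInstanceSketch {

    // Hypothetical stateless class: no fields, so one instance can be shared.
    public static class Greeter {
        public Greeter() { }
        public String greet(String who) { return "Hello, " + who; }
    }

    private static final Object INSTANCE;
    private static final Method GREET;

    static {
        try {
            // The instance is created reflectively once and then reused.
            INSTANCE = Greeter.class.getConstructor().newInstance();
            GREET = Greeter.class.getMethod("greet", String.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // No newInstance() call on the hot path anymore, only Method.invoke().
    static String greet(String who) {
        try {
            return (String) GREET.invoke(INSTANCE, who);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet("world"));
    }
}
```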

And the results:

So, we've gained about 100-200ms by not invoking the newInstance method. But creating objects also affects garbage collection (in this test we create one million beans per test), and we've refactored the local variable to a static final variable, so the performance implications of the newInstance call itself are not obvious. Still, it's a fact that we've improved performance by caching the instance.

But note, this approach only applies to stateless or immutable objects. Wherever there's state, you need new instances.

Optimizing reflective code: Conclusion

In this article, we've seen how the performance of reflection differs from 'normal' Java bytecode performance.

We've also seen how some simple tweaks like caching Class and Method objects make reflective code a lot faster.

But still, even when caching instances, the performance of reflection doesn't even come close to the performance of plain Java bytecode (where the instances weren't cached).

There are examples of high performance reflection, for example in the java.util.concurrent.atomic.AtomicReferenceFieldUpdater class, but this kind of code should not be written by anyone except Doug Lea. :) (and b.t.w., it hurts portability)

So, should you use reflection in your application? It depends. If you're writing high performance stuff that's invoked very, very often, I would say no. If you're writing a simple method dispatcher that gets invoked once per web request in a normal web application, I wouldn't care about the performance of the reflective invocation. After all, we're still talking about nano-/milliseconds here.

In the next entry, I'll show you how you can use bytecode enhancement to get the same functionality as with reflection, but without a huge performance penalty.

Edit
Note: I just heard that in some cases, HotSpot is able to inline through reflection calls, enabling it to apply all its JIT tricks to them, effectively reducing the invocation cost to zero! But it only works if it can prove that the invoked method is always the same, which unfortunately is not always the case.
http://java.sun.com/products/hotspot/whitepaper.html#performance