Reflection slow? Well, it depends…
By: Jan-Kees van Andel, 25 April 2010
In this article, I’ll show you some ways to turn slow reflective code into faster code.
Reasons to use reflection
As some of you may know, many frameworks and libraries use reflection to do their thing. Reflection can be a solution for many sorts of issues, like:
- Configuration. Many frameworks use text-based configuration, like XML. Classes are often configured in text files and need to be instantiated at some point. Also, method names (like event handlers or callbacks) might be configured in text files.
- Frameworks in general. Whether you configure your classes in an external text file, annotations or some kind of Java config, the framework cannot know in advance which classes it will invoke. The framework therefore almost always needs to call Class.newInstance() or something similar (Constructor.newInstance() is actually better) to instantiate the application classes. And if you don’t want to put any restrictions on the application code, like requiring application classes to implement a predefined interface, you definitely need to use reflection to invoke methods and access fields.
- Backwards compatibility. For example, the BeanValidator in MyFaces needs to behave in two different ways, depending on the available libraries. It uses the method ValueExpression.getValueReference(), but only if it is available (it was added in Unified EL 2.2). Otherwise, it uses a home-grown mechanism. Reflection is needed both at compile time and at runtime; otherwise, the code will throw a NoSuchMethodError when ValueExpression.getValueReference() is not available. Reflection is the way to determine whether a method exists on a class.
- Tooling, or other kinds of code that need to inspect classes at runtime.
- Simple class modifications, like making private methods accessible. OR-Mappers like Hibernate do this for field access, since fields are usually private.
- Non-Java templating engines, like FreeMarker or Facelets. Since templates are not Java bytecode, they rely on reflection to access Java stuff.
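The backwards-compatibility case in the list above comes down to probing for a method before calling it. A minimal sketch of such a probe follows; the `MethodProbe` class and `findMethod` helper are hypothetical names, and this is not the actual MyFaces code.

```java
import java.lang.reflect.Method;

// Hypothetical helper: checks whether a public method exists without linking against it.
final class MethodProbe {
    private MethodProbe() {}

    /** Returns the public method with the given name and parameter types, or null if absent. */
    static Method findMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            return type.getMethod(name, parameterTypes);
        } catch (NoSuchMethodException e) {
            // Older library version: the caller falls back to its own mechanism.
            return null;
        }
    }

    public static void main(String[] args) {
        // "noSuchMethod" is a made-up name used only to show the miss case.
        System.out.println("isEmpty found: " + (findMethod(String.class, "isEmpty") != null));
        System.out.println("noSuchMethod found: " + (findMethod(String.class, "noSuchMethod") != null));
    }
}
```

Because the probe only uses a method name as a string, the calling code compiles and loads even when the newer API is missing.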
Reflection in JSF
As you may know, JSF also uses reflection a lot. Managed Beans, unlike e.g. Struts Actions, don’t have a specific interface, but (mostly) follow the JavaBean convention. This requires the framework to use reflection a lot, for example while resolving EL expressions to method calls.
Unfortunately, reflection is slow. Note that I’m talking about nanoseconds per invocation here. A few reflective invocations in a single web request are nothing to be afraid of, especially compared to database and network latency. But resolving expressions is a heavily used operation in any JSF request.
Profiling also shows that a significant portion of the time in a JSF application is spent on reflection.
I say, time to improve this. But first, we need measurements to see what the problem is.
Reflection performance, the test setup
So, reflection is slow, right? What does slow really mean and how slow is it?
Let’s start with a simple, but typical, example of how reflection is used in most frameworks.
Consider a simple JavaBean:
A simple JavaBean with a String property. This JavaBean’s methods are to be invoked by the framework.
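A minimal sketch of such a bean, assuming the class name `SimpleBean` and a property called `name` (both hypothetical), might look like this. The class is package-private only to keep the sketch in a single file; a real bean would normally be public.

```java
// Hypothetical bean; the class name and the property name are assumptions.
class SimpleBean {
    private String name;

    // Explicit public no-arg constructor, so a framework can instantiate the bean reflectively.
    public SimpleBean() {}

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```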
Now, let’s first write a simple program to access it without reflection. Not a single framework invokes application code directly, since it’s not compiled against it, but this will be our baseline for comparison.
A reliable concurrent test harness requires a lot of boilerplate code, but the essence is between the two comments. It invokes the bean methods in a plain Java way.
On my laptop, with 1000 threads and 1000 iterations, it takes 68ms to complete.
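A rough sketch of such a baseline, under stated assumptions: `SimpleBean`, `DirectBaseline`, `task` and `runConcurrently` are hypothetical names, and the harness is a simplified stand-in rather than the one used for the measurements above. The checksum exists only so the JIT cannot discard the work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical bean, repeated here so the sketch is self-contained.
class SimpleBean {
    private String name;
    public SimpleBean() {}
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class DirectBaseline {
    /** One task: plain Java calls, returning a checksum of the name lengths. */
    static long task(int iterations) {
        long counter = 0;
        for (int i = 0; i < iterations; i++) {
            // start of the measured work
            SimpleBean bean = new SimpleBean();
            bean.setName("name" + i);
            counter += bean.getName().length();
            // end of the measured work
        }
        return counter;
    }

    /** Runs the task on the given number of worker threads and sums the checksums. */
    static long runConcurrently(int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> job = () -> task(iterations);
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Smaller than the article's 1000 threads x 1000 iterations, to keep the demo light.
        long start = System.nanoTime();
        long checksum = runConcurrently(8, 1000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("direct: " + elapsedMs + " ms, checksum=" + checksum);
    }
}
```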
Measuring reflection performance
Now, let's try it out using reflection. After all, not a single framework is compiled against the application code, so the method above is the ideal, but impossible.
We add the following task to the test harness:
This code should look pretty familiar. It loads a class, given its name, instantiates it and invokes its methods, all using the standard Reflection API.
Don't mind the arithmetic stuff. It's used to prevent the JIT from removing too much (dead) code, which would render the test case useless.
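As an illustration only, a fully reflective task along these lines could look like the sketch below. `SimpleBean`, `FullReflection` and `run` are hypothetical names, and the loop is single-threaded for brevity. Every lookup is repeated on every iteration, which is exactly what the later sections try to avoid.

```java
import java.lang.reflect.Method;

// Hypothetical bean, repeated here so the sketch is self-contained.
class SimpleBean {
    private String name;
    public SimpleBean() {}
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class FullReflection {
    /** Loads, instantiates and invokes everything reflectively on each iteration. */
    static long run(String className, int iterations) {
        long counter = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                Class<?> type = Class.forName(className);
                Object bean = type.getConstructor().newInstance();
                Method setter = type.getMethod("setName", String.class);
                Method getter = type.getMethod("getName");
                setter.invoke(bean, "name" + i);
                // The arithmetic keeps the JIT from treating the calls as dead code.
                counter += ((String) getter.invoke(bean)).length();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("checksum=" + run(SimpleBean.class.getName(), 1000));
    }
}
```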
Let's include it in the test, so the main method looks like this:
The results are obvious:
So, full blown reflection is about 50 times slower. Not good, right? Let's try to improve this.
Optimizing reflective code: Caching classes
But, of course we can do better. Let's tune the reflective approach a bit.
The first optimization is caching the Class object. Class objects are thread-safe and can be cached and reused safely.
Caching a Class object saves a lot of repeated overhead, like contacting the SecurityManager, ClassLoader, etc. So this should be a significant improvement.
The code looks like this:
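One possible shape for such a cache, as a sketch under assumptions: `ClassCache` and its `forName` method are hypothetical names, and a `ConcurrentHashMap` is just one reasonable choice of container.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache: resolves each class name once and reuses the Class object afterwards.
final class ClassCache {
    private static final Map<String, Class<?>> CLASSES = new ConcurrentHashMap<>();

    private ClassCache() {}

    static Class<?> forName(String className) {
        return CLASSES.computeIfAbsent(className, name -> {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                // Unchecked, so nothing is stored in the map for unknown names.
                throw new IllegalArgumentException("Unknown class: " + name, e);
            }
        });
    }

    public static void main(String[] args) {
        Class<?> first = forName("java.lang.StringBuilder");
        Class<?> second = forName("java.lang.StringBuilder"); // served from the map
        System.out.println("same Class object: " + (first == second));
    }
}
```

Keying the map by name alone assumes a single class loader; with several loaders the loader would have to be part of the key.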
And we of course add the task to the test.
Performance improved by about 40%.
So, we can safely conclude that in general, it's a good idea to cache Class instances if you plan to instantiate them a lot. Don't cache too much, because such loitering objects are the number one cause of Java memory leaks. But a "Map<String, Class<?>>" with a fixed number of classes (like all your controllers) should do no harm here.
Optimizing reflective code: Caching method handles
There is more reflective code that can be optimized. The Method instances are also good candidates for caching. After all, you obtain the same Method handle every time and invoke it with an instance parameter. So, let's cache the getter and setter methods, like this:
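A sketch of what caching the two Method objects could look like, with hypothetical names (`SimpleBean`, `CachedMethods`, `run`). The lookups happen once in a static initializer; each iteration still creates a fresh bean.

```java
import java.lang.reflect.Method;

// Hypothetical bean, repeated here so the sketch is self-contained.
class SimpleBean {
    private String name;
    public SimpleBean() {}
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class CachedMethods {
    private static final Class<?> TYPE;
    private static final Method GETTER;
    private static final Method SETTER;

    static {
        try {
            // In a real framework the class name would come from configuration.
            TYPE = Class.forName(SimpleBean.class.getName());
            GETTER = TYPE.getMethod("getName");
            SETTER = TYPE.getMethod("setName", String.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Reuses the cached Class and Method objects; only the bean is created per iteration. */
    static long run(int iterations) {
        long counter = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                Object bean = TYPE.getConstructor().newInstance();
                SETTER.invoke(bean, "name" + i);
                counter += ((String) GETTER.invoke(bean)).length();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("checksum=" + run(1000));
    }
}
```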
The code is getting a lot messier with every tweak, but don't worry. There are other, more concise ways to do this. I chose this approach because of testability (writing a correct concurrent test is hard).
The results, however, don't lie:
Caching the two methods improves performance by about 400%. But we have two methods, so let's call it a 200% improvement per method.
At this point, performance has improved about tenfold compared to the original reflection implementation. But even this optimized version is still about 6 to 10 times slower than the direct, non-reflective baseline.
Unfortunately, with standard reflection there is not much more we can improve. The JIT simply has fewer ways to optimize reflective invocations than it has with normal bytecode (invokedynamic in JDK 7 improves this).
Optimizing reflective code: Caching instances
This optimization is only applicable to stateless objects! Using it on objects with state leads to correctness problems!
Let's assume that correctness is irrelevant so we can cache instances of the bean. The code would look like this:
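A sketch of the idea, again with hypothetical names (`SimpleBean`, `CachedInstance`, `run`): the bean is created once and shared. Note that this bean is stateful, so sharing it is only acceptable here because the sketch runs on a single thread.

```java
import java.lang.reflect.Method;

// Hypothetical bean, repeated here so the sketch is self-contained.
class SimpleBean {
    private String name;
    public SimpleBean() {}
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class CachedInstance {
    private static final Object BEAN;
    private static final Method GETTER;
    private static final Method SETTER;

    static {
        try {
            Class<?> type = Class.forName(SimpleBean.class.getName());
            BEAN = type.getConstructor().newInstance(); // created exactly once
            GETTER = type.getMethod("getName");
            SETTER = type.getMethod("setName", String.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Invokes the cached methods on the single shared bean; not safe for concurrent use. */
    static long run(int iterations) {
        long counter = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                SETTER.invoke(BEAN, "name" + i);
                counter += ((String) GETTER.invoke(BEAN)).length();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("checksum=" + run(1000));
    }
}
```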
And the results:
So, we've gained about 100-200ms by not invoking the newInstance method. But creating objects also affects garbage collection (in this test we create one million beans per test), and we've changed the local variable into a static final variable, so the performance implications of the newInstance call itself are not obvious. Still, it's a fact that caching the instance improved performance.
But note, this approach only applies to stateless or immutable objects. Wherever there's state, you need new instances.
Optimizing reflective code: Conclusion
In this article, we've seen how the performance of reflection differs from 'normal' Java bytecode performance.
We've also seen how some simple tweaks like caching Class and Method objects make reflective code a lot faster.
But still, even when caching instances, the performance of reflection doesn't even come close to the performance of plain Java bytecode (where the instances weren't cached).
There are examples of high performance reflection, for example in the java.util.concurrent.atomic.AtomicReferenceFieldUpdater class, but this kind of code should not be written by anyone except Doug Lea.
(And, by the way, it hurts portability.)
So, should you use reflection in your application? It depends. If you're writing high performance stuff that's invoked very, very often, I would say no. If you're writing a simple method dispatcher that gets invoked once per web request in a normal web application, I wouldn't care about the performance of the reflective invocation. After all, we're still talking about nano-/milliseconds here.
In the next entry, I'll show you how you can use bytecode enhancement to get the same functionality as with reflection, but without a huge performance penalty.
Edit
Note: I just heard that in some cases HotSpot is able to inline through reflection calls, enabling it to apply all its JIT tricks to them and effectively reducing the invocation cost to zero! But it only works if HotSpot can prove that the invoked method is always the same, which unfortunately is not always true.
http://java.sun.com/products/hotspot/whitepaper.html#performance


25 April 2010 at 5:42 pm
Impressive stuff! I’ll remember that for the next time I’m going to build a new (web) framework or templating engine. There can never be too many of those! 8-P
Small correction: you say:
and
Compilation has nothing to do with it. A framework can still directly call your code, even if it doesn’t know it at compilation time.
I see your point though, even if it’s a different one you’re making here
26 April 2010 at 7:43 am
Well, everyone needs to build some custom stuff sometimes.
Like a web framework for a customer…
Depends on the kind of framework. If you require application code to implement a common interface, it can, but if you for example follow the JavaBean convention, the framework needs a different way to invoke the getters/setters. Reflection is one method, but unfortunately a slow one.
26 April 2010 at 10:13 am
Not very impressive. If you just want the accessors, use introspection, which already caches everything. Also, all the optimizations you do are quite specific; in general, with reflection, you just can’t do them.
26 April 2010 at 1:39 pm
Another trick is to cache reflection misses, i.e., if you try to look up a class or method and fail because it doesn’t exist, keep note of that failure so you don’t try to perform the lookup again. This can provide a huge boost if you’re likely to be performing the same failed lookups multiple times.
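A minimal sketch of such a miss cache, assuming hypothetical names (`MethodLookupCache`, `find`, `realLookups`) and covering only public no-argument methods. An empty `Optional` records a failed lookup, so the failure is not repeated.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache that remembers both hits and misses for no-argument public methods.
final class MethodLookupCache {
    private final Map<String, Optional<Method>> cache = new ConcurrentHashMap<>();

    // Counts real reflective lookups; present only to make the effect visible.
    final AtomicInteger realLookups = new AtomicInteger();

    Optional<Method> find(Class<?> type, String methodName) {
        String key = type.getName() + '#' + methodName;
        return cache.computeIfAbsent(key, ignored -> {
            realLookups.incrementAndGet();
            try {
                return Optional.of(type.getMethod(methodName));
            } catch (NoSuchMethodException e) {
                return Optional.empty(); // remember the miss instead of retrying it
            }
        });
    }

    public static void main(String[] args) {
        MethodLookupCache lookups = new MethodLookupCache();
        lookups.find(String.class, "noSuchMethod"); // "noSuchMethod" is a made-up name
        lookups.find(String.class, "noSuchMethod"); // answered from the cache
        System.out.println("real lookups: " + lookups.realLookups.get());
    }
}
```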
26 April 2010 at 2:15 pm
Nice post!~ thanks
26 April 2010 at 6:08 pm
That’s true, but I only showed that caching method handles improves performance by orders of magnitude.
Whether or not it’s useful for you, depends on your specific context. My experience is that it’s often possible to preload loads of stuff (like Class and Method instances), enabling lots of runtime optimizations.
I usually prefer to preload as much as possible, because it makes concurrency issues less likely and improves performance.
True, and this also applies to other kinds of caches, like a DB-query or JNDI-lookup cache. If it performs some expensive calculation, always save the result, even if it is a useless result.
27 April 2010 at 3:34 pm
Didn’t your original loop include the creation of the bean? The object instantiation seems to be missing from your final code, and instead you use the same instance repeatedly. Won’t that skew your results?
28 April 2010 at 6:16 am
It’s an optimization that’s not often appropriate, but in some cases it can be used. I use it to show the costs of object creation, but you need to take it with a grain of salt, because it also affects garbage collection.
So yes, the final example is usually not appropriate.
28 April 2010 at 8:01 am
Thanks for the nice ideas and test. It shows me that my reflection library Duckapter is still horribly slow, but I’ll try to use your tests to profile and improve it. Can you also publish the implementation of the testConcurrent method? I’m not sure how to do it right.
28 April 2010 at 12:59 pm
You are not caching the constructor! class.newInstance() internally looks for the no-arg constructor each time. I would be curious to see how doing a Constructor c = class.getConstructor() would affect the timing…
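For illustration, caching the constructor could be sketched as follows. `SimpleBean`, `CachedConstructor` and `newBean` are hypothetical names, and no claim is made here about how much time this saves.

```java
import java.lang.reflect.Constructor;

// Hypothetical bean, repeated here so the sketch is self-contained.
class SimpleBean {
    private String name;
    public SimpleBean() {}
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class CachedConstructor {
    private static final Constructor<SimpleBean> CONSTRUCTOR;

    static {
        try {
            // Looked up once; every later instantiation reuses this object.
            CONSTRUCTOR = SimpleBean.class.getConstructor();
        } catch (NoSuchMethodException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Creates a fresh bean through the cached no-arg constructor. */
    static SimpleBean newBean() {
        try {
            return CONSTRUCTOR.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + (newBean() != newBean()));
    }
}
```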
28 April 2010 at 11:15 pm
[...] Every access through the Reflection API costs time. For configurations and plans this is usually of minor importance; the JPA configuration is read, analyzed and stored when the application starts. Incidentally, this of course benefits from the initial “warm-up phase”. [...]
8 May 2010 at 10:51 pm
The default CompileThreshold is 10,000 unless you have changed it (e.g. with -XX:CompileThreshold=10000), so the JIT won’t have fully warmed up until you have called a method 10,000 times.
I suggest running the test at least 10,000 times, twice, and discarding the first run. I also suggest you run what you believe is the fastest first, to prevent earlier benchmarks from warming up the JVM and making later ones appear faster.
You can improve reflection performance by disabling the security check which is performed on every call, i.e. call setAccessible(true) even if the method is public, which stops it from checking that it’s public.
You could also benchmark using POJOs without accessor methods. Fields via reflection can be faster, especially for primitive-typed fields, and if this isn’t fast enough you can try the *Unsafe* class, which can be 2-3x faster again. The Unsafe class is appropriately named, BTW.
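As a sketch of the setAccessible(true) and field-access suggestions, assuming hypothetical names (`FieldBean`, `FieldAccess`, `write`, `read`): the Field is looked up once, made accessible once, and then reused. The Unsafe variant is left out, and no timings are claimed.

```java
import java.lang.reflect.Field;

// Hypothetical POJO without accessor methods.
class FieldBean {
    private String name;
}

final class FieldAccess {
    private static final Field NAME;

    static {
        try {
            NAME = FieldBean.class.getDeclaredField("name");
            // Done once: later get/set calls skip the per-call access check.
            NAME.setAccessible(true);
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void write(FieldBean bean, String value) {
        try {
            NAME.set(bean, value);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    static String read(FieldBean bean) {
        try {
            return (String) NAME.get(bean);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FieldBean bean = new FieldBean();
        write(bean, "duke");
        System.out.println("read back: " + read(bean));
    }
}
```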
8 May 2010 at 10:55 pm
If each test is performing the same work, shouldn’t the counter have the same value every time?