blog.smart-java.nl
Ordina J-Technologies – Java Blog



Getting on the cloud!

By: Roy van Rijn, 5 June 2009

Google App Engine

You’ve probably heard people talking before about ‘cloud computing’.
But what exactly is cloud computing, you might ask?

To figure this out I decided to create my own Google App Engine project and find out about cloud computing along the way.

Wikipedia describes cloud computing as follows:

“Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the ‘cloud’ that supports them.”

So cloud computing not only has a vague name, its description isn’t very helpful either. But the key ingredients are “scalable”, “virtualized” and “as a service”. And you don’t have control over the infrastructure…

Let’s take a look at Google App Engine. It was released in April 2008 as a platform for developing cloud applications in Python. App Engine hosts these applications virtually on many machines; the programs are distributed and scaled across a vast number of servers. And since the beginning of this year, Google App Engine also supports Java!

The concept is simple: you build an application, preferably using Google’s App Engine plugin for Eclipse. The JRE you build against is a slightly stripped-down version that makes it usable on a cloud, like a sandbox.

Because you don’t know what kind of servers your application will run on, or how many, Google has decided you can’t do the following:

  • Start threads
  • Go to the Filesystem, no I/O
  • Open sockets directly, but you can open connections through HTTP/HTTPS
  • Make calls to System (like exit();, gc(); etc)
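
For the third restriction, the standard `java.net` classes are the way to make outbound calls. Below is a minimal sketch, assuming a placeholder URL and a hypothetical `HttpFetch` helper that is not part of any App Engine API. It only shows how a request is prepared and makes no claim about App Engine’s quotas or timeouts:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical helper: outbound calls go through HTTP(S) instead of raw sockets. */
final class HttpFetch {

    private HttpFetch() {
    }

    /** Prepares a GET request; no network traffic happens until the response is read. */
    static HttpURLConnection prepareGet(String url) {
        try {
            // openConnection() only creates the connection object.
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("GET");
            connection.setConnectTimeout(5000); // milliseconds, arbitrary example value
            connection.setReadTimeout(5000);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection connection = prepareGet("http://example.com/");
        // The request is actually sent when the response code is requested.
        System.out.println("HTTP status: " + connection.getResponseCode());
        connection.disconnect();
    }
}
```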

And there is more. Because of these restrictions you can’t access a relational database! So you can’t use something like Hibernate with an Oracle/MySQL/PostgreSQL machine. To still be able to persist objects, Google has teamed up with DataNucleus. Using JDO or (a stripped-down version of) JPA you can persist and retrieve objects on the cloud.

With this in mind I started making my own application. And the frameworks I wanted to use are:

  • Wicket (Web Framework)
  • Spring IOC
  • Spring ORM (for transaction management, using annotations)
  • JPA (instead of the default JDO)

The first problem I encountered was getting Wicket to load. Because of the sandbox restrictions there are a couple of things you can’t do. For example, Wicket can’t save temporary data to disk (which it normally does). There are also problems with Wicket in ‘development mode’, where it wants to start threads to poll for changed resources.

A good overview of what it takes to get Wicket working can be found here:
http://www.danwalmsley.com/2009/04/08/apache-wicket-on-google-app-engine-for-java/

Next up was installing and running Spring. This was relatively easy at first. The core Spring code ran pretty much as expected. I added the JARs to my project and added this to the web.xml:

	<!-- Spring -->
	<context-param>
		<param-name>contextConfigLocation</param-name>
		<param-value>classpath:applicationcontext-*.xml</param-value>
	</context-param>
	<listener>
		<listener-class>
			org.springframework.web.context.ContextLoaderListener
		</listener-class>
	</listener>
	<listener>
		<listener-class>
			org.springframework.web.context.request.RequestContextListener
		</listener-class>
	</listener>

As you can see, I load up multiple XML files. I decided to go with the all-out-annotations method using Spring ORM. This proved to be pretty challenging…

With these annotations you are able to do the following in the code:

@Repository("loginDao")
@Transactional
public class LoginDaoImpl implements LoginDao {
 
	@PersistenceContext
	private EntityManager entityManager;
	... (and more)

As you can see I’m using Spring to inject my EntityManager into the DAO. But you can’t just load the entity manager in Google App Engine, you need a specific piece of configuration. I used the following XML:

	<bean id="data.emf"
		class="org.springframework.orm.jpa.LocalEntityManagerFactoryBean">
		<property name="persistenceUnitName" value="transactions-optional" />
	</bean>

	<bean class="org.springframework.orm.jpa.JpaTemplate">
		<property name="entityManagerFactory" ref="data.emf" />
	</bean>

	<bean id="transactionManager"
		class="org.springframework.orm.jpa.JpaTransactionManager">
		<property name="entityManagerFactory" ref="data.emf" />
	</bean>

To tell Spring to scan for these annotations you need to add the following lines in your applicationcontext:

	<context:annotation-config />
	<context:component-scan base-package="nl.redcode.*" />

And now the problems start… The first problem is that Google App Engine doesn’t support all core classes. When loading these annotations Spring will load its PersistenceAnnotationBeanPostProcessor. But it contains the following piece of code:

try {
	return (EntityManagerFactory) lookup(jndiName, EntityManagerFactory.class);
}
catch (NamingException ex) {
	throw new IllegalStateException(
		"Could not obtain EntityManagerFactory [" + jndiName + "] from JNDI", ex);
}

And the Exception we get is:

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.springframework.context.annotation.internalPersistenceAnnotationProcessor': Initialization of bean failed; nested exception is java.lang.NoClassDefFoundError: javax/naming/NamingException
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:480)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory$1.run(AbstractAutowireCapableBeanFactory.java:409)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:380)
	at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:264)
	...etc
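
The NoClassDefFoundError above is what a class outside the sandbox looks like at runtime. Below is a minimal sketch of probing for a class before depending on it, using only `Class.forName`. The `ClassProbe` helper is hypothetical, and it is my assumption (not something taken from Google’s documentation) that a blocked class surfaces as a `ClassNotFoundException` or a `LinkageError`:

```java
/** Hypothetical helper that checks whether a class can be loaded in the current runtime. */
final class ClassProbe {

    private ClassProbe() {
    }

    /** Returns true if the named class can be loaded, false otherwise. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a subclass of LinkageError.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("javax.naming.NamingException available: "
                + isAvailable("javax.naming.NamingException"));
    }
}
```

A check like this only helps for code you control; a framework class that mentions the missing type in its own bytecode will still fail when it is loaded.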

And when I looked at the white-list it was clear: NamingException isn’t part of the sandbox Google App Engine uses. So I started to google how to solve this. The first thing I encountered was somebody who added the following lines to his/her applicationContext:

	<bean id="org.springframework.context.annotation.internalPersistenceAnnotationProcessor"
		class="java.lang.String" />

This piece of code, when executed before the annotation-scan, loads a String in the Spring Container under the name “internalPersistenceAnnotationProcessor”. This causes Spring to ignore its own instantiation of the PersistenceAnnotationBeanPostProcessor and we don’t get the Exception anymore.

But this causes some more damage we don’t want in the application. Before, my DAOs received a valid EntityManager, but now they are null!
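
Why the placeholder suppresses the real processor can be shown with a toy model: a default bean is only registered when no bean with that name exists yet. The sketch below is plain Java, not Spring code; `ToyRegistry` and its methods are hypothetical names used only for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy stand-in for a bean container; it only models "register unless the name is taken". */
final class ToyRegistry {

    private final Map<String, Object> beans = new LinkedHashMap<>();

    /** Explicit registration, comparable to a bean element in the XML. */
    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    /** Default registration: skipped when the name is already in use. Returns true if registered. */
    boolean registerIfAbsent(String name, Supplier<Object> factory) {
        if (beans.containsKey(name)) {
            return false;
        }
        beans.put(name, factory.get());
        return true;
    }

    Object get(String name) {
        return beans.get(name);
    }

    public static void main(String[] args) {
        String name = "internalPersistenceAnnotationProcessor";
        ToyRegistry registry = new ToyRegistry();
        registry.register(name, ""); // the java.lang.String placeholder
        // Object::new stands in for creating the real post-processor.
        boolean registered = registry.registerIfAbsent(name, Object::new);
        System.out.println("real processor registered: " + registered);
        System.out.println("bean type: " + registry.get(name).getClass().getName());
    }
}
```

Because the String placeholder does no post-processing, nothing handles @PersistenceContext anymore, which is why the EntityManager field stays null.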

So I took the code of the original Spring PersistenceAnnotationBeanPostProcessor and replaced all the instances of NamingException with just Exception. This removed the dependency on NamingException. I called this new bean “AppEngineJPAPostProcessor”. This is how I configured it in the applicationContext:

	<bean id="org.springframework.context.annotation.internalPersistenceAnnotationProcessor"
		class="nl.redcode.springhack.AppEngineJPAPostProcessor" />

The EntityManager(Factory) is now created and injected into the DAOs, they have transactions using annotations, and everybody is happy!

When I got a little further in my project I decided to deploy my application to the cloud and test it online. Deploying your application to App Engine is very simple: push the “Deploy” button in the Eclipse plugin, then enter your credentials and a version number for your release!

But then the old BeanPostProcessor came back to bite me. On the server I got the following Exception when deploying:

java.lang.SecurityException: Unable to get members for class org.springframework.jndi.JndiLocatorSupport
	at com.google.apphosting.runtime.security.shared.intercept.java.lang.Class_$10.run(Class_.java:357)
	at com.google.apphosting.runtime.security.shared.intercept.java.lang.Class_$10.run(Class_.java:347)
	at java.security.AccessController.doPrivileged(Native Method)
	at com.google.apphosting.runtime.security.shared.intercept.java.lang.Class_.getMembers(Class_.java:347)
	at com.google.apphosting.runtime.security.shared.intercept.java.lang.Class_.getDeclaredMethods(Class_.java:174)
	at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:460)
	at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:443)
	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.findAutowiringMetadata(AutowiredAnnotationBeanPostProcessor.java:299)
	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessMergedBeanDefinition(AutowiredAnnotationBeanPostProcessor.java:179)

It seems the production runtime is a bit stricter than the development server Google App Engine provides. For some reason it doesn’t like JndiLocatorSupport. This is to be expected, because Google App Engine, due to the nature of the cloud, prohibits the use of JNDI.

Soon I found the problem: the only reference to JndiLocatorSupport is in my own BeanPostProcessor:

/**
 * Rewritten for use in Google AppEngine
 *
 * @author Roy van Rijn
 */
public class AppEngineJPAPostProcessor extends JndiLocatorSupport implements
		InstantiationAwareBeanPostProcessor, BeanFactoryAware {
 
	private Map persistenceUnits;
	...

When I removed the ‘extends’ part there were two pieces of code that stopped working, they both look like this:

	try {
		return (EntityManagerFactory) lookup(jndiName, EntityManagerFactory.class);
	}

It seems this PostProcessor always does a JNDI lookup to find the correct EntityManager! But we can’t do this if we don’t have access to the JndiLocatorSupport methods anymore. So I decided to hack a little bit in this code, my solution was to load the EntityManagerFactory and EntityManager from the Spring container:

try {
	return ((JpaTransactionManager)beanFactory
		.getBean("transactionManager"))
		.getEntityManagerFactory();
/*	return (EntityManagerFactory) lookup(jndiName,
		EntityManagerFactory.class);*/
} catch (Exception ex) {
...

This solved all the problems, and with the changes to the BeanPostProcessor I’m now able to use all the Spring annotations, for persistence and transactions, in Google App Engine.

I’m still in the middle of developing my application on Google App Engine, but it seems that Google App Engine works like a charm. The problem is that most frameworks can’t really cope with the sandbox out-of-the-box. But with (some) minor patches and tweaking most frameworks will run on Google App Engine. The only major problem I’m having (which can’t be solved) is DataNucleus. I chose JPA because it is much richer and has more features than JDO, but DataNucleus hasn’t implemented many of these features yet.

For example, I had the following annotation on a field: @Column(unique=true)
DataNucleus threw “java.lang.UnsupportedOperationException: No support for uniqueness constraints”.

Also, I created a query with this condition: “username = :username OR emailAddress = :emailAddress”.
But DataNucleus doesn’t support the “OR” operator.
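
A common way around a missing OR is to run one query per condition and merge the results in application code. The sketch below shows only the merge step in plain Java. The `User` class, its fields, and the in-memory lists standing in for the two query results are assumptions for illustration, not DataNucleus API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: emulate "a OR b" by merging the results of two separate queries. */
final class OrQueryMerge {

    private OrQueryMerge() {
    }

    /** Hypothetical entity with only the fields needed for the example. */
    static final class User {
        final long id;
        final String username;
        final String emailAddress;

        User(long id, String username, String emailAddress) {
            this.id = id;
            this.username = username;
            this.emailAddress = emailAddress;
        }
    }

    /**
     * Merges the results of "username = :username" and "emailAddress = :emailAddress",
     * dropping duplicates by id so a user matching both conditions appears once.
     */
    static List<User> merge(List<User> byUsername, List<User> byEmail) {
        Map<Long, User> merged = new LinkedHashMap<>();
        for (User user : byUsername) {
            merged.put(user.id, user);
        }
        for (User user : byEmail) {
            merged.putIfAbsent(user.id, user);
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        User roy = new User(1L, "roy", "roy@example.com");
        User jan = new User(2L, "jan", "jan@example.com");
        // roy matches both conditions, jan only the e-mail condition.
        List<User> result = merge(Arrays.asList(roy), Arrays.asList(roy, jan));
        System.out.println(result.size() + " users found");
    }
}
```

The price is two round trips to the datastore instead of one, and any ordering or paging has to be redone on the merged list.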

Other things DataNucleus can’t currently do:

  • Many-to-many relationships
  • Joins in a query (WHAT??)
  • Aggregation queries (group by, having, sum, avg, max, min)
  • Polymorphic queries. You cannot perform a query of a class to get instances of a subclass. Each class is represented by a separate entity kind in the datastore.

So, we’ve seen what a cloud is, what Google App Engine is, and I explained some tweaks/patches needed to get Wicket and Spring working.

Why use Google App Engine? It is free (up to some CPU/mail/data limits) and your project runs on a cloud, and thus is very scalable. You don’t have to worry about the environment or server. Your application will scale when it’s needed and everything is pre-installed and ready to run.
But be prepared for difficult classloading issues and missing classes. You just can’t expect all frameworks to work out-of-the-box with the sandbox limitations. Also don’t expect much support for persistence: the JPA support is very minimal and won’t let you do much more than persisting and retrieving single objects.

Enough blogging, now it’s time for me to tinker with my application again. Maybe I’ll tell you about it here in the future!

8 responses to “Getting on the cloud!”

  1. Roy van Rijn says:

    Sorry for the formatting, can’t seem to get it right :(

    If you have any experience with GAE, please let me know, I’m very interested what your experiences are.

  2. Jan-Kees van Andel says:

    Hmm, I understand it works crappy and the advantages are quite unclear?

    You’re not making me very enthusiastic about GAE. ;)

  3. Roy van Rijn says:

    Well, it’s very scalable and it’s FREE! Come on!

    Seriously, cloud computing would (as a platform) be great for Java programmers. Currently the sandbox GAE uses is a tiny bit limiting, but most things can still be done. And it’s a great way to get to know the cloud infrastructure and the advantages it brings: having multiple versions deployed at the same time (switching versions without downtime!) and pretty much unlimited scalability.

  4. Vincent Hartsteen says:

    Ok, so you’ve explained how to develop a cloud application. You also showed the kind of issues one might run into when doing so. But what are the (dis)advantages of cloud computing as opposed to let’s say web-services. The ideas behind them seem to be somewhat alike. Any thought on this you want to share?

  5. Roy van Rijn says:

    As I said before, the huge advantage of cloud platforms is that the applications are developed for scalability. Every aspect of a program on the cloud is independent of the machine/server it’s running on. These programs can run on every server plugged into the cloud.

    The big idea is that you can develop an application, host it on a cloud platform and let the cloud decide how many resources you get at any particular moment. And if you are running out of resources, just add more servers to the cloud, and you don’t have to install anything specific on these servers, they just become part of the cloud and may contain all cloud-applications.

    The drawback is obvious, you can’t have access to resources of the server, you can’t access files/the filesystem, you can’t have direct access to a relational database (they don’t work on clouds, they can’t scale) and most clouds won’t let you start threads. This of course limits the kind of applications you can effectively develop.

    For web applications with simple CRUD the cloud is perfect. For applications with much more data you’ll need to do MapReduce functions on the cloud, but this is something Google App Engine doesn’t let you do currently. Their cloud database does work with the MapReduce paradigm, but as a developer you can’t access that; you can only do simple queries.

    That’s my idea… does anybody else have interesting points to add about AppEngine or cloud computing?

  6. Roy van Rijn says:

    More comments on GAE (and this blogpost) here:
    http://www.theserverside.com/news/thread.tss?thread_id=54919

    Including a reply from a Datanucleus employee about their product and Google’s JDO & JPA plugin for BigTable.

  7. cometta says:

    have you tried on spring 3 ? does configuration need to be done like what you done for spring 2? can share the demo app above?

  8. cometta says:

    Hi, really thank you for this great article. maybe u able to comment my post here http://tinyurl.com/yedo6by regarding jpa problem
