Wednesday, December 1, 2010

Tuning Java applications on Google AppEngine

Why bother using AppEngine?
Introduction to AppEngine
AppEngine is the Platform as a Service (PaaS) offering from Google. It allows you to deploy and run Java and Python applications on Google's infrastructure. Java applications run in a sandboxed Servlet container that scales completely automatically. To deploy an application you just push a button in your IDE, without any worries about setting up and managing an application server.


AppEngine problems
From time to time you’ll find some very negative feedback about AppEngine. In most cases this is not because AppEngine is bad, but because the applications were not designed to run on AppEngine. This article will give you some hints about how to design an application that works and performs well on AppEngine. 


Google AppEngine is one of the most interesting cloud platforms for Java developers. It allows you to deploy applications within seconds and offers a broad range of services to make development easier. And it’s cheap; for small applications you don’t pay anything at all. This makes it the perfect platform for small-scale applications that can’t be put on a dedicated server. The lack of good shared hosting solutions pushes most developers away from Java into the PHP world for these kinds of small applications, but AppEngine solves that problem.
All this goodness comes at a price though. When you write applications the same way you do for a dedicated server you’ll soon run into performance problems, even with the smallest application. That doesn’t mean AppEngine offers bad performance, but its architecture is so fundamentally different that you have to design your applications to match it. When you write your applications specifically for AppEngine you’ll unleash its full power and it will be a great platform.


The two main problems
The problems that you’ll face when using AppEngine come in two flavors which will both be discussed in this article: 
  1. Instance startup times
  2. DataStore related problems
Application startup time
The first problem, instance startup times, is something you normally don’t care about. Most Java frameworks are designed to do as much processing as possible at application startup, because you don’t restart applications often anyway. That’s very different when running on AppEngine. The core feature of AppEngine is its scalability. An application that gets a lot of load will automatically start extra virtual machines to spread the load over multiple machines. That doesn’t only happen under massive load; you’ll see new instances being started very early on. You’ll even run into this when your application gets hardly any load at all. To avoid wasting resources on applications that are not doing anything, AppEngine stops all instances of an application when it hasn’t received any requests for about a minute. That means that no matter what application you have, you’ll have to deal with starting new instances. Each time an instance is started you get a cold startup of your application, including loading and starting all the frameworks you use. Every second counts now, because the user will not receive any response while the application is starting. 


Google announced that in AppEngine 1.4 there will be the possibility to pay for reserving instances and the availability of an API to “warm up” new instances. This solves part of the startup problem, but you’ll have to pay for it. This might be no problem for large applications, but is exactly the thing we were trying to avoid for smaller applications.


Performance gain 1 - Choose frameworks based on startup time
Many frameworks that are popular with Java developers add up to 25 seconds of startup time. Users will not wait 25 seconds to see a web page, so we’ll have to improve this. The most important step is to get rid of frameworks that take very long to start up, and to configure the frameworks you do use for fast startup. This means not every framework is a good fit for AppEngine. An unfortunate example of this is Grails. Although Grails is one of my favorite frameworks in other environments, its 20+ second startup time is simply unacceptable for AppEngine. So, would I advise getting rid of all frameworks and using Servlets/JSP directly? Not really. That would set you back too much on productivity and code maintainability, and it’s not necessary either. 


The two stacks I used a lot on AppEngine are the following:
  • Weld, JSF2 and JAX-RS (more or less a stripped down Java EE 6 Web Profile)
  • Spring 3.0 including Spring Web MVC
There are many good alternatives to these stacks, but always test on startup time first!
At the moment, though, Spring still offers significantly better startup performance after some tuning. The Weld team is working hard on dramatically improving the startup time of Weld, which should make it a much better fit for AppEngine in the upcoming version.
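
One way to act on the "test on startup time first" advice is to time each initialization step separately, so you can see which framework dominates a cold start. The following is a minimal sketch using only the JDK. The class name StartupTimer and the step names are hypothetical, and the sleep calls merely stand in for real framework bootstrapping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: records how long each named startup step takes. */
class StartupTimer {
    private final Map<String, Long> millisByStep = new LinkedHashMap<String, Long>();

    /** Runs the step and stores its wall-clock duration in milliseconds. */
    void time(String stepName, Runnable step) {
        long start = System.nanoTime();
        step.run();
        millisByStep.put(stepName, (System.nanoTime() - start) / 1000000L);
    }

    /** Durations per step, in the order the steps were run. */
    Map<String, Long> report() {
        return millisByStep;
    }

    /** Total time across all recorded steps, in milliseconds. */
    long totalMillis() {
        long total = 0;
        for (long millis : millisByStep.values()) {
            total += millis;
        }
        return total;
    }

    // Stand-in for real initialization work.
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StartupTimer timer = new StartupTimer();
        timer.time("dependency injection container", () -> pause(30));
        timer.time("web framework", () -> pause(10));
        for (Map.Entry<String, Long> entry : timer.report().entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue() + " ms");
        }
        System.out.println("total: " + timer.totalMillis() + " ms");
    }
}
```

Numbers measured on a development machine will differ from those on AppEngine itself, so treat them as a relative comparison between stacks rather than as absolute values.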


Performance gain 2 - Get rid of JPA/JDO
AppEngine offers two standard APIs to work with the DataStore: JPA and JDO. Remember that the DataStore is not a relational database. Because of that, both JPA and JDO lose some of their power:
  • Relationship mappings are very limited
  • Join queries are not supported
  • Polymorphic queries are not supported
  • Caching support works differently
What’s left is just basic mapping from Java classes to DataStore entities, without the real power of JPA/JDO. We still have to deal with the complexity of the APIs however, and worse, with the overhead of the frameworks. Both JPA and JDO add seconds to the startup time of an application. This is bad, and because the frameworks can’t be used to their full potential it’s not really worth it. Instead we need something that more closely matches the possibilities of the DataStore. There’s a framework doing just that: Objectify. The framework uses JPA annotations to map your Java classes to entities. The API to persist and query entities is completely different however, and matches the low-level DataStore API much more closely. The programming model is a lot easier because the API doesn’t contain any features that the DataStore doesn’t support anyway. Even better, it only adds milliseconds to the application startup time.


Performance gain 3 - Don’t use classpath scanning
Whenever I use Spring I use annotations as much as possible to keep my XML configuration to a minimum. For declaring components I use @Controller/@Component instead of bean configuration in XML. However, this means that the framework must scan for annotated classes at startup, which adds some startup time. On AppEngine it’s always better to reduce scanning for classes. 
Another example is RestEasy. Normally I just let the framework scan for @Path annotations, but on AppEngine it’s better to use an explicit Application class instead. These are just two examples of frameworks I use a lot, but there are many other frameworks that give you this choice. 
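
In JAX-RS, the explicit alternative is a subclass of javax.ws.rs.core.Application whose getClasses() method returns the resource classes. The sketch below mirrors that shape with plain JDK types so that it runs without any framework on the classpath. ExplicitResources, HelloResource and OrderResource are hypothetical names, and this is an illustration of the idea rather than working RestEasy configuration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical resource classes; in a real application these would carry @Path.
class HelloResource {}
class OrderResource {}

/** Stand-in for an explicit Application subclass: a fixed list, no scanning. */
class ExplicitResources {
    /** Mirrors the shape of Application.getClasses(). */
    Set<Class<?>> getClasses() {
        Set<Class<?>> classes = new LinkedHashSet<Class<?>>();
        classes.add(HelloResource.class);
        classes.add(OrderResource.class);
        return Collections.unmodifiableSet(classes);
    }

    public static void main(String[] args) {
        // The framework only has to look at these classes at startup.
        for (Class<?> resource : new ExplicitResources().getClasses()) {
            System.out.println("registered: " + resource.getSimpleName());
        }
    }
}
```

The trade-off is that every new resource class has to be added to the list by hand, in exchange for not walking the whole classpath on each cold start.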


Performance gain 4 - Use memcache
Caching is useful for most web applications, but AppEngine gives you a great infrastructure for it. On AppEngine you can use MemCache which is a highly scalable distributed cache.
From an API point of view MemCache is very similar to a HashMap, with methods such as put, get, delete and contains. Data in the cache can disappear at any moment (it’s not persistent), but will normally live until it expires. The expiration time is something you specify when you put something in the cache. The general idea is to put as much data in MemCache as possible in a useful way. Most web applications are read-mostly, which means there are many more users reading data than writing data. 
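
The put/get/delete/contains model with an expiration time can be sketched with a plain in-memory map. This only illustrates the semantics and is not the AppEngine MemCache API. ExpiringCache and its method signatures are hypothetical, and the current time is passed in explicitly so the expiry behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in that mimics put/get/delete/contains with expiry. */
class ExpiringCache {
    private static final class Entry {
        final Object value;
        final long expiresAtMillis;

        Entry(Object value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<String, Entry>();

    /** Stores a value that expires ttlMillis after nowMillis. */
    void put(String key, Object value, long ttlMillis, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + ttlMillis));
    }

    /** Returns the value, or null if it is missing or has expired. */
    Object get(String key, long nowMillis) {
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(key); // expired entries are dropped on access
            return null;
        }
        return entry.value;
    }

    boolean contains(String key, long nowMillis) {
        return get(key, nowMillis) != null;
    }

    void delete(String key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        ExpiringCache cache = new ExpiringCache();
        cache.put("greeting", "hello", 1000, 0);
        System.out.println(cache.get("greeting", 500));  // still within the expiration time
        System.out.println(cache.get("greeting", 1000)); // expired, so the lookup yields null
    }
}
```

Because an entry can vanish at any time, calling code should always be prepared for a null result and fall back to the original data source.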


Most people start by caching data from the DataStore. The DataStore is relatively slow (compared to a local RDBMS), so that’s a quick win. Objectify even supports this declaratively with annotations. You can go a step further though. For RESTful Web Services it’s useful to place JSON strings in the cache. Converting an object graph to a JSON string costs time, so why would you do that over and over again if the data didn’t change? The same is true for pages. You could create a Servlet filter that simply returns a cached page (the HTML) instead of re-rendering a page with data that didn’t change.
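
A sketch of the "cache the rendered JSON" idea, using a plain map in place of MemCache. JsonCache, renderCount and the hand-written toJson method are hypothetical. Real code would use a JSON library and the MemCache API, and would remove or overwrite the entry whenever the underlying data changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-through cache for rendered JSON strings. */
class JsonCache {
    // Stands in for MemCache in this sketch.
    private final Map<String, String> cache = new HashMap<String, String>();
    private int renderCount = 0;

    /** Returns cached JSON for the book, rendering it only on a cache miss. */
    String bookJson(long bookId, String title) {
        String key = "book-json-" + bookId;
        String json = cache.get(key);
        if (json == null) {
            json = toJson(bookId, title);
            cache.put(key, json);
        }
        return json;
    }

    /** Call when the book changes so the next read renders fresh JSON. */
    void invalidate(long bookId) {
        cache.remove("book-json-" + bookId);
    }

    /** How often the expensive rendering step actually ran. */
    int renderCount() {
        return renderCount;
    }

    // Deliberately naive rendering; assumes the title needs no JSON escaping.
    private String toJson(long bookId, String title) {
        renderCount++;
        return "{\"id\":" + bookId + ",\"title\":\"" + title + "\"}";
    }

    public static void main(String[] args) {
        JsonCache cache = new JsonCache();
        cache.bookJson(1, "Dune");
        cache.bookJson(1, "Dune"); // served from the cache, no second render
        System.out.println("renders: " + cache.renderCount());
    }
}
```

The same pattern applies to whole HTML pages: the key identifies the page, and the cached value is the rendered output.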


DataStore usage
AppEngine’s DataStore is a non-relational, schemaless data store. Wait, let me repeat that: The DataStore is NOT relational. This is probably the most important thing to keep in mind while developing AppEngine applications. "No problem" you might say, "those NoSQL data stores are ultra scalable so who would ever bother about performance?" Yes, the DataStore is extremely scalable. It has to store data for a virtually infinite number of applications that all store a virtually infinite amount of data. To be able to do that the DataStore must be distributed, so yes, it's scalable. But that doesn't really go well together with traditional relational data.


Performance gain 5 - Join in-memory
Because the DataStore is so fundamentally different than a relational database you must work with it in a different way too. First of all, there are no joins. The DataStore is basically one very large table, where each row can have its own set of columns. If there is only one table, a join doesn’t make much sense. Of course you still need relations between entities in your application, so we have to come up with something for that. Let’s take the following simple SQL query as an example:


SELECT emp.name, dep.name FROM employee emp
LEFT JOIN department dep ON dep.id = emp.dep_id


A first naive approach on AppEngine, shown here for books and their authors rather than employees and departments, could be:
  1. select all books
  2. iterate over books
  3. iterate over authorKeys for each book
  4. get author for each key
Objectify ofy = ObjectifyService.begin();
// One query for all books
List<Book> books = ofy.query(Book.class).list();

StringBuilder sb = new StringBuilder();

for (Book book : books) {
    sb.append(book.getTitle());
    sb.append(": ");
    for (Key<Author> authorKey : book.getAuthorKeys()) {
        // One extra DataStore round trip per author key: this is the slow part
        final Author author = ofy.get(authorKey);
        sb.append(author.getFirstname()).append(" ").append(author.getLastname()).append(", ");
    }

    sb.append("<br>");
}



For each book we simply fetch every related author again, one get at a time. Now we have a performance problem. If we have 500 books, we end up with at least 500 + 1 DataStore calls (the N + 1 problem). This approach wouldn’t perform on a relational database, and it doesn’t perform on AppEngine either. 
One approach I use a lot in this case is an “in-memory join”:
  1. select all authors
  2. build in-memory map of authors (key=authorId, value=author)
  3. iterate over books
  4. get author for book from in-memory map of authors
Objectify ofy = ObjectifyService.begin();
List<Book> books = ofy.query(Book.class).list();

StringBuilder sb = new StringBuilder();

// Load all authors with a single query and index them by id
final List<Author> authors = ofy.query(Author.class).list();
final Map<Long, Author> authorMap = new HashMap<Long, Author>();
for (Author author : authors) {
    authorMap.put(author.getId(), author);
}

for (Book book : books) {
    sb.append(book.getTitle());
    sb.append(": ");

    for (Key<Author> authorKey : book.getAuthorKeys()) {
        // In-memory lookup instead of a DataStore get
        final Author author = authorMap.get(authorKey.getId());
        sb.append(author.getFirstname()).append(" ").append(author.getLastname()).append(", ");
    }

    sb.append("<br>");
}


That seems very counter-intuitive if you’re from the relational world. Why do something in code that the database can do for you? Well, that’s the thing: the database can’t in this case. CPU cycles are relatively cheap on AppEngine, so that’s not really a bottleneck either. And the result can be cached in MemCache: either the “joined” set of books/authors, or just the full list of authors (e.g. if books change more often). 


This doesn’t work if you have millions of authors. You don’t want to (and often can’t) load millions of authors into memory just to link 500 books to their authors. In that case you can use a bulk get. This is a normal get operation, but with multiple keys as arguments. Those objects will be loaded in one batch. The approach would be as follows:
  1. select all books
  2. build set of all required authors for all books
  3. batch get required authors
  4. iterate over books
  5. get author for book from the in-memory map of fetched authors
Objectify ofy = ObjectifyService.begin();
List<Book> books = ofy.query(Book.class).list();

StringBuilder sb = new StringBuilder();

// Collect the keys of only those authors that are actually referenced
Set<Key<Author>> authorKeys = new HashSet<Key<Author>>();
for (Book book : books) {
    authorKeys.addAll(book.getAuthorKeys());
}

// Fetch all required authors in a single batch get
final Map<Key<Author>, Author> authorMap = ofy.get(authorKeys);

for (Book book : books) {
    sb.append(book.getTitle());
    sb.append(": ");

    for (Key<Author> authorKey : book.getAuthorKeys()) {
        final Author author = authorMap.get(authorKey);
        sb.append(author.getFirstname()).append(" ").append(author.getLastname()).append(", ");
    }

    sb.append("<br>");
}

In the graph below you can see the difference in performance is dramatic. For a dataset of 1000 books and 5 authors the first approach takes over 20 seconds, while the other approaches are around 200-300ms.
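
Why the gap is so large becomes clearer when you count round trips to the data store rather than CPU work. The sketch below uses a hypothetical in-memory FakeDatastore (not the AppEngine or Objectify API) that simply counts calls for each of the three approaches. The sample data, with every book referencing two authors, is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a remote data store that counts round trips. */
class FakeDatastore {
    final Map<Long, String> authors = new HashMap<Long, String>();           // authorId -> name
    final Map<String, List<Long>> books = new HashMap<String, List<Long>>(); // title -> authorIds
    int roundTrips = 0;

    Map<String, List<Long>> queryBooks() { roundTrips++; return books; }
    Map<Long, String> queryAuthors() { roundTrips++; return authors; }
    String getAuthor(long id) { roundTrips++; return authors.get(id); }

    Map<Long, String> batchGetAuthors(Set<Long> ids) {
        roundTrips++; // one call, no matter how many ids are requested
        Map<Long, String> result = new HashMap<Long, String>();
        for (Long id : ids) result.put(id, authors.get(id));
        return result;
    }
}

class JoinStrategies {
    /** Approach 1: one get per author reference. */
    static int naive(FakeDatastore ds) {
        ds.roundTrips = 0;
        for (List<Long> authorIds : ds.queryBooks().values()) {
            for (long id : authorIds) ds.getAuthor(id);
        }
        return ds.roundTrips;
    }

    /** Approach 2: load all authors once, then look them up in memory. */
    static int inMemoryJoin(FakeDatastore ds) {
        ds.roundTrips = 0;
        Map<Long, String> all = ds.queryAuthors();
        for (List<Long> authorIds : ds.queryBooks().values()) {
            for (long id : authorIds) all.get(id);
        }
        return ds.roundTrips;
    }

    /** Approach 3: collect the needed ids, then fetch them in one batch. */
    static int batchGet(FakeDatastore ds) {
        ds.roundTrips = 0;
        Map<String, List<Long>> books = ds.queryBooks();
        Set<Long> needed = new HashSet<Long>();
        for (List<Long> authorIds : books.values()) needed.addAll(authorIds);
        Map<Long, String> found = ds.batchGetAuthors(needed);
        for (List<Long> authorIds : books.values()) {
            for (long id : authorIds) found.get(id);
        }
        return ds.roundTrips;
    }

    /** Sample data: bookCount books, each referencing two of authorCount authors. */
    static FakeDatastore sample(int bookCount, int authorCount) {
        FakeDatastore ds = new FakeDatastore();
        for (long a = 0; a < authorCount; a++) ds.authors.put(a, "Author " + a);
        for (int b = 0; b < bookCount; b++) {
            List<Long> ids = new ArrayList<Long>();
            ids.add((long) (b % authorCount));
            ids.add((long) ((b + 1) % authorCount));
            ds.books.put("Book " + b, ids);
        }
        return ds;
    }

    public static void main(String[] args) {
        FakeDatastore ds = sample(1000, 5);
        System.out.println("naive: " + naive(ds));                 // 2001 round trips
        System.out.println("in-memory join: " + inMemoryJoin(ds)); // 2 round trips
        System.out.println("batch get: " + batchGet(ds));          // 2 round trips
    }
}
```

Each round trip to a distributed store carries network latency, so a difference of a few thousand calls versus two dominates the response time even when each individual call is fast.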
Performance gain 6 - De-normalize
In some cases you query two related entities so often that you would be better off de-normalizing the data. In the SQL example at the start of the previous section we could get rid of the join entirely by adding a departmentName field to the employee entity; in the book example, that would mean storing the author names on the Book entity. Is that a better approach? Well, it depends. It’s definitely faster, but you have the overhead of keeping the two copies in sync somehow.
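
The cost of keeping the copies in sync can be sketched as follows. This is plain Java with in-memory collections, not DataStore code, and the class, field and method names (Department, Employee, Directory, renameDepartment) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class Department {
    final long id;
    String name;

    Department(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class Employee {
    final String name;
    final long departmentId;
    String departmentName; // de-normalized copy of Department.name

    Employee(String name, Department department) {
        this.name = name;
        this.departmentId = department.id;
        this.departmentName = department.name;
    }
}

class Directory {
    final List<Employee> employees = new ArrayList<Employee>();

    /** Reading needs no join: the department name is already on the employee. */
    String describe(Employee employee) {
        return employee.name + " (" + employee.departmentName + ")";
    }

    /** Writing is where the cost shows up: every copy must be updated too. */
    int renameDepartment(Department department, String newName) {
        department.name = newName;
        int updated = 0;
        for (Employee employee : employees) {
            if (employee.departmentId == department.id) {
                employee.departmentName = newName;
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Department engineering = new Department(1, "Engineering");
        Directory directory = new Directory();
        directory.employees.add(new Employee("Ann", engineering));
        directory.renameDepartment(engineering, "Platform");
        System.out.println(directory.describe(directory.employees.get(0)));
    }
}
```

Whether this pays off depends on the read/write ratio: a name that is read on every page view but renamed once a year is a good candidate, while frequently changing data is not.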


I hope this article helps in getting applications to run better on AppEngine. It's not hard at all, just different. And you'll get a great platform for it in return.