Adriaan de Jonge provides some tips on configuring Apache and Tomcat to support load balancing J2EE applications across multiple Amazon EC2 instances.
By Adriaan de Jonge of SDB Java, The Netherlands
How do you configure your Java 2 Platform, Enterprise Edition (J2EE) servers to offer the scalability of Amazon EC2 to your applications? If you have the hardware, J2EE servers already offer support for load balancing. However, in practice, this solution isn't always used to its full potential because of hardware costs. Using Amazon EC2, scalability is standard and servers should be set up for load balancing by default. This tutorial explains the basic procedures for using Amazon EC2 to deploy distributed J2EE applications.
Combining Java with the Tomcat web server is a popular approach for shops that use heavy web applications. If you are running a single server, you have a powerful engine, up to a certain number of concurrent visitors. If you receive more visitors, however, you could run into trouble; Tomcat has a bad reputation for performing poorly under pressure. Connections might fail to respond in time, resulting in many timeouts and errors because the maximum connection limit has been surpassed. Although there are alternatives to Tomcat--such as Jetty--that are better at handling stress, they don't solve structural problems if the capacity is too low.
The best way to handle structural capacity shortages is to employ load balancing techniques. This means running several parallel instances of Tomcat, each of which handles part of the traffic. The nice thing about the Amazon EC2 service model is that it lets you start and stop additional servers as needed without having to pay for the hardware when the capacity is not used.
For example, suppose you are running an online store, requiring as much as five times the usual capacity during the Christmas season, and half the capacity during summer. Your requirements might look like this:
You also have these two additional requirements:
What is the simplest possible way to approach this problem? Some people might think it would be to have a graphical user interface (GUI), in which system administrators can simply click to add or remove servers. I must admit that I don't have such a tool, and it wouldn't be simple to invent one for this purpose right now. In my definition of simple, all I need to set things up and modify them is my favorite text editor. You might do this manually the first few times. A major advantage of the text editor approach is that it allows you to script any manual actions, automating the process over time.
When running parallel Tomcat servers, Apache HTTPd is required on top to distribute the workload over the underlying Tomcat instances. The architecture for two or more instances looks like this:
Translating this picture directly to Amazon EC2 instances, the Apache top layer would require its own dedicated Amazon EC2 instance. In practice, it turns out that the resources required by the Apache top layer are of a different proportion than the Tomcat services. The Amazon EC2 services provide an economical solution to these varying system requirements with heterogeneous instance sizing. For the Apache instance, you can use a small instance, and pay $0.10 per hour. For the Tomcat instances, you could use extra large instances, and pay $0.80 per hour. Considering the minimal system resources required by Apache, you could also share an extra large instance with Tomcat on one of the servers. In this case, the physical architecture of the instances looks like this:
Activating your web application running on the Amazon EC2 servers is a matter of pointing your domain name--for example, www.yourdomain.com--to the Amazon EC2 instance running Apache. To do this, you need to configure the Domain Name System (DNS) to tell it which computer belongs to the domain. In many cases, this configuration is accomplished by specifying the IP address of the computer in an A record for the domain. The alternative is to specify an alias, or CNAME, to another host name that points to your computer. Because the Amazon servers already have their own host names, it is easiest to use a CNAME. For more information on A records and CNAME records, visit Zytrax.com.
The Apache instance running on the computer you are pointing to with the CNAME takes care of spreading the load over the underlying Tomcat instances. An important detail of the CNAME setting is the Time To Live (TTL) value you assign to it. Preferably in this case, you'll set this value to a shorter time than usual: one hour, for example, or maybe even less. You will see why as we continue.
Usually, when you change the load balancing configuration in Apache, you are required to restart Apache to activate the new configuration. This restart would be noticeable to at least a few of your customers, so that is not the best way to switch to a new configuration. Starting an additional instance of Tomcat, or shutting one down, is a change in the load balancing configuration. With the CNAME and short TTL, you have a practical way to avoid restarting Apache and still switch over to a new configuration.
If you are running Apache on a small Amazon EC2 instance, you can start up a second small Amazon EC2 instance running Apache with the new load balancing configuration. If you change your DNS settings to let the CNAME point to the second Amazon EC2 instance, your visitors are gradually moved to the second instance. The time this takes should be roughly equal to the TTL value of your CNAME record. So, if this value is an hour, you need to run the two Amazon EC2 instances in parallel for one hour. This would cost you only $0.10, because you are paying for the Amazon EC2 instance by the hour. After this, you can check the access log of the first Amazon EC2 instance to see whether any visitors are still using that instance. If there are no visitors, you can shut down this instance and treat the second Amazon EC2 instance as your Apache node from now on.
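The overlap cost follows directly from the hourly billing described above. A rough sketch of the arithmetic in Python, assuming the $0.10 small-instance rate quoted in this article and assuming that any started hour is billed in full (the function name is made up for illustration):

```python
import math

SMALL_INSTANCE_RATE = 0.10  # USD per hour, the rate quoted in this article


def overlap_cost(ttl_seconds, hourly_rate=SMALL_INSTANCE_RATE):
    """Extra cost of keeping a second Apache instance up for one TTL period.

    Assumes every started hour is billed as a full hour.
    """
    billed_hours = max(1, math.ceil(ttl_seconds / 3600))
    return round(billed_hours * hourly_rate, 2)


if __name__ == "__main__":
    print(overlap_cost(3600))  # one-hour TTL
    print(overlap_cost(900))   # a 15-minute TTL is still billed as one hour
```

Under these assumptions, shortening the TTL below one hour speeds up the switchover but does not reduce the cost of the overlap.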
You can do something similar when running both Apache and Tomcat on an extra large node. Suppose Server1 is running Apache, load balancing over Tomcats on Server1 and Server2. You want to add an additional Tomcat on Server3, the server you just turned on using the Amazon EC2 command-line utilities. There is a simple way to do this without restarting the Apache instance on Server1: Set up the Apache instance on Server2 to balance between Server1, Server2, and Server3. Start Apache on Server2. Then, change your DNS settings to point your domain name to Server2 instead of to Server1. From that point, it should take approximately an hour for all clients to switch from Server1 to Server2. You can check your Apache access logs to see that at some point there are no longer any requests for Server1. At that point, it's safe to reconfigure Apache on Server1 for the next change you want to make. Alternatively, you can set Server1's configuration to match the configuration on Server2 for consistency.
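Checking the old server's access log for remaining traffic, as described above, can be scripted. A minimal sketch in Python, assuming the log carries Apache Common Log Format timestamps such as [10/Oct/2008:13:55:36 +0000]; the function names and the one-hour quiet period are illustrative choices, not part of Apache:

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the bracketed timestamp of a Common Log Format line (assumed format).
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def last_request_time(log_lines):
    """Return the timestamp of the most recent request, or None if there is none."""
    latest = None
    for line in log_lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        if latest is None or when > latest:
            latest = when
    return latest


def is_drained(log_lines, now, quiet_period=timedelta(hours=1)):
    """True if no request arrived within quiet_period before now."""
    latest = last_request_time(log_lines)
    return latest is None or now - latest >= quiet_period


if __name__ == "__main__":
    sample = [
        '203.0.113.7 - - [10/Oct/2008:13:55:36 +0000] "GET /myjavaapp/ HTTP/1.1" 200 2326',
    ]
    now = datetime(2008, 10, 10, 15, 0, 0, tzinfo=timezone.utc)
    print(is_drained(sample, now))
```

In practice you would pass an open log file instead of the sample list. Setting the quiet period to at least the TTL of the CNAME record gives cached DNS entries time to expire before you conclude that the server is idle.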
There are many ways to connect Apache with Tomcat. Over the past few years, native Tomcat connectors have quickly superseded each other, and there have also been options to connect by using the generic HTTP protocol. You will find module names such as mod_jk, mod_jk2, mod_ajp, mod_proxy, mod_rewrite, and other variations. Many online help texts give outdated or contradictory advice, making it hard to choose the proper connector and settings.
Having tried most of the available connection choices, I know that each one has its challenges. It turns out that the option that's the simplest and the most consistent with other similar Apache features is the newest option: mod_proxy_ajp. The mod_proxy_ajp module is similar to mod_proxy_http except that it saves you from providing ProxyPassReverse lines in addition to ProxyPass, which means the AJP setup is easier to maintain than the HTTP setup.
Note: See http://developer.amazonwebservices.com/connect/entry.jspa?entryID=1015 for an Amazon Machine Image (AMI) with Apache and Tomcat preinstalled.
Before setting up Apache, you should start at the bottom: configure Tomcat's [TOMCAT_HOME]/conf/server.xml file. For Tomcat versions up to 5.5, it is wise to start with the example content from server-minimal.xml. Tomcat 6.0 doesn't provide this example, but the content of the default server.xml file is fairly minimal in 6.0. Find the AJP connector declaration:
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
Then, decide whether port 8009 is suitable for your situation. Let's assume you are running only one Tomcat per server instance. In some situations there are reasons to do otherwise, but they'd complicate this explanation.
Now, it's time to change the Apache configuration, [APACHE_HOME]/conf/httpd.conf. First, find the following LoadModule lines and uncomment them:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
Then, at the bottom, add the following:
<Proxy balancer://mycluster>
Order deny,allow
Allow from all
BalancerMember ajp://[SERVER1].amazon.com:8009/myjavaapp
BalancerMember ajp://[SERVER2].amazon.com:8009/myjavaapp
#... as many as you need here ...
</Proxy>
ProxyPass /myjavaapp balancer://mycluster
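Because the balancer block is plain text, it lends itself to the scripted approach mentioned earlier. A minimal sketch in Python that renders the block above from a list of Tomcat host names; the function name and the example host names are hypothetical, and the output simply mirrors the directives shown in this article:

```python
def render_balancer(hosts, app_path="/myjavaapp", cluster="mycluster", port=8009):
    """Render the <Proxy> balancer block and ProxyPass line for the given hosts."""
    lines = [
        f"<Proxy balancer://{cluster}>",
        "Order deny,allow",
        "Allow from all",
    ]
    # One BalancerMember line per Tomcat instance, all on the same AJP port.
    for host in hosts:
        lines.append(f"BalancerMember ajp://{host}:{port}{app_path}")
    lines.append("</Proxy>")
    lines.append(f"ProxyPass {app_path} balancer://{cluster}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder host names; substitute the host names of your own instances.
    print(render_balancer(["server1.example.com", "server2.example.com"]))
```

Adding or removing a Tomcat instance then comes down to changing the host list and writing the result into the configuration of the Apache instance you are about to switch to.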
The last configuration detail is the CNAME in your DNS settings. How you set this depends on your DNS provider. Many larger application service providers give you your own dashboard where you can create A records, MX records, and CNAME records. Please refer to the documentation supplied by your provider for details.
Up to this point, we've only looked at load balancing on a request basis, assuming a stateless protocol. In practice, most web applications store session data that covers a series of requests from a single customer--for example, storing items in a shopping basket.
If these items are stored on the Tomcat instance, they would be lost if the next request is referred to another Tomcat instance. There are several ways to resolve this problem. One way is to configure a jvmRoute attribute in the Engine declaration in a Tomcat server. The jvmRoute attribute is appended to the session ID in the cookie sent to the client. In the next request, the Apache load balancer recognizes the jvmRoute and directs the request to the same Tomcat as the last request. This approach is called a "sticky" session.
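To make the mechanism concrete: with a jvmRoute set, the session ID takes the form of the ID, a dot, and the route name. The following Python sketch illustrates the idea of picking a back end from that suffix. It is a conceptual model only, not Apache's actual implementation, and the route names and host names are hypothetical:

```python
def route_of(session_id):
    """Return the route suffix of a session ID such as 'A1B2C3.server1', or None."""
    _, dot, route = session_id.partition(".")
    return route if dot and route else None


def pick_backend(session_id, backends, fallback):
    """Choose the back end named by the session's route, else the fallback.

    backends maps a jvmRoute name to a back-end address.
    """
    route = route_of(session_id) if session_id else None
    return backends.get(route, fallback)


if __name__ == "__main__":
    backends = {
        "server1": "ajp://server1.example.com:8009",
        "server2": "ajp://server2.example.com:8009",
    }
    print(pick_backend("A1B2C3.server2", backends, backends["server1"]))
```

In a real setup, the fallback corresponds to whichever member the balancer's normal scheduling selects, which is also what applies to a first request that carries no session cookie yet.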
There is one exception, though, in which using a sticky session would go wrong: After shutting down a Tomcat instance, the session data would be lost. The solution is to allow session data to be copied from one Tomcat server to another. This means that, during the programming phase, any object stored in a session should implement the Serializable interface. However, be careful of references to other, unrelated objects, which would require additional configuration that is beyond the scope of this article.
An alternative is to store session data in the back-end database that's shared by all Tomcat instances. This approach simplifies configuration and programming, but might complicate database management. You should make sure this data is cleaned out after sessions expire, and the database should be optimized for handling the data with a short life.
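The cleanup requirement can be illustrated with a small sketch. The example below uses Python's built-in sqlite3 module purely for illustration; the table layout, column names, and function names are assumptions, and a real deployment would use the shared back-end database and its own scheduling mechanism:

```python
import sqlite3
import time


def create_store(conn):
    """Create a hypothetical session table with an expiry time per row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "id TEXT PRIMARY KEY, data BLOB, expires_at REAL)"
    )
    # An index on the expiry column keeps the periodic cleanup cheap.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_sessions_expiry ON sessions (expires_at)"
    )


def save_session(conn, session_id, data, ttl_seconds, now=None):
    """Insert or refresh a session that expires ttl_seconds from now."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO sessions (id, data, expires_at) VALUES (?, ?, ?)",
        (session_id, data, now + ttl_seconds),
    )


def purge_expired(conn, now=None):
    """Delete expired sessions and return how many rows were removed."""
    now = time.time() if now is None else now
    cursor = conn.execute("DELETE FROM sessions WHERE expires_at <= ?", (now,))
    conn.commit()
    return cursor.rowcount
```

Running a purge like this on a schedule keeps the table small, which matters because session rows are written often and live only briefly.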
This article highlights a few aspects of working with AWS. Here are a few more resources available to Java developers to help you learn more.
Here are some web sites using Amazon Web Services and Java:
Adriaan de Jonge is part of a team of Java specialists at SDB Java in The Hague, The Netherlands. His writing career began with a comparison of XForms and Ruby on Rails before he started writing for IBM developerWorks. As a Java developer, he is especially interested in front-end technology, both web-based and client-side. You can reach Adriaan at adriaandejonge@gmail.com.