Ph: 19211710624
skip to main | skip to sidebar

Thursday, September 11, 2008

A First Taste of InvokeDynamic


Greetings, readers!

Over the past couple weeks I've had a few departures from typical JRuby development. I consider it a working vacation. I'm hoping to report on all of it soon, but for now we'll focus on one of the most exciting items: JSR-292, otherwise known as "InvokeDynamic".

I've reported on invokedynamic previously (InvokeDynamic: Actually Useful?), and of course the technical bits of John Rose's blog should be required reading for anyone interested in this stuff. What I'm going to try to do today is give you an inside picture of the pieces of InvokeDynamic and how they fit together. It will be technical, but everyone should be able to follow it. Ready?

The Problem

Any description of a solution must first describe the problem.

As you probably know, Java is a statically-typed language. That means the types of all variables, method arguments, method return values, and so on must be known before runtime. In Java's case, this also means all variable types must be declared explicitly, everywhere. A variable cannot be untyped, and a method cannot accept untyped parameters nor return an untyped value. Types are pervasive.

The problem, put simply, is this: Because Java is the primary language on the JVM, almost all language implementations on the JVM are written in Java. When implementing a statically-typed language, especially one with structure and rules similar to Java, this is not much of a problem. But when implementing a dynamic language that stubbornly refuses to yield type information until runtime, all this static-typing is a real pain in the neck. Of course this is pretty much the same situation when implementing a dynamic language on top of C or C++ or C#, since they're all generally statically-typed languages too. Or is it? An example is in order.
public class Hello {
public static void main(String[] args) {
java.util.List list = new java.util.ArrayList();
for (int i = 0; i < 5; i++) {
String newString = args[0] + i;
list.add(newString);
}
System.out.println(list);
}
}
Here we see a short, reasonably simple snippit of Java code. An ArrayList is constructed, populated with five strings based on the incoming first command-line argument and a numeric iteration count, and then displayed as a string on the console. The type declarations (shown in bold) represent a lot of the visual noise, the "ceremony" that dynamic language fans decry. From a usability perspective, they're both a positive and negative influence; they noise up the code and require more typing, but they also make it trivial to determine the type of a variable (in most cases) or build tools that safely restructure your code (so-called "refactoring"). From a technical perspective, they give the "javac" compiler all the information it needs to produce very clean, optimized bytecode, and they give the JVM itself type information it uses to execute and optimize that bytecode at runtime. Ahh, but what about the bytecode?

If we peel the Java layer away, the situation changes a bit. At the JVM bytecode level, types are still visible, but they're not nearly as prevalent. Here's the same code in bytecode, with the type names again in boldface:
public static void main(java.lang.String[]);
Code:
0: new #2; //class java/util/ArrayList
3: dup
4: invokespecial #3; //Method java/util/ArrayList."&ltinit>":()V
7: astore_1
8: iconst_0
9: istore_2
10: iload_2
11: iconst_5
12: if_icmpge 50
15: new #4; //class java/lang/StringBuilder
18: dup
19: invokespecial #5; //Method java/lang/StringBuilder."&ltinit>":()V
22: aload_0
23: iconst_0
24: aaload
25: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: iload_2
29: invokevirtual #7; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
32: invokevirtual #8; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
35: astore_3
36: aload_1
37: aload_3
38: invokeinterface #9, 2; //InterfaceMethod java/util/List.add:(Ljava/lang/Object;)Z
43: pop
44: iinc 2, 1
47: goto 10
50: getstatic #10; //Field java/lang/System.out:Ljava/io/PrintStream;
53: aload_1
54: invokevirtual #11; //Method java/io/PrintStream.println:(Ljava/lang/Object;)V
57: return
Since not everyone reads JVM bytecode like their native language, a description of these operations is in order.

Java provides what's called an "operand stack" for bytecode it executes. The stack is analogous to registers in a "real" CPU, acting as temporary storage for values against which operations (like math, method calls, and so on) are to be performed. So most JVM bytecode spends its time either manipulating that stack by pushing, popping, duping, and swapping values, or executing operations that produce or consume values. It's a pretty simple mechanism. So then, with a general understanding of the operand stack, lets look at the bytecode itself:
The "load" and "store" instructions are all local variable accesses. "load" retrieves a local variable and pushes it on the stack. "store" pops a value off the stack and stores it in a local variable. The prefix indicates whether the value is an object or "reference" type (denoted by "a") or one of the primitive types (denoted by "i" for integer, "f" for float, and so on). The standard load and store operations take an argument (embedded along with the operation into the bytecode) to indicate which indexed local variable to work with, but there are specialized bytecodes (denoted by a suffixed underscore and digit) for a "compressed" representation of heavily-used low-index variables.
The "invoke" bytecodes are what you might expect: method invocations. Method invocations consume zero or more arguments from the stack and in some cases a receiver object as well. "virtual" refers to a normal call to a non-interface method on an object receiver. "interface" refers to an interface invocation on an object receiver. "static" refers to a static invocation, or one that does not require an object to call against. The "strange quark" of the bunch is "invokespecial", which is used for calling constructors and superclass implementations of methods. You'll notice a couple invokespecials above paired with "new" operations; "new" instantiates the object and "invokespecial" initializes it.
The "const" instructions are what you might guess: they push a constant on the stack. Again, the prefix and suffix denote type and "compressed" opcodes for specific values, respectively.
"aaload" and all "*aload" operations are retrievals out of an array. As with local variables, the first letter indicates the type of the array. Here, the "aaload" is our retrieval of args[0].
"iinc" is an integer increment operation. The arguments are the index of the local variable and how much to increment it by (usually 1).
"if_icmpge" performs a conditional jump after testing whether the second-topmost int on the stack (indicated by the "i" in "icmpge") is greater than or equal to the topmost int on the stack (the >= relationship represented by the "ge" in "icmpge"). This is our "for" loop test i < 5 reversed to act as a loop exit condition rather than a loop continue condition. The looping itself is provided by the "goto" operation further down (yes, the JVM has goto...it's just Java that doesn't have goto).
Finally, we see the "return" instruction, which represents the void return from main. If it were a return of a specific value or object type, it would be preceded by the appropriate type character.

Now the astute reader may already have noticed that other than being specified as reference or primitive types, the opcodes themselves have no type information. Even beyond that, there are no actual variable declarations at the bytecode level whatsoever. The only types we see come in the form of opcode prefixes (as in aload, iinc, etc) and the method signatures against which we execute invoke* operations. The stack itself is also untyped; we push a reference type (aload) one minute and push a primitive type (iload) the next (though values on the stack do not "lose" their types). And when I tell you that the type signatures shown above for each method invocation or object construction are simply strings stuffed into the class's pool of constants...well...now you may start to realize that Java's sometimes touted, oft-maligned static-typing...is just a façade.

The Greatest Trick

Let's dispense with the formality once and for all. The biggest lie that's been spread about the JVM (ok, maybe the biggest after "it's slow") is that it's never going to be a good host for dynamic languages. "But look at Java," people cry, "it's so staticky and rigid; it's far too difficult to implement a dynamic language on top of that!" And in a very naive way, they're partially correct. Writing a language implementation in Java and following Java's rules can certainly make life difficult for a dynamic language implementer. We end up stripping types (making everything Object, since we don't know types until runtime), boxing types (stuffing primitives in carrier objects, to simplify passing them through our Object-only code), and boxing array arguments (since many dynamic languages also have flexible "arities" or numbers of arguments, and others allow optional, "rest", and other special argument types). With each sacrifice we make, we lose many of the benefits static typing provides us, not to mention confounding the JVM's efforts to optimize.

But it's not nearly as bad as it seems. Because much of the rigid, static nature of Java is in the language itself (and not the JVM) we can in many cases ignore the rules. We don't have to declare local variable types. We can juggle items on the stack at will. We can cheat in clever ways, allowing much of normal code execution to proceed with very little type information. In many cases we can get that code to run nearly as well as statically-typed code of twice the size, because the JVM is so dynamic already at its core. JVM bytecode is our assembly, and it's a powerful tool in the right hands.

Unfortunately, on current JVMs, there's one place we absolutely, positively must follow the rules: method invocation.

Know Thyself

Question: In the bytecode above, all invocations came with a formal "signature" representing the type to call against and the types of the method's arguments and return value. If we do not know those types until runtime, and they may be variant even then...how do we support invocation in a dynamic language?

Answer: Very carefully.

Because we are bound to following Java's method invocation rules, the once sunny and clear forecast turns rather cloudy. Every invocation has to be called against a known type. Its arguments must be known types. Its return value must be a known type. Making matters worse, we can't even provide signatures with similar types; the signatures must exactly match the method we intend to invoke. So we understand limitation #1: invocations are statically typed.

There's another way this affects dynamic languages, especially those that may not present normal Java types or that run in an interpreted mode for some part of execution: Invocations must be against real methods on real types. There's simply no way to tell the JVM that instead of calling method W on type X with param Y and return value Z, I want you to enter this interpreter loop; don't mind the types, we'll figure it out for you. Oh no, you have to be part of the Java club and present a normal Java type to get invocation privileges. That's limitation #2: invocations must be against Java methods on Java types.

Adding insult to injury, JVMs even run verification against the bytecode you feed them to make sure you're following the rules. One little mistake and zooop...off to the exception farm you go. It's downright unfair.

The traditional way to get around all this rigidity (a technique used heavily even by normal Java libraries, since everyone wants to bend the rules sometimes) is to abstract out the act of "invoking" itself, usually by creating "Method" objects that do the call for you. And oddly enough, the reflection capabilities of the JVM come into heavy play here. "Method" happens to be one of the types in the java.lang.reflect package, and it even has an "invoke" method on it. Even better, "invoke" returns Object, and accepts as parameters an Object receiver and an array of Object arguments. Can it truly be this easy? Well, yes and no.

Using reflection to invoke methods works great...except for a few problems. Method objects must be retrieved from a specific type, and can't be created in a general way. You can't ask the JVM to give you a Method that just represents a signature, or even a name and a signature; it must be retrieved from a specific type available at runtime. Oh, but that's at runtime, right? We're ok, because we do actually have types at runtime, right? Well, yes and no.

First off, you're ignoring the second inconvenience above. Language implementations like JRuby or Rhino, which have interpreters, often simply don't *have* normal Java types they can present for reflection. And if you don't have normal types, you don't have normal methods either; JRuby, for example, has a method object type that represents a parsed bit of Ruby code and logic for interpreting it.

Second, reflected invocation is a lot slower than direct invocation. Over the years, the JVM has gotten really good at making reflected invocation fast. Modern JVMs actually generate a bunch of code behind the scenes to avoid a much of the overhead old JVMs dealt with. But the simple truth is that reflected access through any number of layers will always be slower than a direct call, partially because the completely generified "invoke" method must check and re-check receiver type, argument types, visibility, and other details, but also because arguments must all be objects (so primitives get object-boxed) and must be provided as an array to cover all possible arities (so arguments get array-boxed).

The performance difference may not matter for a library doing a few reflected calls, especially if those calls are mostly to dynamically set up a static structure in memory against which it can make normal calls. But in a dynamic language, where every call must use these mechanisms, it's a severe performance hit.

Build a Better Mousetrap?

As a result of reflection's poor (relative) performance, language implementers have been forced to come up with new tricks. In JRuby's case, this means we generate our own little invoker classes at build time, one per core class method. So instead of calling through our DynamicMethod to a java.lang.reflect.Method object, boxing argument lists and performing type checks along the way, we're able to create a fast, specialized bit of bytecode that does the trick for us.
public org.jruby.runtime.builtin.IRubyObject call(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject,
org.jruby.RubyModule, java.lang.String, org.jruby.runtime.builtin.IRubyObject);
Code:
0: aload_2
1: checkcast #13; //class org/jruby/RubyString
4: aload_1
5: aload 5
7: invokevirtual #17; //Method org/jruby/RubyString.split:(Lorg/jruby/runtime/ThreadContext;
Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/RubyArray;
10: areturn
Here's an example of a generated invoker for RubyString.split, the implementation of String#split, taking one argument. We pass into the "call" method a ThreadContext (runtime information for JRuby), an IRubyObject receiver (the String itself), a RubyModule target Ruby type (to track the hierarchy during super calls), a String method name (to allow aliased methods to present an accurate backtrace), and the argument. Out of it we get an IRubyObject return value. And the bytecode is pretty straightforward; we prepare our arguments and the receiver and we make the call directly. What would normally be perhaps a dozen layers of reflected logic has been reduced to 10 bytes of bytecode, plus the size of the class/method metadata like type signatures, method names, and so on.

But there's still a problem here. Take a look at this other invoker for RubyString.slice_bang, the implementation of String#slice!:
public org.jruby.runtime.builtin.IRubyObject call(org.jruby.runtime.ThreadContext, org.jruby.runtime.builtin.IRubyObject,
org.jruby.RubyModule, java.lang.String, org.jruby.runtime.builtin.IRubyObject);
Code:
0: aload_2
1: checkcast #13; //class org/jruby/RubyString
4: aload_1
5: aload 5
7: invokevirtual #17; //Method org/jruby/RubyString.slice_bang:(Lorg/jruby/runtime/ThreadContext;
Lorg/jruby/runtime/builtin/IRubyObject;)Lorg/jruby/runtime/builtin/IRubyObject;
10: areturn
Oddly familiar, isn't it? What we have here is called "wastefulness". In order to provide optimal invocation performance for all core methods, we must generate hundreds of these these tiny methods into tiny classes with everything neatly tied up in a bow so the JVM will pretty please perform that invocation for us as quickly as possible. And the largest side effect of all this is that we generate the same bytecode, over and over again, with only the tiniest of changes. In fact, this case only changes one thing: the string name of the method we eventually call on RubyString. There are dozens of these cases in JRuby's core classes, and if we attempted to extend this mechanism to all Java types we encountered (we don't, for memory-saving purposes), there would be hundreds of cases of nearly-complete duplication.

I smell an opportunity. Our first step is to trim all that fat.

Hitting the Wall

Let me tell you a little story.

Little Billy developer wanted to freely generate bytecode. He'd come to recognize the power of code generation, and knew his language implementation was dynamic enough that compiling once would not be optimal. He also knew his language needed to do dynamic invocation on top of a statically-typed language, and needed lots of little invokers.

So one day, Billy's happily playing in the sandbox, building invokers and making "vroom, vroom" sounds, when along comes mean old Polly Permgen.

"Get out of my sandbox, Billy," cried Polly, "you're taking up too much space, and this is *my* heap!"

"Oh, but Polly," said Billy, rising to his feet. "I'm having ever so much fun, and there's lots of room to play on that heap over there. It's oh so large, and there's plenty of open space," he desperately replied.

"But I told you...this is MY heap. I don't want to play over there, because I like playing *right here*." She threw her exceptions at Billy, smashing his invokers to dust. Satisfied by the look of horror on Billy's face, she plopped down right where he had been sitting, and smiled terribly up at him.

Dejected, Billy sulked away and became a Lisp programmer, living forever in a land where data is code and code is data and everyone eats butterscotches and rides unicorns. He was never seen nor heard from again.


This story will be very familiar to anyone who's tried to push the limits of code generation on the JVM. The JVM keeps in memory a large, pre-allocated chunk of reserved space called the "heap". The heap is maintained as a contiguous area of space to allow the JVM's garbage collector to move objects around at will. All objects allocated by the system come out of this heap, which is usually split up into "generations". The "young" generation sees the most activity. Objects that are created and immediately dereferenced (like, abandoned?), never make it out of this generation. Objects that persist longer stick around longer. Some objects live forever and get to the oldest generations, but most objects die an early death. And when they die, their bodies become the grass, and the antelope eat the grass. It's a beautiful circle of life. But why are there no butterscotches and unicorns?

The dirty secret of several JVM implementations, Hotspot included, is that there's a separate heap (or a separate generation of the heap) used for special types of data like class definitions, class metadata, and sometimes bytecode or JITted native code. And it couldn't have a scarier name: The Permanent Generation. Except in rare cases, objects loaded into the PermGen are never garbage collected (because they're supposed to be permanent, get it?) and if not used very, very carefully, it will fill up, resulting in the dreaded "java.lang.OutOfMemoryError: PermGen space" that ultimately caused little Billy to go live in the clouds and have tea parties with beautiful mermaids.

So it is with great reluctance that we are forced to abandon the idea of generating a lot of fat, wasteful, but speedy invokers. And it's with even greater reluctance we must abandon the idea of recompiling, since we can barely afford to generate all that code once. If only there were a way to share all that code and decrease the amount of PermGen we consume, or at least make it possible for generated code to be easily garbage collected. Hmmm.

AnonymousClassLoader

Now it starts to get cool.

Enter java.dyn.AnonymousClassLoader. AnonymousClassLoader is the first artifact introduced by the InvokeDynamic work, and it's designed to solve two problems:
Generating many classes with similar bytecode and only minor changes is very inefficient, wasting a lot of precious memory.
Generated bytecode must be contained in a class, which must be contained in a ClassLoader, which keeps a hard reference to the class; as a result, to make even one byte of bytecode garbage-collectable, it must be wrapped in its own class and its own classloader.

It solves these problems in a number of ways.

First, classes loaded by AnonymousClassLoader are not given full-fledged symbolic names in the global symbol tables; they're given rough numeric identifiers. They are effectively anonymized, allowing much more freedome to generate them at will, since naming conflicts essentially do not happen.

Second, the classes are loaded without a parent ClassLoader, so there's no overprotective mother keeping them on a short leash. When the last normal references to the class disappear, it's eligible for garbage collection like any other object.

Third, it provides a mechanism whereby an existing class can be loaded and slightly modified, producing a new class with those modifications but sharing the rest of its structure and data. Specifically, AnonymousClassLoader provides a way to alter the class's constant pool, changing method names, type signatures, and constant values.
   public static class Invoker implements InvokerIfc {
public Object doit(Integer b) {
return fake(new Something()).target(b);
}
}

public static Class rewrite(Class old) throws IOException, InvalidConstantPoolFormatException {
HashMap constPatchMap = new HashMap();
constPatchMap.put("fake", "real");

ConstantPoolPatch patch = new ConstantPoolPatch(Invoker.class);
patch.putPatches(constPatchMap, null, null, true);

return new AnonymousClassLoader(Invoker.class).loadClass(patch);
}
Here's a very simple example of passing an existing class (Invoker) through AnonymousClassLoader, translating the method name "fake" in the constant pool into the name "real". The resulting class has exactly the same bytecode for its "doIt" method and the same metadata for its fields and methods, but instead of calling the "fake" method it will call the "real" method. If we needed to adjust the method signature as well, it's just another entry in the constPatchMap.

So if we put these three items together with our two invokers above, we see first that generating those invokers ends up being a much simpler affairs. Where before we had to be very cautious about how many invokers we created, and take care to stuff them into their own classloaders (in case they need to be garbage-collected later), now we can load them freely, and we will see neither symbolic collisions nor PermGen leaks. And where before we ended up generating mostly the same code for dozens of different classes, now we can simply create that code once (perhaps as normal Java code) and use that as a template for future classes, sharing the bulk of the class data in the process. Plus we're still getting the fastest invocation money can buy, because we don't have to use reflection.

Who could ask for more?

Parametric Explosion

I could. There's still a problem with our invokers: we have to create the templates.

Let's consider only Object-typed signatures for a moment. Even if we accept that everything's going to be an Object, we still want to avoid stuffing arguments into an Object[] every time we want to make a call. It's wasteful, because of all those transient Object[] we create and collect, and it's slow, because we need to populate those arrays and read from them on the other side. So you end up hand-generating many different methods to support signatures that don't box arguments into Object[]. For example, the many call signatures on JRuby's DynamicMethod type, which is the supertype of all Ruby method objects in a JRuby runtime:
    public abstract IRubyObject call(ThreadContext context, IRubyObject self, RubyModule clazz, 
String name, IRubyObject[] args, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule clazz,
String name, IRubyObject[] args);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, IRubyObject arg3);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, Block block);
public IRubyObject call(ThreadContext context, IRubyObject self, RubyModule klazz, String name, IRubyObject arg1, IRubyObject arg2, IRubyObject arg3, Block block);
What was that I said about wasteful?

And this doesn't even consider the fact that ideally we want to move toward calling methods with *specific types* since any good JVM dynlang will eventually have to call a normal Java method with a non-Object-based signature. Oh, we could certainly generate new versions of "call" into their own little interfaces at runtime, but we'd have to load them, manage them, make sure they can GC, make sure they don't collide with each other, and so on. We end up back where we started, because AnonymousClassLoader is only part of the solution. What we really need is a way to ask the JVM for a lightweight, non-reflected, statically-typed "handle" to a method that's primitive enough for the JVM to treat it like a function pointer.

Hey! Let's call it a MethodHandle! Brilliant!

Method Handles

MethodHandle is the next major piece of infrastructure added for InvokeDynamic. Instead of having to pass around java.lang.reflect.Method objects, which are slower to invoke and carry all that metadata and reflection bulk with them, we can now instead deal directly with MethodHandle, a very primitive reference type representing a specific method on a specific type with specific parameters.

But wait, didn't you say specifics get in the way?

Specifics can get in the way if we're concerned only about invoking dumb dynamic-typed methods that could accept any number of types, as is the case in dynamic languages. Being forced to specify a specific type means that specific type becomes Object, and so all paths must lead to the same generic code. And truly, if MethodHandle was no more than a "detachable method" it wouldn't be particularly useful. But in order to support the more complex call protocols dynamic languages introduce, with their implicit type conversions, dynamic lookup schemes, and "no such method" hooks, MethodHandles are also composable.

Say we have a target method on the Happy type that takes a single String argument.
public class Happy {
public void happyTime(String arg){}
}
We can capture a method handle for this class in one of two ways. We can either "unreflect" a java.lang.reflect.Method object, or we can ask the MethodHandles factory to produce one for us:
MethodHandle happyTimeHandle = MethodHandles.findVirtual(Happy.class, "happyTime", void.class, String.class);
Our new happyTimeHandle is a direct reference to the "happyTime" method. It's statically typed, with a type signature of "(Happy, String)void" (meaning it accepts a Happy argument and a String argument and returns void, since we must include the receiver type). And the code looks very similar to retrieving a java.lang.reflect.Method instance. So if all we're concerned about is calling happyTime on a Happy instance with a String argument, this is basically all there is to it. But that's rarely enough for us dynamic types. No, we need all our "magic" too.

Luckily, MethodHandles also provides a way to adapt and compose handles. Perhaps the simplest adaptation is currying.

Currying a method (and really when we talk about methods here we're talking about functions with a leading receiver argument) means to grab that method reference, stuff a couple values into its argument list, and produce a new method reference that uses those values plus future values you provide at call time to make the target call. In this case, we'll insert a Happy instance we want this handle to always invoke against.
MethodHandle curriedHandle = MethodHandles.insertArgument(happyTimeHandle, new Happy());
The resulting curried handle has a signature of only "(String)void", since we've curried or bound the handle to a specific instance of Happy.

There are also more complicated adaptations. We may need to have what John Rose calls a "flyby" adapter that examines and possibly coerces arguments in the arg list. So we grab a handle to the method representing that logic, attach it to our MethodHandle as a flyby argument adapter, and the resulting handle will perform that adaptation as calls pass through it. We may want to "splat" or "spread" arguments, accepting a variable argument count and automatically stuffing it into an array. MethodHandles.spreadArguments can return a handle that does what we're looking for. Perhaps we need pre and post-call logic, like artificial frame or variable scope allocation. We just represent the logic as simple functions, produce handles for each, and assemble a new MethodHandle that brackets the call. Bit by bit, piece by piece, the complex vagaries of our call protocols can be decomposed into functions, referenced by method handles, and composed into fast, efficient, direct calls. Are we having fun yet?

We haven't even gotten to the coolest part.

Brief History

JSR-292 started out life as a proposal for a new bytecode, "invokedynamic", to accompany the four other "invoke" bytecodes by allowing for dynamic invocation. When it was announced, the early concept provided only for invocation without a static-typed signature. It still required a call to eventually reach a real method on a real type, and it did not provide (or did not specify) a way to alter the JVM's normal logic for looking up what method it should actually invoke. For languages like JRuby and Groovy, which store method tables in their own structures, this meant the original concept was essentially useless: most dynamic languages have "open" types whose methods can be added, removed, and redefined later, so it was impossible to ever present a normal type invokedynamic could call.

It also included nothing to solve the larger problems of implementing a dynamic language on the JVM, problems like the restrictive, over-pedantic rules for loading new bytecode and the limitations and poor performance of reflected methods. It was, in essence, dead in the water. That was mid 2006.

Fast-forward to September of that year. Sun Microsystems, after years of promoting Java as the "one true language" on the JVM, has decided to hire on two open-source developers to work on the JRuby project, a JVM implementation of Ruby, a fairly complex dynamically-typed language. The pair had managed to run the most complicated application framework the Ruby world had to offer, and for the first time in a long time it started to look like directly supporting non-Java languages on the JVM might be a good idea.

Around this time or shortly after, John Rose became the new JSR-292 lead. John was a member of the Hotspot VM team, and among his many accomplishments he listed a fast Scheme VM and a bytecode-based regular expression engine. But perhaps most importantly, John knew Hotspot intimately, knew that the its core was simply *made* for dynamic languages, and had a pretty good idea how to expose that core. So it began.

InvokeDynamic

The culmination of InvokeDynamic is, of course, the ability to make a dynamic call that the JVM not only recognizes, but also optimizes in the same way it optimizes plain old static-typed calls. AnonymousClassLoading provides a piece of that puzzle, making it easy to generate lightweight bits of code suitable for use as adapters and method handles. MethodHandle provides another piece of that puzzle, serving as a direct method reference, allowing fast invocation, argument list manipulation, and functional composability. The last piece of the puzzle, and probably the coolest one of all, is the bootstrapper. Now it's time to blow your mind.

There's two sides to a invocation. There's the call, presumably a chunk of bytecode doing an "invoke" operation, and there's the target, the actual method it invokes. Under normal circumstances, targets fall into three categories: static methods, virtual methods, and interface methods. Because two of these types--static and virtual--are explicitly bound to a specific method, they can be verified when the method's bytecode is loaded. If the type or method do not exist, the bytecode is considered invalid and an error is thrown. However the third type of target, an interface method, may have any number of targets at runtime, potentially targets that have not even been loaded into the system yet. So the JVM gives invokeinterface operations much more flexibility. Flexibility we can exploit.

Much of the JVM's optimizations come from it treating what looks like normal code as "special". Hotspot, for example, has a large list of "intrinsic" methods (like System.arraycopy or Object.getClass), methods that it always tries to inline directly into the caller, to ensure they have the maximum possible performance and locality. It turns out that adding bytecodes to the JVM isn't really even necessary, if you have the freedom to define special new behaviors based solely on the methods, types, or operations in play. And apparently, the Hotspot team has that freedom.

Because of the low probability of a new bytecode being approved, and because it really wasn't necessary, John introduced a "special" new interface type called java.dyn.Dynamic. Dynamic does not include any methods, nor is it intended as a marker interface. You can implement it if you like, but its real purpose comes when paired with the invokeinterface bytecode. For you see, under InvokeDynamic, an invokeinterface against Dynamic is not really an interface invocation at all.
public class SimpleExample {
public Object doDynamicCall(Object arg) {
return arg.myDynamicMethod();
}
}
Here's a simple example of code that won't compile. Because the incoming argument's type is Object, we can only call methods that exist on Object. "myDynamicMethod" is not one of them. The hypothetical bytecode for that call, if it did compile, would look roughly like this:
public java.lang.Object doDynamicCall(java.lang.Object);
Code:
0: aload_1
1: invokevirtual #3; //Method java/lang/Object.myDynamicMethod:()V
4: areturn

In its current state, this bytecode would not even load, because the verifier would see there's no myDynamicMethod on Object and kick it out. But we want to make a dynamic call, right? So let's transform that virtual invocation into a dynamic one:
public java.lang.Object doDynamicCall(java.lang.Object);
Code:
0: aload_1
1: invokeinterface #3; //Method java/dyn/Dynamic.myDynamicMethod:()V
4: areturn
Hooray! We've set up a dynamic call! Wasn't that easy?

We've made it an interface invocation, so the JVM won't kick it out and it loads happily. And we've provided our "special" marker, the java.dyn.Dynamic interface, so the JVM knows not to do a normal interface invocation. That wraps up the call side...myDynamicMethod is now recognized as an "invokedynamic". But what about the target? How do we route this call to the right place?

Now we finally get to the bootstrap process. In order to make dynamic languages truly first-class citizens on the JVM, they need to be able to actively participate in method dispatch decisions. If method lookup and dispatch is forever only in the hands of the JVM, it's a much more complicated process to do fast dynamic calls. Believe me, I've tried. So John came up with the idea of a "bootstrap" method.

The bootstrap method is simply a piece of code that the JVM can call when it encounters a dynamic invocation. The bootstrap receives all information about the call directly from the JVM itself, makes a decision about where that call needs to go, and provides that information to the JVM. As long as that decision remains valid, meaning future calls are against the same type and method tables don't change, no further calls to the bootstrap are needed. The JVM proceeds to link and optimize the dynamic call as if it were a normal static-typed invocation. Here's what this looks like in practice:
public class DynamicInvokerThingy {
public static Object bootstrap(CallSite site, Object... args) {
MethodHandle target = MethodHandles.findStatic(
MyDynamicTarget.class,
"myDynamicMethod",
MethodType.make(Object.class, site.type().parameterArray()));
site.setTarget(target);

return MyDynamicTarget.myDynamicMethod(args[0]);
}
}
This is a simple bootstrap method for the "myDynamicMethod" call above. When "myDynamicMethod" is invoked, the JVM "upcalls" into this bootstrap method. It provides the original argument list (with the receiver first, since invokeinterface always takes a receiver), and a CallSite. CallSite is a representation of the "site" in the original code where the dynamic invocation came from, and it has a type just like a method handle. In this case, the CallSite.type() is "(Object)Object" since we always pass along the receiver (the one Object argument) and the method returns an Object.

In this case, we're just going to bind any dynamic call coming into this bootstrap to the same method, which might look like this:
public class MyDynamicTarget {
public static Object myDynamicMethod(Object receiver) { ... }
}
Notice that now we actually have a formal argument for the receiver; because we have bound an instance invocation (invokeinterface) to a static method (invokestatic) the receiver becomes the first argument to the call. Back in bootstrap, we retrieve a handle to this method and set it into the CallSite. At this point the CallSite has everything it needs for the JVM to link future calls straight through. As a final step, we perform the invocation ourselves to provide a return value for the current call. And the bootstrap method will never be called for this particular call site again...because the JVM links it straight through.

As I alluded to earlier, we can also invalidate a CallSite by clearing its target. Clearing the target tells the JVM the originally linked method is no longer the right one, please bootstrap again. We're basically a direct participant in the JVM's method selection and linking process. So cool.

Oh, there's one more bit of magic I should show you: how to get from point A to point B, i.e. how to tell the JVM which bootstrap method to use. Remember our SimpleExample class above? The one we coaxed into doing dynamic invocation? Here's how we point SimpleExample's dynamic calls at our bootstrap method...we just this code add to SimpleExample itself:
    static {
Linkage.registerBootstrapMethod(
SimpleExample.class,
MethodHandles.findStatic(DynamicInvokerThingy.class, "bootstrap", Linkage.BOOTSTRAP_METHOD_TYPE));
}
Linkage is another class from InvokeDynamic, responsible primarily for wiring up dynamic-invoker classes to their bootstrap logic. Here we're registering a bootstrap method for SimpleExample by creating a handle to DynamicInvokerThingy.bootstrap. Linkage has a convenient BOOTSTRAP_METHOD_TYPE constant we can use for the type. And that's basically it. What could be easier?

Status

InvokeDynamic is a work in progress. It first successfully performed a dynamic invocation on August 26, 2008 - International InvokeDynamic Day. John had given me wind of the "imminent" event, so I had already started to look at wiring it into JRuby. Ultimately, it was into the first week of September before I got all the bits together and working, but after a day or two of back-and-forth emails, a bug report (I found a bug! I'm helping!), and a little JRuby refactoring, I managed to successfully wire InvokeDynamic directly into JRuby's dispatch process! Such excitement! The code is already in JRuby's trunk, and will ship with JRuby 1.1.5 (though it obviously will be disabled on JVMs without InvokeDynamic).

Now before you go off and get all excited, you should know that I wired it up in probably the most primitive way possible. A lot of the method-adapting logic isn't fully implemented yet, and what is there isn't wired into Hotspot's JIT, so it's still early days. But I'm absolutely giddy when I think about the possibilities of MethodHandles alone, much less the entire InvokeDynamic package all together. It gives me shivers just thinking about it.

(Before you think I'm some kind of crackpot, imagine how much work it's taken to get JRuby running as well as it is today and how much work each tiny incremental improvement requires. The idea that the next round of *major* improvements will be a simple matter of functionally decomposing JRuby's core--something we've wanted to do all along--is pure butterscotches and unicorns.)

And there's also the sobering fact that at best this would be a Java 7 feature; there's no possibility of backporting it other than as an emulation layer. So production users looking for InvokeDynamic-enabled JRuby are going to have to be ambitious or at least wait for Java 7...and that's assuming we're able to get the JSR approved and included (though I'm going to do whatever I can to make that happen).

But at the end of the day, make no mistake: The JVM is going to be the best VM for building dynamic languages, because it already is a dynamic language VM. And InvokeDynamic, by promoting dynamic languages to first-class JVM citizens, will prove it.

More Information

If you'd like to read more about InvokeDynamic, here's a few resources:

The JSR-292 JCP page has a link to the draft document about InvokeDynamic. It's starting to get a little aged now but the general concepts are all there. A good read.

The JRuby SVN repository already contains the early InvokeDynamic work I've done. Look for the classes InvokeDynamicInvocationCompiler and InvokeDynamicSupport, both referenced from StandardASMCompiler. And feel free to email or stop into #jruby on FreeNode IRC if you have questions.

The Multi-Language VM page can get you started with John Rose's InvokeDynamic patches, along with some other oddities like JVM continuations and something called "quid". And you'll need a good walkthrough on building OpenJDK, so try Volker Simonis's OpenJDK instructions for now. Unfortunately the MLVM bits only work on Linux and Solaris builds of OpenJDK at the moment; that will change in the future.

Update: I can't believe I forgot to do my final plug for the JVM Languages Summit, which is coming up at the end of this month. I believe there's still a few seats open. If you're in the SF bay area or feel like taking a trip, the slate of talks is going to be awesome. You will hear John Rose talk about InvokeDynamic, me talk about JRuby past and future, and lots, lots more. There's even a couple Microsofties coming down and a Parrot presentation. Great fun!

And as always, feel free to contact me, comment on this blog, or look me up on IM or IRC. I'm keen to see InvokeDynamic put through its paces all the way through its specification and development process, and I could use some help.

Thank you for your time!

Friday, September 05, 2008

The Elephant

I'll make this a short one.

I was just having a conversation with a friend, a Rubyist whose opinion I respect, who clued me in that he really hates when JRuby users use Java libraries with little or no Ruby syntactic sugar. He hates that there's a better chance every day that Java-related technologies will enter his world. That he's going to have to fix someone's Java-like Ruby. He lamented the lack of decent wrapper libraries that hide "the Java insanity", that are just bare-metal shims over the Java classes they call. He expressed his frustration that JRuby being successful will mean he's going to have to deal with Java. He doesn't want to *ever* have to do that.

And he said it's our fault.

I've heard variations of this from other key Rubyists too. There's a lot of hate and angst in the Ruby community. Many of them are Java escapees, who long ago decided they couldn't tolerate Java as a language or were fed up dealing with some of the many failed libraries and development patterns it has spawned. Some of them are C escapees who've never quite been able to let go of C, be it for performance reasons or because of specific libraries they need. Some of them have been Rubyists longer than anything else (or maybe just longer than anyone else), and see themselves as the purists, the elite, the Ivory Tower, keepers of all that's good in the Ruby world and judge, jury, and executioner for all that's bad. In the end, however, there's one thing these folks share in common.

They think JRuby is a terrible idea.

Of course it's not everyone. I think the general Ruby populace still looks at JRuby as an interesting project...for Java developers. Or maybe just as a gateway to bring people into the community. A growing minority of folks, however, have managed to move beyond prejudices against Java to make new tools, applications, and libraries using JRuby that might not otherwise have been possible. And some folks are simply ecstatic about JRuby's potential.

Why is JRuby such a polarizing issue?

I don't see this in the Python community, for example, which might surprise some Rubyists. Pythonistas seem to have positively embraced both IronPython and Jython. There's no side-chatter at the conferences about the evils of anything with a J in it. There's no mocking slides, no jokes at Jython or IronPython developers' expense. No "Python elite" cliques actively working to shut Jython or IronPython out, or to discourage others from considering them. The community as a whole--Guido included--seems to be genuinely thankful for implementation diversity. Even if one of them does have a J in it.

What's different about these two communities? Why?

I work on JRuby. For the past 3-4 years, it has been my passion. There's been pain and there's been triumph: compatibility hassles; performance numbers steadily increasing; rewriting subsystems I swore I'd never touch like IO and Java integration. Over the past two years, I've put in four years' worth of work, writing compilers, rewriting JRuby's runtime, rewriting whole subsystems, speaking at conferences, staying up late nights (frequently ALL night) helping users on the JRuby IRC channel or mailing lists, and hacking, hacking, hacking almost all day, every day. For what? Because I want to infect the JRuby community with a new and more virulent strain of Java? Because I don't know any better?

I work on JRuby because I love Ruby and I honestly believe JRuby is one of the best things ever to happen to Ruby. JRuby takes a decade of Java dogma and turns it on its head. JRuby isn't about Java, it's about taking the best of the Java platform and using it to improve Ruby. It's about me and others working relentlessly, writing Java so you don't have to. It's about giving Ruby access to one of the best VMs around, to one of the largest collections of libraries in the world, to a pool of talented engineers who've written this stuff a dozen times over. Sure there's crap in the Java world. Sure the Java elite took power in the late 90s and started to jam a bunch of nonsense down our throats. Sure the language has aged a bit. That's all peripheral. JRuby makes it possible to filter out and take advantage of the good parts of the Java world without writing a single line of Java.

Tell me that's not a good idea.

I sympathize with my friend...I really do. I've not only seen a lot of really bad Ruby code come out of JRubyists, I've created some of it. Writing good code is hard in any language, but writing Ruby code that meets the Ivory Tower's standards is like trying to decipher J2EE specifications. If I have to listen to some speaker meditate on what "beautiful code" means one more time I think I'm going to kill someone. Yes, beauty is important. I have my idea of beautiful code and you have yours, and there may be a nexus where the two meet. But tearing into people who are trying to learn Ruby, trying to move away from Java, doing the best they can to meet the Ivory Tower's standards of "beauty"...well that's just mean. And it doesn't have to be that way. "Beauty" doesn't have to be Ruby's "Enterprise".

JRuby doesn't mean Java any more than MRI means C, Ironruby means C#, or Rubinius means C++ and LLVM. JRuby, like the other implementations, is a tool, an enabler, an alternative. JRuby does many things extremely well and others poorly, just like the other implementations. It's bringing new people into Ruby, and for that we should be thankful. It's pushing the boundaries of what you can do with Ruby, and for that we should be thankful. It's not about Java...it's about learning from the successes and mistakes of the past and using that knowlege to push Ruby forward.

So what do we do about JRuby users that start writing Java code in Ruby? We teach them. We help them. We don't slap a scarlet J on their chest and run them out of town. What do we do about shim layers over Java libraries? We build a layer on top of that shim that better exercises Ruby's potential, or we help build a new wrapper to replace the old. That's what Nick Sieger did with Warbler. That's what the Happy Campers are doing with Monkeybars and Jeremy Ashkenas did with Ruby-Processing. More and more people are recognizing that JRuby isn't a threat, doesn't represent the old world, doesn't mean Java...it means empowerment, it means standing on the shoulders of giants, and never having to leave Ruby.

I guess what it really comes down to is this:

The next time someone tries to cut down JRuby, tries to convince you it's a bad idea, to avoid it, to stay away from the evils of Java; the next time someone tears into a library author who hasn't learned the best way to utilize Ruby; the next time someone complains about a library that doesn't lend itself to reimplementation on the C-based implementations, doesn't hide the fact that it's wrapping Java code; the next time someone tries to convince you that JRuby is going to hurt the Ruby community...you tell them to remember this:

JRuby is not going away. More people try JRuby every day. As long as Rubyists who know "the way", who have learned how to create beautiful APIs and DSLs, who serve as the stars, the leaders of the Ruby community, setting standards for others to follow...as long as those people try to marginalize JRuby, treat it like a pariah, or convince others to do the same...

...it will only get worse.

Tuesday, September 02, 2008

A Few Thoughts on Chrome

I'm just reading through the comic and thought I'd jot down a few things I notice as I go. Take them for what they're worth, high-level opinions.

Browsers single-threaded? Maybe 10 years ago. I routinely have a CPU-heavy JS running in Firefox and it doesn't stop me working in other tabs. Threading is hard. Let's go shopping. Or use processes. So now when I have 50 tabs open, it won't just be 50 idle tabs, it will be 50 idle processes all holding on to their resources, not sharing, not pooling. Hmm. The majority of sites I view don't have copious amounts of JS running constantly. In response to events, sure. On load, sure. But not sitting there churning away. So a process per tab for pages that are rendered once and executed once seems a little wasteful. Browsers eat more memory because of memory fragmentation, eh? Memory management is hard. Let's go shopping and use processes. Or use a compacting memory manager, eh? What year is this? Even without GC it's not difficult to use opaque handles instead of pointers so you can juggle memory around as needed. Perhaps this is the over-optimization of the C/C++ crowd still living large. "I can't afford to dereference ONE MORE POINTER to get at my data! I've got to be direct-fucking-to-the-metal!" Not particularly looking forward to seeing 50 Chrome processes in my process lists. The ability to see individual pages as processes and kill them separately does sound intriguing; not sure it's worth every damn tab being a process though. Fuzz testing for browsers is a great idea. Zed Shaw and I talked about doing something similar for Ruby implementations...a sort of "smart fuzzing" that sends them parsable but random input. I don't know why more testing setups don't fuzz. It will be interesting to see V8 versus TraceMonkey. Hopefully my minders will see there's some value in revisiting Rhino and bringing it up to date (it hasn't had any major work done in a long time, and could be a *lot* faster). Pattern-based code-generation behind the scenes is also good. Kresten Krab Thorup demonstrated something similar with Ruby on JVM where instance variables (and I think, method dispatch tables) could be promoted to temporary classes with real fields at runtime, making them a lot faster. I've prototyped something similar in JRuby, but until Java 7 it's expensive to generate and load lots of throwaway bytecode.
Ahh, now they start talking about using better garbage collection technology. Perhaps the folks who decided processes were the only way to efficiently manage cross-tab memory shoulda had a V8? Dragging tabs between windows sounds pretty cool. Except that I actually find multiple browser windows to be a nuisance most of the time (and no, not because I can't drag tabs...because I generally browse everything full-screen and would have to search for the right window). Besides, I can drag tabs in Firefox already. I suppose the benefit in Chrome is they wouldn't have to reload. Do I care?
I just dragged the Chrome comic to another window to test it and then spent 30 seconds trying to figure out where I dragged the Chrome comic. Multiple browser windows = fail. I'm glad browser address completion has been made "really compelling" finally. Finally someone makes local browser caching smart! I'd say probably 50% of the pages I visit never ever change, like online documentation. I never want to have to go to the net to view them if I don't have to, and if I'm not connected...dammit, just give me what's in the cache! On the other hand, there's privacy concerns too. Hopefully I can have full control over what is and is not cached and an easy way to flush it. I can imagine someone borrowing my browser for a moment and stumbling onto a non-public page I've cached. Opening an MRU page in a new tab is fine, but I'm almost always using the keyboard to open a tab. Hopefully this isn't going to get in the way of my immediately typing a place to go, which is usually why I open a tab. Ahh, there we go, an "incognito tab" for private browsing. Seems like a good way to go. Not sure I'm seeing the keyboard angle well-understood here. Hopefully that's not the case. As much as possible, I avoid the mouse, even when browsing. Attention to the keyboard crowd would go a long way, especially in geekier communities. Example: keyboard support in most Google apps is bafflingly arbitrary. Maybe that's a limitation of JS or the "old world" browsers. Maybe not. Still baffling. I hope there's going to be support for Java in Chrome. Seems like it would be a big win, since it already can share some data across processes (like the the class libraries), misbehaving applets wouldn't impact the rest of the process (which is admittedly a good case for process-based isolation...maybe use the presence of heavy JS, plugins, applets as the indication to process-isolate?), and I'd wager Google could do a cleaner job of integrating it (without it being intrusive) than other browsers have so far. Plus since plugins are already being offloaded to a separate process, that could be a single JVM kept warm (or separate JVMs, for more memory use but better isolation and sandboxing). Java ought to be one of the best-behaved plugin citizens...done right.
I wonder how long it will be until Chrome gets Google API updates (like Gears) before the other browsers. I don't buy the "we just want to make the user experience all unicorns and lollipops" thing. There's a business motivation for Chrome. More ad exposure? A first-class deployment for Google APIs so more people write for them so more people use Chrome so ads are easier to channel? Ahh yes...we want them to be open standards. Maybe they will, maybe they won't. If they don't...here's where it gets squirrely...it's open source! Anyone can take what they want and put it in their own browser, right? Yeah, you and the other 100 plugins that only work in one type of browser. How quickly do they get gobbled up by others? You do realize it takes some dedicated resources to "take what you want" and maintain it, not to mention keeping it up to date with the original. A commitment to always supporting all browsers seems to be much harder once you've made a monetary commitment to building your own browser. Left hand, meet right hand. I won't ask the "why didn't you just help Firefox" question, since it's obvious and there's a million reasons why someone starts a new project. But I will ask why this project to "help all browsers become more powerful" is sprung on the world a day before the beta. There's a desire for exclusivity here or I'll eat my hat. Unicorns and lollipops would have required opening the project months ago, so "all browsers" could benefit from it *as it progressed* and contribute *as it progressed*. Open is not "open once we're ready to beta our product because we think it's nearing completion", it's "we're working on it now and want everyone to benefit from it as we move forward."
Overall I'm sure it will be a really excellent browser that I may or may not use. Partially because Google's client app support is basically nonexistent for OS X (how long has Google Talk been promised for OS X?), and partially because...I dunno...I feel better using a browser not designed and controlled by an ad-funded megapower. I'd rather not allow Google to control the vertical AND the horizontal. Not that I have anything against Google in general. This blog is hosted on Google-powered Blogger, I use Google for search pretty much exclusively, and I host services for my personal domain headius.com on Google as well. But it seems disingenuous to say Chrome is supposed to "help everyone" and yet nobody gets to see any of it until they're in beta. I guarantee Firefox folks would have started integrating portions as soon as the code were opened up, which of course would have taken some steam out of the eventual announcement. Try to look past the fireworks and bluster, folks.

And I reserve the right to completely flip any of these opinions after the beta is released this evening...though I probably won't, since I can't run it (Windows only).

Update: I borrowed a friend's Windows machine to give Chrome a 15-minute try. Here's my additional 15-minute thoughts, so take them at face value.
I hate installers that download additional stuff. When my friend and I first downloaded, we proceeded to walk away from the interwebs for some offline fiddling. Only then did we discover we didn't have the whole thing. Love the interface. It's almost too clean. Unfortunately I can guarantee I'd immediately clutter it up with bookmarks I need (want) one-click access to. Such is life. But I like starting from a blank slate first, rather than starting from a cluttered mess. Very fast, true to form. It also feels snappier than Firefox, but Firefox isn't known for it's blazing speed. Maybe feels faster than Safari. Of course, young products are always fast. Not quite a process per tab. It seems like tabs manually opened and presumably tabs opened from bookmarks do get their own processes. Tabs opened via right-clicking on the link and choosing to open in a new tab stay in-process. That's a reasonable way to reduce process load, since a good portion of the tabs I open are from existing pages like Reader or News. Unfortunately, this also means that a good portion of the tabs I open are not subject to the sandboxing or isolation touted as a key feature of Chrome. The developer pages list a Mozilla Java plugin wrapper among the included technologies. Yay! I did not get a chance to try it out (Windows rapidly started to piss me off again).
I picked a tab at random to forcibly kill and the entire browser disappeared. I guess I picked the right one.
This is a little worrisome:


All told it's about what I expected. Very clean, very polished, very young. I'm sure a lot of these issues will get shaken out during the beta. I do hope there's a way to turn off a tab-per-process, or I can't see myself ever wanting to run Chrome. I can see myself gathering several dozen Chrome processes in the course of a week. Process isolation for other aspects (like JS or plugins), no worries. I'm looking forward to an OS X version, and from looking at the Chrome developer pages it sounds like that isn't too far off. Perhaps marketroids pressured the team to get out a Windows version first, so they could make some headlines. Damn marketroids.

And as regular readers of my blog will tell you, I can be a bit salty about young up-and-coming technologies with a chip on their shoulders. Ignore that.

Sunday, August 31, 2008

A Duby Update

I haven't forgotten about my promise to post on FFI and MVM APIs, but I've been taking occasional breaks from JRuby (heaven forbid!) to get some time on on Duby.

What Is Duby?

Duby, for those who have not heard of it, is my little toy language. It's basically a Ruby-like static-typed language with local type inference and not a whole lot of bells and whistles. The goal for Duby (which is most definitely a working name...it will probably change), is to provide the all the best parts of Ruby syntax people are familiar with, but add to it:

Written all in Ruby (and obviously the eventual plan would be to port it to Duby) Backend-agnostic (JVM is obviously my focus, but nothing stops someone from building an LLVM or CLR typer+compiler)
Minimally-intrusive static typing (Duby infers types from arguments and calls, like Scala) Features missing from Java (Duby treats module inclusion and class reopening like defining extension methods in C#) A very pluggable type inference engine (Duby's "Java" typer is currently all of about 20 lines of code that plugs into the engine) A pluggable compiler (Duby will allow adding compiler plugins to turn str1 + str2 into concatenation or StringBuffer calls, for example) Absolutely no runtime dependencies (I want compiled output from Duby to be *done*, so there's no runtime library to lug along so it works; once compiled, there are no dependencies on Duby)
The primary motivation for Duby was originally to have a Ruby-like language we could use to implement parts of JRuby. The JVM, it's type system, and its bytecode are all actually really, really nice. There's a huge collection of libraries, fast primitives (that get optimized into faster native code), and a bytecode specification that's pretty easy for almost anyone to grok. But there's a problem: Java.

Now don't get me wrong, Java is a great language, but it's become a victim of its own success. While other languages have been adding niceities like local type inference, structural typing, closures and extension methods, Java's stayed pretty much the same. There have been no major language changes to Java since Java 5's additions of generics, enums, annotations, varargs, and a few other miscellaneous odds and ends. Meanwhile, I live in a torturous world between Ruby and Java, where I'd love to write everything in Ruby (too slow, too inexact for "stable layer" code), but must instead write everything in Java (with associated syntactic baggage and 20th-century language design). And so necessity dictates taking a new approach.
def fib(a => :fixnum)
if a < 2
a
else
fib(a - 1) + fib(a - 2)
end
end

puts fib(45)
So here we see an example I've shown in previous posts, but with a twist. First off, it's almost exactly the same as the equivalent Ruby code except for the argument type declaration => :fixnum. The rest of the script is all vanilla Ruby, even down to the puts call at the bottom.

But all is not as it seems. This is not Ruby code.

The type declaration in the method def looks natural, but it's not actually parseable Ruby. I had Tom Enebo hack a change in to JRuby's parser (off by default) to allow that syntax. Duby originally had a syntax something like this, so it could be parsed by any Ruby impl:

def fib(a)
{a => :fixnum}
...
end
But it's obviously a lot uglier.

New Type Inference Engine

Ignoring Java for a moment we can focus on the type inference happening here. Originally Duby only worked with explicit Java types, which obviously meant it would only ever be useful as a JVM language. The use of those types was also rather ugly, especially in cases where you just want something "Fixnum-like". So even though I had a working Duby compiler several months ago, I took a step back to rewrite it. The rewrite involved two major changes:
Rather than build Duby directly on top of JRuby's AST I introduced a transformation phase, where the Ruby AST goes in and a Duby AST comes out. This allowed me to build up a structure that more accurately represented Duby, and also has the added bonus that transformers could be built from any Ruby parse output (like that of ruby_parser). Instead of being inextricably tied to the JVM's types and type system, I rewrote the inference engine to be type-system independent. Basically it uses all symbolic and string-based type identifiers, and allows wiring in any number of typing plugins, passing unresolved nodes to them in turn. Two great example plugins exist now: a Math plugin that knows how to handle mathematical and boolean operators against numeric types like :fixnum (it knows :fixnum < :fixnum produces a :boolean, for example), and a Java plugin that knows how to reach out into Java's classes and methods to infer return types for calls out of Duby-space.
The result of this is that up to the point of compilation, there's no explicit dependency on any named set of types, any type system, or any backend. Here's the output from the type inference engine running against that fib script above:
* [Simple] Learned local type under MethodDefinition(fib) : a = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "<" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "<" Type(fixnum) on Type(fixnum) = Type(boolean)
* [AST] [Call] resolved!
* [AST] [Condition] resolved!
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "fib" Type(fixnum) on Type(script)
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "fib" Type(fixnum) on Type(script)
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Method type for "+" on not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "+" on not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "+" on
* [Simple] Deferring inference for Call(+)
* [Simple] Deferring inference for If
* [Simple] Learned method fib (Type(fixnum)) on Type(script) = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [AST] [PrintLine] resolved!
* [Simple] Entering type inference cycle
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "+" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "+" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] [Cycle 0]: Inferred type for Call(+): Type(fixnum)
* [AST] [If] resolved!
* [Simple] [Cycle 0]: Inferred type for If: Type(fixnum)
* [Simple] Inference cycle 0 resolved all types, exiting
There's a lot going on here. You can see the MathTyper and JavaTyper both getting involved here. Since there's no explicit Java calls it's mostly the MathTyper doing all the heavy lifting. The inference stage progresses as follows:
Make a first pass over all AST nodes, performing trivial inferences (declared arguments, literals, etc). Add each unresolvable node encountered to an unresolved list. Cycle over that list repeatedly until either all nodes have resolved or the list's contents do not change from one cycle to the next.
It's a fairly brute-force inference mechanism, certainly not on the scale of a full Hindley-Milner inference. Honestly I find the type declaration in the argument list to be far more helpful than harmful, though, and I'm not smart enough to write my own H/M engine at the moment.

Include Java

Anyway, back to Duby. Here's a more complicated example that makes calls out to Java classes:
import "System", "java.lang.System"

def foo
home = System.getProperty "java.home"
System.setProperty "hello.world", "something"
hello = System.getProperty "hello.world"

puts home
puts hello
end

puts "Hello world!"
foo
Here we see a few new concepts introduced.

First off, there's an import. Unlike in Java however, import knows nothing about Java types; it's simply associating a short name with a long name. The syntax (and even the name "import") is up for debate...I just wired this in quickly so I could call Java code.

Second, we're actually making calls that leave the known Duby universe. System.getProperty and setProperty are calls to the Java type java.lang.System. Now the Java typer gets involved. Here's a snippit of the inference output for this code:
* [Simple] Method type for "getProperty" Type(string) on Type(java.lang.System meta) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xaf17c7>
* [Math] Method type for "getProperty" Type(string) on Type(java.lang.System meta) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1eb717e>
* [Java] Method type for "getProperty" Type(string) on Type(java.lang.System meta) = Type(java.lang.String)
* [AST] [Call] resolved!
The Java typer is fairly simple at the moment. When asked to infer the return type for a call, it takes the following path:
Attempt to instantiate known Java types for the target and arguments. It makes use of the list of "known types" in the typing engine, augmented by import statements. If those types successfully resolve to Java types... It uses Java reflection APIs (through JRuby) to look up a method of that name with those arguments on the target type. From this method, then, we have a return type. The return type is reduced to a symbolic name (since again, the rest of the type inference engine knows nothing of Java types) and we consider it a successful inference. If the method does not exist, we temporarily fail to resolve; it may be that additional methods are defined layer that will support this name and argument list.
So in this case, the "System" type has been associated with the "java.lang.System" class (the "meta" in the type reference means it's a class reference rather than an instance reference), and the argument type "string" resolves to "java.lang.String". So java.lang.System.getProperty(java.lang.String) resolves as returning java.lang.String, and we have successfully resolved the call.

Next Steps

I see getting the JVM backend and typer working as two major milestones. Duby already can learn about Java types anywhere in the system and can compile calls to them. But mostly what works right now is what you see above. There's no support for array types, instantiating objects, or hierarchy-aware type inference. There's no logic in place to define new types, static methods, or to define or access fields. All this will come in time, and probably will move very quickly now that the basic plumbing is installed.

I'm hoping to get a lot done on Duby this month while I take a "pseudo-vacation" from constant JRuby slavery. I also have another exciting project on my plate: wiring JRuby into the now-functional "invokedynamic" support in John Rose's MLVM. So I'll probably split my time between those. But I'm very interested in feedback on Duby. This is real, and I'm going to continue moving it forward. I hope to be able to use this as my primary language some day soon.

Update: A few folks asked me to post performance numbers for that fib script above. So here's the comparison between Java and Duby for fib(45).

Java source:
public class FibJava {
public static int fib(int a) {
if (a < 2) {
return a;
} else {
return fib(a - 1) + fib(a - 2);
}
}

public static void main(String[] args) {
System.out.println(fib(45));
}
}
Java time:
➔ time java -cp . FibJava


real 0m13.368s
user 0m12.684s
sys 0m0.154s 1134903170
Duby source:
def fib(a => :fixnum)
if a < 2
a
else
fib(a - 1) + fib(a - 2)
end
end

puts fib(45)
Duby time:
➔ time java -cp . fib


real 0m12.971s
user 0m12.687s
sys 0m0.112s 1134903170
So the performance is basically identical. But I prefer the Duby version. How about you?

Sunday, August 24, 2008

Zero to Production in 15 Minutes

There still seems to be confusion about the relative simplicity or difficulty of deploying a Rails app using JRuby. Many folks still look around for the old tools and the old ways (Mongrel, generally), assuming that "all that app server stuff" is too complicated. I figured I'd post a quick walkthrough to show how easy it actually is, along with links to everything to get you started.

Here's the full session, in all its glory, for those who just want the commands:

~/work ➔ java -Xmx256M -jar ~/Downloads/glassfish-installer-v2ur2-b04-darwin.jar
~/work ➔ cd glassfish
~/work/glassfish ➔ chmod a+x lib/ant/bin/*
~/work/glassfish ➔ lib/ant/bin/ant -f setup.xml
~/work/glassfish ➔ bin/asadmin start-domain
~/work/glassfish ➔ cd ~/work/testapp
~/work/testapp ➔ jruby -S gem install warbler
~/work/testapp ➔ jruby -S gem install activerecord-jdbcmysql-adapter
~/work/testapp ➔ vi config/database.yml
~/work/testapp ➔ warble
~/work/testapp ➔ ../glassfish/bin/asadmin deploy --contextroot / testapp.war
Now, on to the full walkthrough!

Prerequisites

JRuby 1.1.3 or higher

None of the steps in the main walkthrough require JRuby, since Warbler works fine under other Ruby implementations. But if you want to install and test against the JDBC ActiveRecord adapters, JRuby's the way. And in general, if you're deploying on JRuby, you should probably test and build on JRuby as well. Go to www.jruby.org under "Download!" and grab the latest 1.1.x "bin" distribution. I link here JRuby 1.1.3 tarball and JRuby 1.1.3 zip for your convenience. Download, unpack, put in PATH. That's all there is to it.

Java 5 or higher

Most systems already have a JVM installed. Here's some links to OpenJDK downloads for various platforms, in case you don't already have one.
Windows, Linux, Solaris: Download the JDK directly from Sun's Java SE Downloads page. I typically download the JDK (Java Development Kit) because I find it convenient to have Java sources, compilers, and debugging tools available, but this walkthrough should work with the JRE (Java Runtime Environment) as well. Linux and Solaris users should also be able to use their packaging system of choice to install a JDK.
OS X: The 32-bit Intel macs can't run the Apple Java 6, so you'll want to look at the Soylatte Java 6 build for OS X to get the best performance. A small warning...it doesn't have Cocoa-based UI components, so it will use X11 if you start up a GUI app. BSDs: FreeBSD users should check the FreeBSD Java downloads page. I believe there's a port for FreeBSD and package/port for OpenBSD but I couldn't dig up the details. We have had users on both platforms, though, so I know they work fine. Others: There's basically a JDK for just about every platform, so if you're not on one of these just do a little digging. All you need to know is that it needs to be a full Java 5 or higher implementation.
Rails 2.0 or higher

Hopefully by now most of you are on a 2.x version of Rails. This walkthrough will assume you've got Rails 2.0+ installed. If you're using JRuby, it's a simple "jruby -S gem install rails" or if you've got JRuby's bin in PATH, "gem install rails" should do the trick. Note that the Warbler (described later) should work in any Ruby implementation, since it's just a packager and it includes JRuby.

Step One: The App Server

The words "Application Server" are terrifying to most Rubyists, to the point that they'll refuse to even try this deployment model. Of course, the ones that try it usually agree it's a much cleaner way to deploy apps, and generally they don't want to go back to any of the alternatives.

Much of the teeth-gnashing seems to surround the perceived complexity of setting a server up. That was definitely the case 5 years ago, but today's servers have been vastly simplified. For this walkthrough, I'll use GlassFish, since it's FOSS, fast, and easy to install.

I'm using GlassFish V2 UR2 (that's Version 2, Update Release 2) since it's very stable and by most accounts the best app server available, FOSS or otherwise. Not that I'm biased or anything. At any rate, it's hard to argue with the install process.

1. Download from the GlassFish V2 UR2 download page. The download links start about halfway down the page and are range from 53MB (English localization) to 93MB (Multilanguage) in size.

2. Run the GlassFish installer. The .jar file downloaded is an executable jar containing the installer for GlassFish as well as GlassFish itself. The -Xmx specified here increases the memory cap for the JVM from its default 64MB to 256MB, since the archive gets unpacked in memory.
~/work ➔ java -Xmx256M -jar ~/Downloads/glassfish-installer-v2ur2-b04-darwin.jar
glassfish
glassfish/docs
glassfish/docs/css
glassfish/docs/figures
...
glassfish/updatecenter/README
glassfish/updatecenter/registry/SYSTEM/local.xml
installation complete
~/work ➔
Before the unpack begins, the installer will pop up a GUI asking you to accept the GlassFish license.
Read the license or not...it's up to you. But to accept, you need to at least pretend you read it and scroll the license to the bottom.
The installer will proceed to unpack all the files for GlassFish into ./glassfish.

3. Run the GlassFish setup script. In the unpacked glassfish directory, there are two .xml files: setup.xml and setup-cluster.xml. Most users will just want to use setup.xml here, but if you're interested in clustering several GlassFish instances across machine, you'll want to look into the clustered setup. I won't go into it here.

The unpacked glassfish dir also contains Apache's Ant build tool, so you don't need to download it. If you already have it available, your copy should work fine, and the chmod command below--which sets the provided Ant's bin scripts executable--would be unnecessary. If you're on Windows, the bin scripts are bat files, so they'll work fine as-is.

Two items to note: you should probably move the glassfish dir where you want it to live in production, and you should run the installer with the version of Java you'd like GlassFish to run under. Both can be changed later, but it's better to just get it right the first time.
~/work ➔ cd glassfish
~/work/glassfish ➔ chmod a+x lib/ant/bin/*
~/work/glassfish ➔ lib/ant/bin/ant -f setup.xml
Buildfile: setup.xml

all:
[mkdir] Created dir: /Users/headius/work/glassfish/bin

get.java.home:

setup.init:
...
jar-unpack:
[unpack200] Unpacking with Unpack200
[unpack200] Source File :/Users/headius/work/glassfish/lib/appserv-cmp.jar.pack.gz
[unpack200] Dest. File :/Users/headius/work/glassfish/lib/appserv-cmp.jar
[delete] Deleting: /Users/headius/work/glassfish/lib/appserv-cmp.jar.pack.gz
...
do.copy.unix:
[copy] Copying 1 file to /Users/headius/work/glassfish/config
[copy] Copying 1 file to /Users/headius/work/glassfish/bin
[copy] Copying 1 file to /Users/headius/work/glassfish/bin
...
create.domain:
[exec] Using port 4848 for Admin.
[exec] Using port 8080 for HTTP Instance.
[exec] Using port 7676 for JMS.
...
BUILD SUCCESSFUL
Total time: 29 seconds
~/work/glassfish ➔
4. Start up your GlassFish server. It's as simple as one command now.
~/work/glassfish ➔ bin/asadmin start-domain
Starting Domain domain1, please wait.
Log redirected to /Users/headius/work/glassfish/domains/domain1/logs/server.log.
Redirecting output to /Users/headius/work/glassfish/domains/domain1/logs/server.log
Domain domain1 is ready to receive client requests. Additional services are being started in background.
Domain [domain1] is running [Sun Java System Application Server 9.1_02 (build b04-fcs)] with its configuration and logs at: [/Users/headius/work/glassfish/domains].
Admin Console is available at [http://localhost:4848].
Use the same port [4848] for "asadmin" commands.
User web applications are available at these URLs:
[http://localhost:8080 https://localhost:8181 ].
Following web-contexts are available:
[/web1 /__wstx-services ].
Standard JMX Clients (like JConsole) can connect to JMXServiceURL:
[service:jmx:rmi:///jndi/rmi://charles-nutters-computer.local:8686/jmxrmi] for domain management purposes.
Domain listens on at least following ports for connections:
[8080 8181 4848 3700 3820 3920 8686 ].
Domain does not support application server clusters and other standalone instances.

~/work/glassfish ➔
Congratulations! You have installed GlassFish. Simple, eh?

A few tips for using your new server:
There's a web-based admin page at http://localhost:4848 where the admin login is admin/adminadmin by default. You'll want to change that password. Select "Application Server" on the left and then "Administrator Password" along the top. Poke around the admin console to get a feel for the services provided. You won't need any of them for the rest of this walkthrough, but you might want to dabble some day. And if you want to set up a connection pool later on (which ActiveRecord-JDBC supports) this is where you'll do it. Most folks will probably want to set up init scripts to ensure GlassFish is launched at server startup. That's outside the scope of this walkthrough, but it's pretty simple. I'll update this page (and it's equivalent on the JRuby Wiki) once I know more. GlassFish works just fine as a standalone server, but many users will want to proxy it through Apache or another web server. Again, this is outside the scope of this walkthrough, but it should be as simple as configuring a virtual host or a set of matching URLs to hit the GlassFish server at port 8080 (which is the default port for web applications). For apps I'm running, however, I just use GlassFish.
Step 2: Package your App

This step is made super-trivial by Nick Sieger's Warbler. It includes JRuby itself and provides a simple set of commands to package up your app, add a packaging config file, and more. In this case, I'll just be packaging up a simple Rails app.

Note that Warbler works just fine under non-JRuby Ruby implementations, since it's all Ruby code. But again, if you're deploying with JRuby, it's probably a good idea to test and build with JRuby as well.
~/work ➔ jruby -S gem install warbler
Successfully installed warbler-0.9.10
1 gem installed
Installing ri documentation for warbler-0.9.10...
Installing RDoc documentation for warbler-0.9.10...
~/work ➔ cd testapp
~/work/testapp ➔ ls .
README Rakefile app config db doc lib log public script test tmp vendor
~/work/testapp ➔ warble
jar cf testapp.war -C tmp/war .
~/work/testapp ➔
And that's essentially all there is to it. You will get a .war file containing your app, JRuby, Rails, and the Ruby standard library. This one file is now a deployable Rails applications, suitable for any app server, any OS, and any platform without a recompile. The target server doesn't even have to have JRuby or Rails installed.

Step 3: Deploy your Application

There's two ways you can deploy. You can either go to the Admin Console web page, select "Web Applications" from the "Applications" category on the left, and deploy the file there, or you can just use GlassFish's command-line interface. I will demonstrate the latter, and I'm providing the optional contextroot flag to deploy my app at the root context.
~/work/testapp ➔ ../glassfish/bin/asadmin deploy --contextroot / testapp.war
Command deploy executed successfully.
~/work/testapp ➔
That's it? Yep, that's it! If we now hit the server at port 8080, we can see the app is deployed.
Step 4: Tweaking

There's a few things you can do to tweak your deployment a bit. The first would be to generate a warble.rb config file and adjust settings to suit your application.
~/work/testapp ➔ warble config
~/work/testapp ➔ head config/warble.rb
# Warbler web application assembly configuration file
Warbler::Config.new do |config|
# Temporary directory where the application is staged
# config.staging_dir = "tmp/war"

# Application directories to be included in the webapp.
config.dirs = %w(app config lib log vendor tmp)
...
In this file you can set the min/max number of Rails instances you need, additional files and directories to include, additional gems and libraries to include, and so on. The file is heavily commented, so you should have no trouble figuring it out, but otherwise the Warbler page on the JRuby Wiki is probably your best source of information. And since a lot of people ask how many instances they should use, I'll provide a definitive answer: it depends. Try the defaults, and scale up or down as appropriate. Hopefully with Rails 2.2 this will no longer be needed, as is the case for Merb and friends.

The other tweak you'll probably want to look into is using the JDBC-based ActiveRecord adapters instead of the pure Ruby versions (or the C-based versions, if you're migrating from the C-based Ruby impls). This is generally pretty simple too. Install the JDBC adapter for your database, and tweak your database.yml. Here's the commands on my system:
~/work/testapp ➔ jruby -S gem install activerecord-jdbcmysql-adapter
Successfully installed jdbc-mysql-5.0.4
Successfully installed activerecord-jdbcmysql-adapter-0.8.2
2 gems installed
Installing ri documentation for jdbc-mysql-5.0.4...
Installing ri documentation for activerecord-jdbcmysql-adapter-0.8.2...
Installing RDoc documentation for jdbc-mysql-5.0.4...
Installing RDoc documentation for activerecord-jdbcmysql-adapter-0.8.2...
~/work/testapp ➔ vi config/database.yml
~/work/testapp ➔
And the modified database.yml file:
...
# This was changed from "adapter: mysql"
production:
adapter: jdbcmysql
encoding: utf8
...
Now repackage ("warble" command again), redeploy, and you're done!

Conclusion

Hopefully this walkthrough clears up some confusion around JRuby on Rails deployment to an app server. It's really a simple process, despite the not-so-simple history surrounding Enterprise Application Servers, and GlassFish almost makes it fun :)

Sunday, August 17, 2008

Q/A: What Thread-safe Rails Means

There's been a little bit of buzz about David Heinemeier Hansson's announcement that Josh Peek has joined Rails core and is about to wrap up his GSoC project making Rails finally be thread-safe. To be honest, there probably hasn't been enough buzz, and there's been several misunderstandings about what it means for Rails users in general.

So I figured I'd do a really short Q/A about what effect Rails thread-safety would have on the Rails world, and especially the JRuby world. Naturally there's some of my opinions reflected here, but most of this should be factually correct. I trust you will offer corrections in the comments.

Q: What does it mean to make Rails thread-safe?

A: I'm sure Josh or Michael Koziarski, his GSoC mentor, can explain in more detail what the work involved, but basically it means removing the single coarse-grained lock around every incoming request and replacing it with finer-grained locks around only those resources that need to be shared across threads. So for example, data structures within the logging subsystem have either been modified so they are not shared across threads, or locked appropriately to make sure two threads don't interfere with each other or render those data structures invalid or corrupt. Instead of a single database connection for a given Rails instance, there will be a pool of connections, allowing N database connections to be used by the M requests executing concurrently. It also means allowing requests to potentially execute without consuming a connection, so the number of live, active connections usually will be lower than the number of requests you can handle concurrently.

Q: Why is this important? Don't we have true concurrency already with Rails' shared-nothing architecture and multiple processes?

A: Yes, processes and shared-nothing do give us full concurrency, at the cost of having multiple processes to manage. For many applications, this is "good enough" concurrency. But there's a down side to requiring as many processes as concurrent requests: inefficient use of shared resources. In a typical Mongrel setup, handling 10 concurrent requests means you have to have 10 copies of Rails loaded, 10 copies of your application loaded, 10 in-memory data caches, 10 database connections...everything has to be scaled in lock step for every additional request you want to handle concurrently. Multiply the N copies of everything times M different applications, and you're eating many, many times more memory than you should.

Of course there are partial solutions to this that don't require thread safety. Since much of the loaded code and some of the data may be the same across all instances, deployment solutions like Passenger from Phusion can use forking and memory-model improvements in Phusion's Ruby Enterprise Edition to allow all instances to share the portion of memory that's the same. So you reduce the memory load by about the amount of code and data in memory that each instance can safely hold in common, which would usually include Rails itself, your static application code, and to some extent the other libraries loaded by Rails and your app. But you still pay the duplication cost for database connections, application code, and in-memory data that are loaded or created after startup. And you still have "no better" concurrency than the coarse-grained locking since Ruby Enterprise Edition is is just as green-threaded as normal Ruby.

Q: So for green-threaded implementations like Ruby, Ruby EE, and Rubinius, native threading offers no benefit?

A: That's not quite true. Thread-safe Rails will mean that an individual instance, even with green threads, can handle multiple requests at the same time. By "at the same time" I don't mean concurrently...green threads will never allow two requests to actually run concurrently or to utilize multiple cores. What I mean is that if a given request ends up blocking on IO, which happens in almost all requests (due to REST hits, DB hits, filesystem hits and so on), Ruby will now have the option of scheduling another request to execute. Put another way, removing the coarse-grained lock will at least improve concurrency up to the "best" that green-threaded implementations can do, which isn't too bad.

The practical implication of this is that rather than having to run a Rails instance for every process you want to handle at the same time, you will only have to run a certain constant number of instances for each core in your system. Some people use N + 1 or 2N + 1 as their metric to map from cores (N) to the number of instances you would need to effectively utilize those cores. And this means that you'd probably never need more than a couple Rails instances on a one-core system. Of course you'll need to try it yourself and see what metric works best for your app, but ultimately even on green-threaded implementations you should be able to reduce the number of instances you need.

Q. Ok, what about native-threaded implementations like JRuby?

A. On JRuby, the situation improves much more than on the green-threaded implementations. Because JRuby implements Ruby threads as native kernel-level threads, a Rails application would only need one instance to handle all concurrent requests across all cores. And by one instance, I mean "nearly one instance" since there might be specific cases where a given application bottlenecks on some shared resource, and you might want to have two or three to reduce that bottleneck. In general, though, I expect those cases will be extremely rare, and most would be JRuby or Rails bugs we should fix.

This means what it sounds like: Rails deployments on JRuby will use 1/Nth the amount of memory they use now, where N is the number of thread-unsafe Rails instances currently required to handle concurrent requests. Even compared to green-threaded implementations running thread-safe Rails, it willl likely use 1/Mth the memory where M is the number of cores, since it can parallelize happily across cores with only "one" instance.

Q: Isn't that a huge deal?

A: Yes, that's a huge deal. I know existing JRuby on Rails users are going to be absolutely thrilled about it. And hopefully more folks will consider using JRuby on Rails in production as a result.

And it doesn't end at resource utilization in JRuby's case. With a single Rails instance, JRuby will be able to "warm up" much more quickly, since code we compile and optimize at runtime will immediately be applicable to all incoming requests. The "throttling" we've had to do for some optimizations (to reduce overall memory consumption) may no longer even be needed. Existing JDBC connection pooling support will be more reliable and more efficient, even allowing connection sharing from application to application as well as across instances. And it will put Rails on JRuby on par with other frameworks that have always been (probably) thread-safe like Merb, Groovy on Grails, and all the Java-based frameworks.

Naturally, I'm thrilled. :)

Thursday, August 07, 2008

'Twas Brillig

It's been a long time since I posted last. I figured it was time to get at least a basic update out to folks.

JRuby's been clicking along really well. Perhaps a little too well. Our bug tracker counts 476 open bugs and climbing, even while we're trying to periodically sweep through them. We're slipping behind, but there's a silver lining: every other bug is filed by someone new. So there's happiness in slavery.

Since my last post, a lot has happened. We pushed out JRuby 1.1.3 around the middle of July, which boasted a whole bunch of new awesome. Vladimir Sizikov posted a really nice rundown of the changes.

My favorite item has to be the improved interpreter performance, which now is clearly faster than MRI's interpreter. So whether you're running interpreted or compiled in JRuby, you should be getting pretty solid straight-line performance.

There were also dozens upon dozens of compatibility fixes, an upgrade to RubyGems 1.2, and lots of other miscellaneous changes to make JRuby a little friendlier and easier to use. It's nice to be able to focus on that kind of stuff now that compatibility is mostly a done deal (ok, high 90th percentile...but pretty darn good).

So then...what's on the slate for the next JRuby release?

Making the Leap?

We had originally been talking about JRuby 1.1.3 being the last planned release in the 1.1 line. It had reached a really solid level of stability and performance, people were putting it in production for all sorts of apps, and we were generally very happy with it. "Last planned release" doesn't mean we wouldn't continue to do maintenance releases as needed...it just means we weren't going to do day-to-day development against the 1.1 line.

The primary reason for this is a number of large-scale projects we'd been putting off. For example, the "Java Integration" (JI) subsystem of JRuby has long been a serious thorn in our sides. Based off probably a dozen different people's contributions over nearly eight years, it had become an impenetrable maze of code, half in Java, half in Ruby, but all clearly in the "suck" column. And I'm not saying anything negative about the contributors who wrote all that suckage (which includes me)...they all made valiant attempts to improve the situation. But a dozen 5% solutions had led us down an ever-more-terrifying rabbit hole. We faced the ultimate decision: fix or rewrite?

We were all set to rewrite. I'd already started an experiment I called "MiniJava", a mostly code-generated replacement for JI that boasted dynamic dispatch speed no slower than Ruby to Ruby and static dispatch speed (from Java into Ruby) only about two times slower than Java calling Java. There were many levels of awesome there, but one serious problem: no tests.

JRuby's JI layer has evolved over a long period of time, and has been worked on by a mix of TDD fans and non-fans. There have also been numerous attempts to rewrite portions of the system, usually ending far short of intentions and with no new tests for functionality added along the way. As a result, any attempt to replace the JI layer would be an exercise in pain: even *we* didn't know how it all works. It's probably not exaggerating to say that existing test cases covered less than half of JI functionality, and that the current JRuby team probably understood less than half of JI's implementation. Not a great place to start from.

Faced with the certainty of pain and knowing, workaholic that I am, that a large portion of the rewrite would probably fall on my shoulders, I decided to give it one more go. I decided to attempt an in-place refactoring of JRuby's Java Integration layer, tens of thousands of lines of spaghetti Java and Ruby code.

Redemption

I've had false starts before. The compiler had at least two partial attempts that failed to go anywhere. I've rewritten the JRuby interpreter several times. Hell, I can't count the number of times I tried to refactor JRuby's IO subsystem before finally succeeding. But JI...man, that is some seriously heinous code. So I had to start small. How about something I was already intimately familiar with: method dispatch.

puts "Measure bytelist appends (via Java integration)"
5.times {
puts Benchmark.measure {
sb =