Embedding Clojure

There is some information out there on embedding Clojure in Java, but it isn’t the easiest to find, and the examples don’t tend to come with explanations, so… here is yet another!

Let’s take a silly example and say we want to embed Clojure as a validation language on something, so that it looks something like this:

public class Thing
{
    private int num = 0;
    
    @Validate("(> num 0)")
    public void setNum(@Name("num") Integer num) {
        this.num = num;
    }
    
    
    @Validate("(< first second)")
    public void setInOrder(@Name("first") Integer first,
                           @Name("second") Integer second) {
        this.num = first + second;
    }
}

We want the validation function expressed in the @Validate annotation to be invoked on every call to the method, binding each parameter to its @Name. That is, for the second one, we want to ensure that first is less than second, and so forth. Because the validation runs on every invocation of the validated method, it needs to be really fast.

While fairly contrived, and rather absurd, it makes a nice example :-)
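The two annotations are never shown in this post, so here is a minimal sketch of what they might look like. The definitions are my assumption; the one thing that is certain is that they need RUNTIME retention, since the default retention would hide them from reflection. AnnotatedThing and AnnotationLookup are hypothetical names used only for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Assumed definition: carries the Clojure expression to evaluate.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Validate {
    String value();
}

// Assumed definition: the symbol a parameter is bound to inside the expression.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.PARAMETER)
@interface Name {
    String value();
}

// Hypothetical annotated class, mirroring the example above.
class AnnotatedThing {
    @Validate("(> num 0)")
    public void setNum(@Name("num") Integer num) { }

    public void unchecked() { }
}

class AnnotationLookup {
    // Returns the expression attached to the named method, or null if there is none.
    static String expressionOf(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName) && m.isAnnotationPresent(Validate.class)) {
                return m.getAnnotation(Validate.class).value();
            }
        }
        return null;
    }
}
```

Calling AnnotationLookup.expressionOf(AnnotatedThing.class, "setNum") hands back the expression string, which is all the rest of the machinery needs to get started.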

What we’d like to do is hold a reference to an otherwise anonymous Clojure function (we don’t want to pollute the global namespace) and invoke it on every method call with some kind of method interceptor.

We can create the Clojure function reference with something like:

public IFn define(String func) throws Exception {
  String formish = String.format("(fn [val] (true? %s))", func);
  return (IFn) clojure.lang.Compiler.load(new StringReader(formish));
}

/* ... */

IFn fn = define("(> val 0)");

assertTrue((Boolean) fn.invoke(7));

The Clojure compiler is (inconveniently, in Java 6) named Compiler and provides a handy load(Reader) method which will read and evaluate the text it is given, returning whatever it evaluates to. In this case we get back a function which wraps our validation expression in a test for true. In this example, our passed-in value has a hard-coded name, val, which is unfortunate, but can be worked around.

We can invoke this function directly via one of its invoke methods; it has a ton of overloads for different argument counts.

This approach will generate a Java class (well, a .class anyway) implementing our function.

To wrap behavior of a class, rather than an interface, and in a performant way, we’ll break out the ever-scary-but-awesome CGLIB and create a runtime extension of the class being validated. CGLIB is fast, but you pay for that with some gnarly low-level-feeling hackery. Not as low as ASM, though :-)
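For contrast, here is a rough sketch of the reflection-based alternative that ships with the JDK, java.lang.reflect.Proxy. This is not what the post uses. It can only implement interfaces, and every call funnels through a single InvocationHandler, which is exactly the dynamic dispatch we are trying to avoid. NumHolder, SimpleNumHolder, and PositiveCheck are hypothetical names, and the check is hard-coded in Java rather than driven by Clojure.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical interface: a JDK proxy cannot extend a concrete class like Thing.
interface NumHolder {
    void setNum(Integer num);
    int getNum();
}

class SimpleNumHolder implements NumHolder {
    private int num = 0;
    public void setNum(Integer num) { this.num = num; }
    public int getNum() { return num; }
}

class PositiveCheck implements InvocationHandler {
    private final NumHolder target;

    PositiveCheck(NumHolder target) { this.target = target; }

    // Every proxied call lands here and is dispatched by method name at runtime.
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (method.getName().equals("setNum") && ((Integer) args[0]) <= 0) {
            throw new IllegalArgumentException("Failed validation!");
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the target's own exception
        }
    }

    static NumHolder wrap(NumHolder target) {
        return (NumHolder) Proxy.newProxyInstance(
                NumHolder.class.getClassLoader(),
                new Class<?>[] { NumHolder.class },
                new PositiveCheck(target));
    }
}
```

PositiveCheck.wrap(new SimpleNumHolder()) gives back something that behaves like the validated object, but only through the interface, and with a name check on every single call.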

Our object factory looks like

public <T> T build(Class<T> type) throws Exception {
    Enhancer e = new Enhancer();
    e.setSuperclass(type);
    List<Callback> callbacks = new ArrayList<Callback>();
    callbacks.add(NoOp.INSTANCE);
    final Map<String, Integer> callback_mapping = new HashMap<String, Integer>();
    int count = 0;
    for (Method method : type.getDeclaredMethods()) {
        if (method.isAnnotationPresent(Validate.class)) {
            callbacks.add(new Handler(method));
            callback_mapping.put(method.getName(), ++count);
        }
        else {
            callback_mapping.put(method.getName(), 0);
        }
    }
    e.setCallbacks(callbacks.toArray(new Callback[callbacks.size()]));
    e.setCallbackFilter(new CallbackFilter()
    {
        public int accept(Method method) {
            Integer i = callback_mapping.get(method.getName());
            if (i == null) {
                return 0;
            }
            else {
                return i;
            }
        }
    });
    return (T) e.create();
}

Which is pretty gnarly. Basically, for methods without the @Validate annotation, it provides a NoOp callback, which passes straight through to the parent class; for methods with the annotation, it delegates to a special Callback. We do some callback filter hackery to avoid dynamic dispatch at runtime (like a reflection proxy would do), allowing CGLIB to generate a method body that invokes our handler directly. None of this is really what we are interested in. The Handler has the good stuff, so let’s see it.

private static class Handler implements MethodInterceptor
{
    private final IFn fn;
    private int[] boundParamOffsets;
    private final Method method;

    private Handler(Method m) throws Exception {
        final Validate v = m.getAnnotation(Validate.class);
        Annotation[][] panno = m.getParameterAnnotations();
        ArrayList<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();
        for (int i = 0; i < panno.length; i++) {
            Annotation[] param_annotations = panno[i];
            for (Annotation a : param_annotations) {
                if (a instanceof Name) {
                    names.add(((Name) a).value());
                    counts.add(i);
                }
            }
        }
        final StringBuilder args = new StringBuilder();
        for (String name : names) {
            if (name != null) {
                args.append(name).append(" ");
            }
        }
        String form = format("(fn [%s] (false? %s))",
                                                args.toString(),
                                                v.value());
        fn = (IFn) load(new StringReader(form));
        this.boundParamOffsets = new int[counts.size()];
        Class[] arglets = new Class[this.boundParamOffsets.length];
        for (int i = 0; i < this.boundParamOffsets.length; i++) {
            arglets[i] = Object.class;
            this.boundParamOffsets[i] = counts.get(i);
        }
        method = fn.getClass().getDeclaredMethod("invoke", arglets);
    }

    public Object intercept(Object o, 
                            Method method, 
                            Object[] objects, 
                            MethodProxy proxy) throws Throwable {
        Object[] args = new Object[boundParamOffsets.length];
        for (int i = 0; i < boundParamOffsets.length; i++) {
            args[i] = objects[boundParamOffsets[i]];
        }
        final Boolean bad = (Boolean) this.method.invoke(fn, args);
        if (bad) {
            throw new IllegalArgumentException("Failed validation!");
        }
        return proxy.invokeSuper(o, objects);
    }
}

YIKES! Okay, this is where it gets ugly, though it is a darned nice example of why macros are handy… if your language supports ’em. Java doesn’t, so it’s ugly.

We’ll step through it, though. The constructor wants to build three things: the Clojure function (fn), an array of offsets into the parameter list for the named parameters (boundParamOffsets), and the Java Method for the correct invoke overload on the Clojure function, imaginatively named method, so we can invoke it via reflection. (Okay, we could optimize one step further and create concrete invoker classes which do it without reflection, but it’s getting late.)

The first chunk of the constructor finds the relevant annotations and builds up a list of their names, in the order they appear, as well as the offsets, in that same order. Luckily, we are going to define a wrapper function which binds them, so we can control the order.

    final Validate v = m.getAnnotation(Validate.class);

    Annotation[][] panno = m.getParameterAnnotations();
    ArrayList<String> names = new ArrayList<String>();
    List<Integer> counts = new ArrayList<Integer>();
    for (int i = 0; i < panno.length; i++) {
        Annotation[] param_annotations = panno[i];
        for (Annotation a : param_annotations) {
            if (a instanceof Name) {
                names.add(((Name) a).value());
                counts.add(i);
            }
        }
    }
    final StringBuilder args = new StringBuilder();
    for (String name : names) {
        if (name != null) {
            args.append(name).append(" ");
        }
    }

At the end, we build up a list of the names in the form in which they’ll be embedded into our function template:

    String form = format("(fn [%s] (false? %s))",
                                            args.toString(),
                                            v.value());
    fn = (IFn) load(new StringReader(form));

This looks a lot like the earlier example of function definition, the main difference being that we are templating in the names of the arguments now, so that they match what is expected.

The last bit of the constructor,

    this.boundParamOffsets = new int[counts.size()];
    Class[] arglets = new Class[this.boundParamOffsets.length];
    for (int i = 0; i < this.boundParamOffsets.length; i++) {
        arglets[i] = Object.class;
        this.boundParamOffsets[i] = counts.get(i);
    }
    method = fn.getClass().getDeclaredMethod("invoke", arglets);

just stores off the bound parameter offsets and grabs a handle on the Method, straightforward… compared to the rest, anyway.
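To make the Method lookup concrete without dragging in Clojure, here is a small sketch using a hand-written stand-in for the generated function class. InOrderCheck and InvokeLookup are hypothetical. The only thing borrowed from the real code is the idea that there is one invoke overload per argument count, each taking plain Objects, so the arity alone picks the overload.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical stand-in for a compiled function: one invoke per arity, all Object-typed.
class InOrderCheck {
    public Object invoke(Object only) {
        return Boolean.TRUE;
    }

    public Object invoke(Object first, Object second) {
        return ((Integer) first) < ((Integer) second);
    }
}

class InvokeLookup {
    // Finds the invoke overload taking exactly `arity` Object parameters.
    static Method findInvoke(Object fn, int arity) {
        Class<?>[] argTypes = new Class<?>[arity];
        Arrays.fill(argTypes, Object.class);
        try {
            return fn.getClass().getDeclaredMethod("invoke", argTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("no invoke taking " + arity + " args", e);
        }
    }

    // Looks up the overload on each call; the Handler does the lookup once and caches it.
    static Object call(Object fn, Object... args) {
        try {
            return findInvoke(fn, args.length).invoke(fn, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The Handler in the post keeps the Method in a field precisely so that this lookup cost is paid once, in the constructor, rather than on every validated call.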

The rest of the fun stuff happens on method invocation:

    public Object intercept(Object o, 
                            Method method, 
                            Object[] objects, 
                            MethodProxy proxy) throws Throwable {
        Object[] args = new Object[boundParamOffsets.length];
        for (int i = 0; i < boundParamOffsets.length; i++) {
            args[i] = objects[boundParamOffsets[i]];
        }
        final Boolean bad = (Boolean) this.method.invoke(fn, args);
        if (bad) {
            throw new IllegalArgumentException("Failed validation!");
        }
        return proxy.invokeSuper(o, objects);
    }

In this case, we build up the array of arguments to pass to the Clojure function, invoke it, and if it returns true (meaning the validation expression came back false), we raise an exception. Otherwise, we pass the invocation on to the parent class.
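The offsets matter when not every parameter carries a @Name. A tiny sketch of just the selection step, pulled out into a hypothetical BoundArgs helper so it can be read on its own:

```java
class BoundArgs {
    // Copies only the arguments at the recorded offsets, preserving their order.
    static Object[] select(Object[] all, int[] offsets) {
        Object[] picked = new Object[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            picked[i] = all[offsets[i]];
        }
        return picked;
    }
}
```

For a method whose first and third parameters are named, the offsets would be {0, 2}, and the unnamed middle argument never reaches the Clojure function.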

Whoo! It’s a fair number of hoops to jump through, but the performance requirement is to blame for a lot of them. The code can probably be simplified a lot, but… it works, and is damned fast. In testing it, the Clojure IFn invocation was nearly indistinguishable from a pure-Java Callable invocation in the informal microbenchmarks I ran. Not really surprising considering it is generating a class to implement the function… Adding a type hint on the arguments actually led to the Clojure one being frequently faster… somehow, I am not sure how. Need to pull out javap to see what is being generated, but I digress :-)

The key takeaway from this exercise, for me, was a mind shift in how I think about Clojure and Java interop. I am used to working with languages which embed and have their own runtime, like Lua. Clojure doesn’t get embedded, it just coexists. It doesn’t have its own runtime: you invoke functions and manipulate state (not that the final form we see here does that) directly via APIs which look like Java reflection or references, respectively. I really like it.

To wrap up, here is the test case for the whole thing:

package org.skife.example;

import static org.testng.Assert.assertEquals;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

public class TestValidatemajig
{
    private Thing t;

    @BeforeMethod
    public void setUp() throws Exception {
        t = new Validatemajig().build(Thing.class);
    }

    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testValidationFailure() throws Exception {
        t.setNum(-1);
    }

    @Test
    public void testValidationSuccess() throws Exception {
        t.setNum(1);
        assertEquals(1, t.num);
    }

    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testMultipleParamFailure() throws Exception {
        t.setInOrder(2, 1);
    }

    @Test
    public void testMultipleParamSuccess() throws Exception {
        t.setInOrder(1, 2);
        assertEquals(3, t.num);
    }

    public static class Thing
    {
        private int num = 0;

        @Validate("(> num 0)")
        public void setNum(@Name("num") Integer num) {
            this.num = num;
        }

        @Validate("(< first second)")
        public void setInOrder(@Name("first") Integer first,
                               @Name("second") Integer second) {
            this.num = first + second;
        }
    }
}

Hopefully more Clojure fun to come, though after wrestling with doing Clojure from Java, I may just switch to Clojure and do some Java from that angle.

Setting up TokyoCabinet and Ruby

I ran into a couple weirdnesses setting up tokyocabinet and the Ruby API, so am adding this to my external memory. Hopefully it will help anyone else bumping into the same issue.

Assuming you install tokyocabinet at a non-standard location, such as /Users/brianm/.opt/tokyocabinet-1.4.27, and then want to build the Ruby bindings for it via a gem, the trick is to add the bin/ directory for the tokyocabinet install dir to your $PATH (in my case, that is just export PATH=/Users/brianm/.opt/tokyocabinet-1.4.27/bin:$PATH). The Ruby API’s extconf.rb shells out to tc’s tcucodec to find paths to libraries, etc. Alternatively you could modify the extconf.rb, which is very short and sweet, but I hate doing that for aesthetic reasons.

To build the gem, you need to build via extconf but not install. After the build, use the normal gem build tokyocabinet.gemspec command to build a gem. Install the gem (in my case, via rip) and glod’s your uncle.

Now to figure out if anyone has done a convenience API wrapper around the table database in TC…

Borrowing Mark Reid's Styling

I am playing with new layouts, using Mark Reid’s wonderfully readable stylesheets as a basis. I’m going ahead and pushing it out, despite it being a work in progress. For now it is changed very little (in fact, the main CSS is identical), but it will evolve as I have time.

I’ve taken another cue from him in using markdown for posts with code in them. Something about redcloth doesn’t play nicely with pygments processing of inline code, whereas the markdown processor does play nicely. So, not really caring about which one I use, I swapped out to markdown for posts with code. Yea!

[1,2,3].map { |i| i * i }.inject([]) { |a, i| a << i }.flatten.map {|i| i - 99}.select {|i| i + 99 == 0}

=begin
heh
=end

[1,2,3].map { |i| i * i }.inject([]) { |a, i| a << i }.flatten.map {|i| i - 99}.select {|i| i + 99 == 0}

I particularly like how Mark’s styling handles long code lines :-)

Along the way I killed the search box. It will come back, but its absence does highlight Toby’s comment that I should have actual, you know, links to my archives. Eventually…

Dataflow Programmering

Ever since the idea of dataflow programming clicked for me while reading the excellent Concepts, Techniques, and Models of Computer Programming (affiliate link), I have been trying to figure out the best way to apply it at the library level rather than the language level. Having it at the language level is fine and dandy, but I’m happy to sacrifice a little elegance for something that is just easy to use for building up a page or response in a webapp from a bunch of remote services.

When rendering (heh, originally typoed that as rending, kind of appropriate) a typical page in Rails, PHP, JSP, whatever, if you are nice and clean you fetch all the data you need and shove it into some kind of container which is then used to populate a template. In a complex system it is not unusual to make 20+ remote calls to render a single response. These are to caches, databases, other services, and sometimes pigeons passing by with telegrams on their legs. A couple of years ago, we used a reactor-style dataflow tool Tim and I wrote for JavaScript. I rather miss having it when wiring together backend services.

I have done a number of ad-hoc versions in services for Java, using an executor and passing around references to futures, but I don’t have anything that really matches the rather nice push-and-react style thing we had in JavaScript. I can imagine something using Doug Lea’s jsr166y fork/join tools, but every time I start to poke into them the… well, maybe mapping it into a library really is kind of ugly. It certainly is screaming for anonymous functions. Oh well, I guess I am not holding my breath.
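For the curious, the ad-hoc executor-and-futures style looks roughly like the sketch below. This is a from-memory illustration, not code from any of those services. PageAssembly, fetchUser, and fetchUnreadCount are hypothetical stand-ins for real remote calls, and I have used method references, which Java did not have when I was grumbling about anonymous functions.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PageAssembly {
    // Hypothetical stand-ins for remote calls to a cache, a database, and so on.
    static String fetchUser() { return "brianm"; }
    static Integer fetchUnreadCount() { return 3; }

    static String render() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> userCall = PageAssembly::fetchUser;
            Callable<Integer> unreadCall = PageAssembly::fetchUnreadCount;

            // Kick off both calls before waiting on either one.
            Future<String> user = pool.submit(userCall);
            Future<Integer> unread = pool.submit(unreadCall);

            // get() blocks until each value is available: pull, not push.
            return user.get() + " has " + unread.get() + " unread";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

It works, but the template code has to know which futures to block on and in what order, which is the part the reactor-style tool handled for us.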

So, switching to the other languages I hack in nowadays, we have Ruby (oops, no threads), C (umh, no, wrong level of abstraction), and Lua (hey, actually not bad, particularly with how LuaSocket and coroutines play together…).

Properly, I should now shut up and go hack. On that note, off to hack!

Teh New Ruby Evil

Found a beauty I don’t know how I missed before:

bar = 'hello world'
foo = 'why, hello world!'
foo =~ /#{bar}/

I didn’t realize you could do interpolation into regex literals. I don’t know how I lasted this long without finding out!

