I'm building a simple enough web app to manage some project-related data. Without claiming the ability to see around corners, I'm quite sure this app will grow over time because the data it's working against is nebulous and that will push for more and more views. The main decision so far has been to keep data in XML+RDF - for example there is project data in DOAP files and person details are in FOAF and bits of DC are scattered about, RIG chunks, and so on.
I'm telling myself I'm using RDF+XML because I want to be able to pull data in from anywhere. That's true, but to be brutally honest I can't be bothered designing and maintaining yet another relational schema for yet another webapp - doing so is starting to make as much sense as designing my own filesystem or TP monitor. Life's too short, too short to be working on technology that can only possibly make sense when you're dressed in combats and vans listening to Pearl Jam pretending it's still the nineties... there's a real wish to conduct oneself at a higher level of abstraction before complete dementia sets in. What's the point in designing tables for a webapp when an RDF-backed store will manage the data for you and RDF queries will come back as tabular data anyway? There are RDF triple stores that will handle on the order of 10^6 statements - Leigh Dodds is doing some research on that, up to 10^8 by the looks of things. If I need queries instead of hacking out iterators+filters I'll use Versa/iTQL/RDQL. Now, saying I never want to design another relational schema again is not to say I don't want to use a database. Most of these RDF triple stores are in fact using an RDBMS in the background, as the filesystem and indexer; it's just that the relational schema in use is not exposed to the application.
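To make the "queries come back as tabular data anyway" point concrete, here is a minimal sketch of the iterators+filters approach over an in-memory list of triples. It is not how any particular triple store or query language works; the `match` and `select` helpers, the `?variable` convention and the sample DOAP/FOAF-flavoured data are all assumptions made up for illustration.

```python
# Hypothetical sample data: (subject, predicate, object) triples as plain strings.
TRIPLES = [
    ("proj:atomstore", "rdf:type", "doap:Project"),
    ("proj:atomstore", "doap:name", "Atom Store"),
    ("proj:atomstore", "doap:maintainer", "person:alice"),
    ("proj:planner", "rdf:type", "doap:Project"),
    ("proj:planner", "doap:name", "Planner"),
    ("proj:planner", "doap:maintainer", "person:bob"),
    ("person:alice", "foaf:name", "Alice"),
    ("person:bob", "foaf:name", "Bob"),
]


def match(triples, pattern):
    """Yield one variable binding per triple that fits the pattern.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated inside one pattern must bind consistently.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def select(triples, variables, patterns):
    """Join the patterns left to right and return rows of the named variables."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            # Substitute variables that earlier patterns have already bound.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                extended.append({**solution, **binding})
        solutions = extended
    return [tuple(solution[v] for v in variables) for solution in solutions]


rows = select(
    TRIPLES,
    ("?project", "?who"),
    [
        ("?p", "rdf:type", "doap:Project"),
        ("?p", "doap:name", "?project"),
        ("?p", "doap:maintainer", "?m"),
        ("?m", "foaf:name", "?who"),
    ],
)
print(rows)  # [('Atom Store', 'Alice'), ('Planner', 'Bob')]
```

The result is a list of rows with named columns, which is the same shape a relational query would hand back, without any table design up front. A real store would add indexing and a proper query language on top of the same idea.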
Can't say I'm too fussed about having a nice object model for the domain either. Yes, it's heresy not to have an object model for the domain - out of the corner of my eye, as I write this, I can see that Eric Evans' book is trying to wriggle off the shelf and wallop me upside my head.
Other than not using an RDBMS directly, and not being too fussed about objects, I wanted these capabilities:
Login+sessions
Easy XML out
Easy URL design
Easy URL/action mapping
Easy Atom subscriptions on views
Save query as bookmark
Save filters as bookmark
Provide adequate opportunity to look at some frameworks
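As a rough illustration of two items on that list, URL/action mapping and saving filters as a bookmark, here is a small sketch using only the Python standard library. It is not tied to any of the frameworks discussed here; the `route`, `dispatch` and `bookmark` names and the sample project records are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical sample records standing in for whatever the data store returns.
PROJECTS = [
    {"name": "Atom Store", "language": "python", "status": "active"},
    {"name": "Planner", "language": "java", "status": "active"},
    {"name": "Old Wiki", "language": "python", "status": "retired"},
]

ROUTES = {}  # URL path -> action function


def route(path):
    """Decorator that maps a URL path to an action."""
    def register(func):
        ROUTES[path] = func
        return func
    return register


@route("/projects")
def list_projects(filters):
    """Return the names of projects matching every key=value filter."""
    return [
        project["name"]
        for project in PROJECTS
        if all(project.get(key) == value for key, value in filters.items())
    ]


def dispatch(url):
    """Find the action for the URL's path and pass it the query-string filters."""
    parts = urlsplit(url)
    action = ROUTES.get(parts.path)
    if action is None:
        raise LookupError(f"no action mapped to {parts.path}")
    return action(dict(parse_qsl(parts.query)))


def bookmark(path, filters):
    """Encode a filtered view as a stable URL that can be saved and revisited."""
    return f"{path}?{urlencode(sorted(filters.items()))}"


saved = bookmark("/projects", {"status": "active", "language": "python"})
print(saved)            # /projects?language=python&status=active
print(dispatch(saved))  # ['Atom Store']
```

Because the filters live entirely in the query string, a saved view is just a URL, and the same URL could later serve an Atom representation of that view.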
I didn't care much about:
Protecting graphic designers from code
Protecting coders from user interfaces
Planetary class scaling
Shopping carts
Pet store transactions
More heresy - no doubt this project will be a disastrous conflagration of worst practices. I can't wait.
Once you know what the web app is about, you then have to decide what to write it in. That alone is a research exercise. I looked at RoR and, I don't know, maybe a dozen Java and Python frameworks.
RoR: RoR is really nice, but a lot of the value is tied up in hooking into an RDBMS schema via ActiveRecord. I'm not using one of those, I'm using XML+RDF, so that takes me off The Golden Path. And even the RoR guys will tell you that you want to stay on The Golden Path. Maybe I could write an ActiveGraph (Redland RDF has Ruby bindings), but who knows what else would get rewritten by the by - a lot of RoR magic is in mapping the DB - when that's gone a lot of value goes with it. Please, yes, I got past the demo and understand that scaffolding is only one part, I just don't understand what the value beyond that is. In RoR, The Golden Path is the system value. [update: Bruce D'Arcus pointed me at Obie Fernandez' musings; seems like there's some pent up demand for RDF on Rails]
Python: The state of the web frameworks in Python is nearly as confusing as Java, no small feat. There's lots of 'em and I have no good sense where the community interest really lies. If you are doing CMS, Plone wins hands down, but I'm not doing a CMS. Zope's a parallel world and I'd have to get around ZODB for RDF. Twisted is fine for building servers, but pushes too much back onto an app developer even with Nevow. CherryPy I'm still playing with, it looks nicest so far, and feels closest to a 'done' thing. Greg Wilson has also noticed this excess in the Python world recently, and he thinks the Python community needs to get on message - that would not be my conclusion. Python is also overflowing in templating languages, of which ClearSilver, TAL and Cheetah are notable - tho' I'd like to try out Kid - Ryan Tomayko cracks me up.
Java: I know the Java space better than Python. Struts is out for reasons of verbosity and sanity retention - there's that XML config file format (if you need a graph, use a programming language or RDF). By the same criteria, JSF is also out, never mind it's unproven, insofar as the answer to MVC on the Web is not necessarily even more MVC on the Web. Tapestry looked interesting but it's squarely targeted at HTML output and stopping code monkeys and graphics monkeys squabbling over who gets which bananas. The latter is a non-requirement and the HTML-only thing I don't entirely get, even though I gather Tapestry is very focused on non-programmers. I read Howard say somewhere once he doesn't believe in multi-site output - maybe I've been too long in the Atom world but it seems to me publishing as Atom/RSS is becoming a requirement. Spring webflow doesn't seem to offer any value for a two tiered web app where objects are going to be incidental by design - Spring also has that graphlike XML config going on. Webflow, by the way, is the one part of Spring which I gather has stated it is not targeted for simplicity - instead it's for complex application flows. The rot starts there.
I've complained about Java web frameworks before, especially this obsession with MVC, but much of the issue is with Java, not the framework designs. There are too many steps involved in deploying a servlet app, or making changes to a running one, or updating half-a-dozen XML files - it's ridiculous. All I want to do is edit and hit reload. Letting me hit reload is the system's job. To get there means scripting support.
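For what "edit and hit reload" might look like when the system takes on the job, here is a minimal sketch in plain Python (not Jython, and not inside a servlet container). It assumes action code lives in a script file that is re-executed whenever its modification time changes; the `ReloadingScript` class and the `execute` function name are hypothetical.

```python
import os


class ReloadingScript:
    """Re-executes a Python source file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None      # mtime of the version currently loaded
        self._namespace = {}    # globals produced by the last execution

    def namespace(self):
        """Return the script's globals, re-running the file if it was edited."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as handle:
                source = handle.read()
            fresh = {"__name__": "__script__", "__file__": self.path}
            exec(compile(source, self.path, "exec"), fresh)
            self._namespace = fresh
            self._mtime = mtime
        return self._namespace

    def call(self, name, *args, **kwargs):
        """Call a function defined in the script, picking up any edits first."""
        return self.namespace()[name](*args, **kwargs)
```

A request handler would hold one `ReloadingScript` per action file and invoke something like `script.call("execute", request)` on each hit, so an edit shows up on the next reload with no compile or deploy step. The check relies on the file's modification time actually changing, and a script that fails to compile will raise on the next call, so a real system would want to handle both cases.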
(Non) Conclusion: RoR's database dependency and Object MVC obsession aren't working for me. Of course, if you read the 37Signals weblog (and you should) your conclusion could be that I'm approaching this entirely wrong and I should start with a user interface as the requirements and work back. But you know, slick UIs are like nice shoes - awfully sexy but awfully transient - you put them on and still no-one sees the real you. Whereas data will set you free. Less surreally - I don't care what the UI is made out of as long as it's not arduous and doesn't dictate how this data is going to be structured.
CherryPy is the closest thing to a decision I've settled on in the Python world, but I'm really not sure. Not at all. [update: I am getting a ton of feedback on the Python side of things, enough to have me going back to reassess as I clearly don't have enough of a clue]
The best option for Java seems to be WebWork (one of my favourite web applications is built on it). WebWork is a big improvement over Struts and the Java world is really missing a trick by going straight to JSF. The main problem is setting things up for scripting, to get out from under the compile/deploy/load loop. PyServlet comes with Jython, but that leaves things configured backwards, by putting the Jython at the front of the processing chain; whereas it would be better to script the action code to be invoked by a framework. Then the framework is dealing with skinning, sessions, stringtables and the like. I'm thinking about embedding a JythonActionSupport into WebWork so it can call out to .py files implementing the ActionSupport interface. WebWork uses proxy generation to create and fill out your action objects; hopefully there won't be problems integrating that magic with Jython.
Generally, the web frameworks situation is depressing unless you're a researcher. If someone thinks there's a framework I should be looking at, shout. Deparalysis imminent.