Abdera 0.4.0 Released
Friday, April 11th, 2008
Apache Abdera 0.4.0-incubating has been released.
It took a while, but Google has followed IBM’s example and has issued a Royalty-Free pledge “for patents necessarily infringed by implementation (in whole or in part) of [Atom]”. Very good to see this.
More here.
As I mentioned previously, we’re putting the finishing touches on the Abdera 0.4.0 release. While I’ve posted previously on many of the new features, I figured it would be good to do another review…
Here are just a few of the changes in no particular order…
Most of the Abdera APIs now support method chaining, e.g. streamWriter.startEntry().writeTitle("foo").endEntry(); While there is some debate about the effectiveness of method chaining, we’ve found it to be a great way of simplifying code, and it makes serializing Atom documents easier.
Other improvements include making it easier to work with custom serializers. For instance, in previous releases, the following code was required to use the JSON Serializer:
Abdera abdera = new Abdera();
Entry entry = abdera.newEntry();
// set entry properties
abdera.getWriterFactory().getWriter("json").writeTo(entry, System.out);
That last line isn’t exactly elegant. So we streamlined it to:
entry.writeTo("json", System.out);
The new StreamWriter interface allows applications to serialize Atom documents quickly, using a streaming model that does not build an internal object model. In some tests, we’ve seen a 100x performance boost using the StreamWriter interface to write out an Atom document vs. using the Abdera 0.3.0 release object model. Here’s a sample taken from the Abdera examples:
Abdera abdera = Abdera.getInstance();
StreamWriter out =
abdera.newStreamWriter()
.setOutputStream(System.out,"UTF-8")
.setAutoflush(false)
.setAutoIndent(true)
.startDocument()
.startFeed()
.writeBase("http://example.org")
.writeLanguage("en-US")
.writeId("http://example.org")
.writeTitle("<Testing 123>")
.writeSubtitle("Foo")
.writeAuthor("James", null, null)
.writeUpdated(new Date())
.writeLink("http://example.org/foo")
.writeLink("http://example.org/bar","self")
.writeCategory("foo")
.writeCategory("bar")
.writeLogo("logo")
.writeIcon("icon")
.writeGenerator("1.0", "http://example.org", "foo")
.flush();
for (int n = 0; n < 100; n++) {
out.startEntry()
.writeId("http://example.org/" + n)
.writeTitle("Entry #" + n)
.writeUpdated(new Date())
.writePublished(new Date())
.writeEdited(new Date())
.writeSummary("This is text summary")
.writeAuthor("James", null, null)
.writeContributor("Joe", null, null)
.startContent("application/xml")
.startElement("a","b","c")
.startElement("x","y","z")
.writeElementText("This is a test")
.startElement("a")
.writeElementText("foo")
.endElement()
.startElement("b","bar")
.writeAttribute("foo", new Date())
.writeAttribute("bar", 123L)
.writeElementText(123.123)
.endElement()
.endElement()
.endElement()
.endContent()
.endEntry()
.flush();
}
out.endFeed()
.endDocument()
.flush();
Note: the StreamWriter does not completely replace the use of the Feed Object Model. For many applications, particularly Atom Publishing Protocol client and server implementations, using the FOM is essential.
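To make the "no internal object model" point concrete, here is a minimal sketch of the same streaming idea using the JDK's StAX XMLStreamWriter instead of Abdera's StreamWriter. The class and helper names (StaxAtomSketch, writeFeed, writeText) are mine and purely illustrative; this is not how Abdera implements its writer.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Illustrative only: streams a tiny Atom feed with the JDK's StAX API.
public class StaxAtomSketch {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Writes a feed with the given number of entries directly to a buffer.
    // Each element is emitted as soon as it is described; nothing is retained.
    static String writeFeed(int entries) {
        StringWriter buffer = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            xml.writeStartDocument();
            xml.writeStartElement("feed");
            xml.writeDefaultNamespace(ATOM_NS);
            writeText(xml, "id", "http://example.org");
            writeText(xml, "title", "Testing 123");
            for (int n = 0; n < entries; n++) {
                xml.writeStartElement("entry");
                writeText(xml, "id", "http://example.org/" + n);
                writeText(xml, "title", "Entry #" + n);
                xml.writeEndElement(); // entry
            }
            xml.writeEndElement(); // feed
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            // Rethrown unchecked to keep the sketch short
            throw new IllegalStateException(e);
        }
        return buffer.toString();
    }

    // Writes <name>text</name>, escaping the text as needed.
    static void writeText(XMLStreamWriter xml, String name, String text) throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(writeFeed(2));
    }
}
```

The trade-off is the same one noted above: once an element has been written it cannot be revisited, which is why the object model still matters for Atompub clients and servers.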
The IRI implementation has been further optimized to boost performance. Specifically, we updated Unicode Normalization and codepoint iteration implementations so code initialization time is significantly reduced.
As part of the IRI implementation, Abdera provides an implementation of Unicode Normalization algorithms. These can be used directly by applications. For instance, in the sample code below, three semantically equivalent strings are normalized to Normalization Form C.
String s1 = "\u00c5";
String s2 = "\u0041\u030A";
String s3 = "\u212B";
System.out.println(s1 + "=" + s2 + " ?\t" + s1.equals(s2)); // false
System.out.println(s1 + "=" + s3 + " ?\t" + s1.equals(s3)); // false
System.out.println(s2 + "=" + s3 + " ?\t" + s2.equals(s3)); // false
// Normalize to NFC
String n1 = Normalizer.normalize(s1, Normalizer.Form.C);
String n2 = Normalizer.normalize(s2, Normalizer.Form.C);
String n3 = Normalizer.normalize(s3, Normalizer.Form.C);
System.out.println(n1 + "=" + n2 + " ?\t" + n1.equals(n2)); // true
System.out.println(n1 + "=" + n3 + " ?\t" + n1.equals(n3)); // true
System.out.println(n2 + "=" + n3 + " ?\t" + n2.equals(n3)); // true
// s1 is already normalized to NFC
System.out.println(n1.equals(s1)); // true
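For comparison, the same check can be sketched with the JDK's own java.text.Normalizer (available from Java 6 on), where the form is named NFC rather than C. The class name NfcSketch and the nfc helper are just for illustration and are not part of Abdera.

```java
import java.text.Normalizer;

// Illustrative only: NFC normalization with the JDK rather than Abdera's Normalizer.
public class NfcSketch {
    // Normalizes a string to Unicode Normalization Form C.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String precomposed = "\u00c5";      // LATIN CAPITAL LETTER A WITH RING ABOVE
        String decomposed = "\u0041\u030A"; // A followed by COMBINING RING ABOVE
        String angstrom = "\u212B";         // ANGSTROM SIGN

        System.out.println(precomposed.equals(decomposed));           // false
        System.out.println(nfc(precomposed).equals(nfc(decomposed))); // true
        System.out.println(nfc(precomposed).equals(nfc(angstrom)));   // true
    }
}
```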
In Atompub implementations, the title of an entry or the Slug HTTP header is typically used to generate the permalink URL for an entry. Typically, these values have to be sanitized before they can be used. The Sanitization code in 0.4.0 has been improved and has been moved from the server module to the i18n package.
// French for "My trip to the beach"; note the accented character and the whitespace characters
String input = "Mon\tvoyage à la\tplage";
// The default rules will replace whitespace with underscore characters
// and convert non-ascii characters to pct-encoded utf-8
String output = Sanitizer.sanitize(input);
System.out.println(output);
// Output = Mon_voyage_%C3%A0_la_plage
// As an alternative to pct-encoding, a replacement string can be provided
output = Sanitizer.sanitize(input, "");
System.out.println(output);
// Output = Mon_voyage__la_plage
// In certain cases, applying Unicode normalization form D to the
// input can produce a good ascii equivalent to the input text.
output = Sanitizer.sanitize(input, "", true, Normalizer.Form.D);
System.out.println(output);
// Output = mon_voyage_a_la_plage
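To show why Normalization Form D helps here, this is a rough sketch of the last variation using only the JDK. It is not the Abdera Sanitizer: the SlugSketch class and slug method are hypothetical, and unlike the calls above it collapses runs of whitespace into a single underscore and always lowercases.

```java
import java.text.Normalizer;
import java.util.Locale;

// Illustrative only: an NFD-based slug helper, not Abdera's Sanitizer.
public class SlugSketch {
    static String slug(String input) {
        // NFD splits an accented letter into a base letter plus combining marks
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Dropping the combining marks leaves the plain base letters behind
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Anything still outside ASCII has no simple equivalent, so remove it
        String ascii = stripped.replaceAll("[^\\x00-\\x7F]", "");
        // Replace each run of whitespace with one underscore, then lowercase
        return ascii.trim().replaceAll("\\s+", "_").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(slug("Mon\tvoyage \u00e0 la\tplage")); // mon_voyage_a_la_plage
    }
}
```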
Abdera 0.3.0 included a simple implementation of RFC3066 language tags to support the xml:lang attribute. In 0.4.0, the 3066 implementation has been deprecated and a new RFC 4646 implementation has been introduced. The 4646 impl provides a broader range of options, including 4647 tag matching and validation.
From the test case:
@Test
public void test4646Lang() throws Exception {
Lang lang = new Lang("en-Latn-US-valencia");
assertEquals(lang.getLanguage().toString(),"en");
assertEquals(lang.getRegion().toString(), "US");
assertEquals(lang.getScript().toString(), "Latn");
assertEquals(lang.getVariant().toString(), "valencia");
assertNull(lang.getExtLang());
assertNull(lang.getExtension());
assertNull(lang.getPrivateUse());
assertTrue(lang.isValid());
Locale locale = lang.getLocale();
assertEquals(locale.getCountry(),"US");
assertEquals(locale.getLanguage(),"en");
assertEquals(locale.getVariant(),"valencia");
}
@Test
public void test4647Matching() throws Exception {
Lang lang = new Lang("en-Latn-US-valencia");
Range range1 = new Range("*",true);
Range range2 = new Range("en-*",true);
Range range3 = new Range("en-Latn-*",true);
Range range4 = new Range("en-US-*",true);
Range range5 = new Range("en-*-US-*",true);
Range range6 = new Range("*-US",true);
Range range7 = new Range("*-valencia",true);
Range range8 = new Range("*-FR",true);
assertTrue(range1.matches(lang,true));
assertTrue(range2.matches(lang,true));
assertTrue(range3.matches(lang,true));
assertTrue(range4.matches(lang,true));
assertTrue(range5.matches(lang,true));
assertTrue(range6.matches(lang,true));
assertTrue(range7.matches(lang,true));
assertFalse(range8.matches(lang,true));
}
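For anyone curious how those ranges end up matching, here is a small standalone sketch of the extended filtering algorithm from RFC 4647 (section 3.3.2). It works on plain strings and is not the Abdera Range class; the ExtendedFilterSketch name and the matches signature are made up for this example.

```java
import java.util.Locale;

// Illustrative only: RFC 4647 extended filtering on plain strings.
public class ExtendedFilterSketch {
    static boolean matches(String range, String tag) {
        String[] r = range.toLowerCase(Locale.ROOT).split("-");
        String[] t = tag.toLowerCase(Locale.ROOT).split("-");

        // The first subtags must match unless the range starts with a wildcard
        if (!r[0].equals("*") && !r[0].equals(t[0])) {
            return false;
        }
        int ri = 1;
        int ti = 1;
        while (ri < r.length) {
            if (r[ri].equals("*")) {
                ri++;                      // a wildcard matches zero or more subtags
            } else if (ti >= t.length) {
                return false;              // range has subtags left, tag does not
            } else if (r[ri].equals(t[ti])) {
                ri++;                      // subtags agree, advance both
                ti++;
            } else if (t[ti].length() == 1) {
                return false;              // never skip past a singleton (extension marker)
            } else {
                ti++;                      // skip a non-matching tag subtag
            }
        }
        return true;                       // every range subtag was accounted for
    }

    public static void main(String[] args) {
        String tag = "en-Latn-US-valencia";
        System.out.println(matches("en-US-*", tag)); // true
        System.out.println(matches("*-FR", tag));    // false
    }
}
```

Note how "en-US-*" matches even though the script subtag sits between the language and the region: the algorithm is allowed to skip over non-matching subtags in the tag.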
0.4.0 now includes an implementation of the URI Templates specification. From the examples,
private static final Template template =
new Template("http://example.org{-opt|/~|user}{user}{-opt|/-/|categories}{-listjoin|/|categories}{-opt|?|foo,bar}{-join|&|foo,bar}");
// ...
Map<String,Object> map = new HashMap<String,Object>();
map.put("user","james");
map.put("categories", new String[] {"a","b","c"});
map.put("foo", "abc");
map.put("bar", "xyz");
System.out.println(template.expand(map));
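The three operators used in that template (-opt, -listjoin and -join) are easier to follow when written out by hand. The sketch below is my own reading of them, using hypothetical helper methods rather than Abdera's Template class, and it skips the percent-encoding a real implementation has to do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hand-rolled versions of the -opt, -listjoin and -join operators.
public class TemplateOperatorsSketch {
    // {-opt|arg|vars}: emit arg if at least one of the named variables is defined
    static String opt(String arg, Map<String, Object> vars, String... names) {
        for (String name : names) {
            if (vars.get(name) != null) {
                return arg;
            }
        }
        return "";
    }

    // {-listjoin|sep|var}: join the items of a list-valued variable
    static String listJoin(String sep, Map<String, Object> vars, String name) {
        Object value = vars.get(name);
        return value instanceof String[] ? String.join(sep, (String[]) value) : "";
    }

    // {-join|sep|vars}: join name=value pairs for every defined variable
    static String join(String sep, Map<String, Object> vars, String... names) {
        List<String> pairs = new ArrayList<>();
        for (String name : names) {
            Object value = vars.get(name);
            if (value != null) {
                pairs.add(name + "=" + value);
            }
        }
        return String.join(sep, pairs);
    }

    // {var}: plain substitution, empty when undefined
    static String value(Map<String, Object> vars, String name) {
        Object value = vars.get(name);
        return value == null ? "" : value.toString();
    }

    // Mirrors the structure of the template above, one operator per line
    static String expand(Map<String, Object> vars) {
        return "http://example.org"
            + opt("/~", vars, "user")
            + value(vars, "user")
            + opt("/-/", vars, "categories")
            + listJoin("/", vars, "categories")
            + opt("?", vars, "foo", "bar")
            + join("&", vars, "foo", "bar");
    }

    // Expands the sketch with the same sample values used above
    static String demo() {
        Map<String, Object> map = new HashMap<>();
        map.put("user", "james");
        map.put("categories", new String[] {"a", "b", "c"});
        map.put("foo", "abc");
        map.put("bar", "xyz");
        return expand(map);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // http://example.org/~james/-/a/b/c?foo=abc&bar=xyz
    }
}
```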
0.4.0 includes an HTML Parser based on the nu.validator parser.
String html = "<html><body><p>this is <i>html</i></body></html>";
Abdera abdera = Abdera.getInstance();
Parser parser = abdera.getParserFactory().getParser("html");
Document<Element> doc =
parser.parse(
new StringReader(html));
Element root = doc.getRoot();
root.writeTo(System.out);
System.out.println();
XPath xpath = abdera.getXPath();
List<Element> list = xpath.selectNodes("//i",doc.getRoot());
for (Element element : list)
System.out.println(element);
0.4.0 includes an experimental serialization framework that can serialize Java objects to Atom either based on conventions or annotations.
For instance, suppose I have a Java object…
public static class MyEntry {
public String getId() {
return "tag:example.org,2008:foo";
}
public String getTitle() {
return "This is the title";
}
public String getAuthor() {
return "James";
}
public Date getUpdated() {
return date_now;
}
public Calendar getPublished() {
return cal_now;
}
public String getSummary() {
return "this is the summary";
}
public String getLink() {
return "http://example.org/foo";
}
}
To serialize that to Atom,
StreamWriter sw = abdera.newStreamWriter();
ByteArrayOutputStream out = new ByteArrayOutputStream();
sw.setOutputStream(out)
.setAutoIndent(true);
ConventionSerializationContext c =
new ConventionSerializationContext(sw);
c.setSerializer(MyEntry.class, new EntrySerializer());
sw.startDocument();
c.serialize(new MyEntry());
sw.endDocument();
By default, the serializer will look for conventionally named public methods and fields to acquire the necessary values for the Atom document. Alternatively, the Java object can be annotated…
@org.apache.abdera.ext.serializer.annotation.Entry
public static class MyAnnotatedEntry {
@ID public String getFoo() {
return "tag:example.org,2008:foo";
}
@Title public String getBar() {
return "This is the title";
}
@Author public String getBaz() {
return "James";
}
@Updated @Published public Date getLastModified() {
return date_now;
}
@Summary public String getText() {
return "this is the summary";
}
@Link public String getUri() {
return "http://example.org/foo";
}
}
And the serialization:
sw = abdera.newStreamWriter();
out = new ByteArrayOutputStream();
sw.setOutputStream(out)
.setAutoIndent(true);
c = new ConventionSerializationContext(sw);
sw.startDocument();
c.serialize(new MyAnnotatedEntry());
sw.endDocument();
The API and implementation are still rather rough and the whole thing is still considered experimental. Feedback is appreciated.
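If you are wondering what "based on conventions" means in practice, the general idea can be sketched with plain reflection: look for a public getter named after each Atom element and use whatever it returns. This is only an illustration of the technique, with a made-up ConventionSketch class and collect method; it says nothing about how the Abdera serializer is actually structured.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: convention-based property lookup via reflection.
public class ConventionSketch {
    public static class MyEntry {
        public String getId() { return "tag:example.org,2008:foo"; }
        public String getTitle() { return "This is the title"; }
    }

    // For each element name, calls the matching public zero-argument getter if present.
    static Map<String, String> collect(Object bean, String... elements) {
        Map<String, String> values = new TreeMap<>();
        for (String element : elements) {
            // "title" -> "getTitle"
            String getter = "get" + Character.toUpperCase(element.charAt(0)) + element.substring(1);
            try {
                Method method = bean.getClass().getMethod(getter);
                Object value = method.invoke(bean);
                if (value != null) {
                    values.put(element, value.toString());
                }
            } catch (NoSuchMethodException e) {
                // No conventionally named getter, so this element is simply omitted
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // MyEntry has no getSummary(), so only id and title are found
        System.out.println(collect(new MyEntry(), "id", "title", "summary"));
    }
}
```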
0.4.0 includes a reader that can silently remove characters that are not valid in XML documents. The filter can be enabled via the ParserOptions interface:
InputStream in = //...;
Abdera abdera = Abdera.getInstance();
Parser parser = abdera.getParser();
ParserOptions options = parser.getDefaultParserOptions();
options.setFilterRestrictedCharacters(true);
parser.parse(in, options);
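As a rough illustration of what such a filter has to do, the sketch below drops every code point that falls outside the Char production of XML 1.0. It operates on a String rather than wrapping a Reader, the XmlCharFilterSketch name is hypothetical, and I am assuming the XML 1.0 ranges here; the exact set Abdera filters may differ.

```java
// Illustrative only: strips code points that XML 1.0 does not allow.
public class XmlCharFilterSketch {
    // Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the input with all invalid code points silently removed.
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(XmlCharFilterSketch::isXmlChar)
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // U+0001 and U+0008 are control characters that XML 1.0 forbids
        System.out.println(strip("a\u0001b\u0008c")); // abc
    }
}
```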
As I mentioned before, the server framework has been completely revamped. There are many, many changes in this area, far more than I can adequately cover in this post.
0.4.0 includes simple implementations of Atompub server adapters for CouchDB, JCR, Ibatis, Hibernate and the File system.
While the Ant build is still there, we’re now building the release with Maven. Also, we’ve deprecated the JDK 1.4.2 build and are only distributing the 1.5 build. Building the 1.4.2 jars is still possible using Retroweaver 2.x. Simply grab Retroweaver and run each of the Abdera jars through it to produce the 1.4.2 compatible jars.
That’s it for now.
The Apache Abdera 0.4.0 Release Candidate is available for testing. Please download and kick the tires. Note: in previous releases, we bundled both JDK 1.5 and 1.4.2 jars with the release. This time, we’re only shipping the JDK 1.5 jars. Building the 1.4.2 jars is still pretty easy to do, though, using the Ant build. Just grab the source distribution and do “ant -f build/build.xml zip”.
All feedback should be directed to the Abdera-dev mailing list.
What’s new in 0.4.0? I’m glad you asked…
Note: the change in the server framework means that existing Atompub implementations based on the 0.3.0 code will need to be ported to the new design. Trust me, it will be worth the effort and we’re not planning on making any further significant refactorings of the API any time soon… so go ahead and start porting to the 0.4.0 framework.
Update: A new build incorporating some of the feedback received has been uploaded.
Dan Diephouse: “I finished up the first draft of a guide on how to develop your first AtomPub service with Abdera.”
Very cool. More docs and tutorials coming for the pending 0.4.0 release.