Getting in Touch with XML Contacts
by John E. Simpson
March 31, 2004
Q: How do I record contact information in XML?
I am trying to develop an address book kind of application. The contact information will be maintained in XML format. Is there any standard DTD for contacts?
A: Great question, especially since it's firmly grounded in common sense. An address book would seem to be one of the simplest XML applications to develop from scratch. But if it's so simple, surely someone must have already tackled it. Why reinvent the wheel? And as it happens, you've got several options. Which you select is a matter of preference, compatibility with other standards, and perhaps compatibility with the other parts of your application.
vCard in XML
First, as far back as 1998 -- the year XML 1.0 became a W3C Recommendation -- Frank Dawson submitted a proposal to the Internet Engineering Task Force (IETF) for a "vCard in XML" standard. As you may know, a vCard is an "electronic business card," suitable for exchanging information between, for example, two e-mail correspondents. (Indeed, many e-mail application programs allow you to set up and attach vCards to your messages.) The vCard standard consists of two documents promoted by IETF and the Internet Mail Consortium: "A MIME Content-Type for Directory Information" and "vCard MIME Directory Profile."
MIME is the Multipurpose Internet Mail Extensions standard, also an IETF specification, which in its current form dates back to 1996. The simplest way to think of MIME in this context is that it allows you to attach to an e-mail message some other content, such as a text file, an image, or even a vCard.
There's nothing inherently XML-based about the vCard specification itself. (Most applications, for that matter, don't represent vCards in XML format.) But Dawson, who also contributed to the aforementioned two documents, independently devised a DTD for representing vCard data. You can find a copy of it, together with an abstract and other supporting materials, at Robin Cover's invaluable "XML Cover Pages" site.
All of these documents date back to 1998, which is ancient history in terms of XML. Why might you be interested in such a cobwebbed standard?
The answer is that the vCard in XML standard has been adopted by the Jabber Software Foundation for use in their flagship Jabber project -- an open-source instant-messaging protocol. Dozens of IM clients are now available supporting Jabber's various protocols, including their version of vCard in XML. (Note that this is a de-facto standard: although it hasn't been officially blessed by the Jabber Software Foundation, it's in widespread use among Jabber clients.)
A Jabber vCard is contained in a Jabber XML wrapper element called iq, which also carries the instructions for sending or retrieving the vCard itself. Here's a sample vCard-only portion of such an exchange, taken from the specification (actual addresses altered for obvious reasons):
<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N>
    <GIVEN>Joseph</GIVEN>
    <FAMILY>User</FAMILY>
    <MIDDLE/>
  </N>
  <NICKNAME>joe</NICKNAME>
  <EMAIL>
    <INTERNET/>
    <PREF/>
    <USERID>joseph@notareal.org</USERID>
  </EMAIL>
  <JABBERID>joe@notareal.org</JABBERID>
</vCard>
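To see what an address-book application might do with such a vCard, here is a minimal sketch in Python, using only the standard library's xml.etree.ElementTree module. The helper name read_jabber_vcard and the keys of the dictionary it returns are my own inventions for illustration; they come from neither the Jabber specification nor any Jabber client. The one detail worth noticing is that every element lives in the vcard-temp default namespace, so the lookups must be namespace-aware.

```python
import xml.etree.ElementTree as ET

# The sample vCard from the text; 'vcard-temp' is its default namespace.
VCARD_XML = """
<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N>
    <GIVEN>Joseph</GIVEN>
    <FAMILY>User</FAMILY>
    <MIDDLE/>
  </N>
  <NICKNAME>joe</NICKNAME>
  <EMAIL>
    <INTERNET/>
    <PREF/>
    <USERID>joseph@notareal.org</USERID>
  </EMAIL>
  <JABBERID>joe@notareal.org</JABBERID>
</vCard>
"""

# Prefix-to-URI map used by the find/findtext calls below.
NS = {"v": "vcard-temp"}


def read_jabber_vcard(xml_text):
    """Hypothetical helper: flatten the sample's fields into a dict."""
    root = ET.fromstring(xml_text.strip())
    email = root.find("v:EMAIL", NS)
    return {
        "full_name": root.findtext("v:FN", default="", namespaces=NS),
        "given": root.findtext("v:N/v:GIVEN", default="", namespaces=NS),
        "family": root.findtext("v:N/v:FAMILY", default="", namespaces=NS),
        "nickname": root.findtext("v:NICKNAME", default="", namespaces=NS),
        "email": email.findtext("v:USERID", default="", namespaces=NS)
        if email is not None else "",
        # Empty elements such as <PREF/> act as flags: present or absent.
        "email_preferred": email is not None
        and email.find("v:PREF", NS) is not None,
        "jabber_id": root.findtext("v:JABBERID", default="", namespaces=NS),
    }


if __name__ == "__main__":
    print(read_jabber_vcard(VCARD_XML))
```

Note that INTERNET and PREF carry no text at all; their mere presence qualifies the address, which is why the sketch tests for the element rather than reading its content.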
The W3C vCard in XML/RDF Note
In 2001, IPR Systems Pty Ltd submitted a Note to the W3C, formally outlining the use of XML as a vCard standard. Like other Notes, this one -- its full title is "Representing vCard Objects in RDF/XML" -- has no official status; you might consider such Notes "strawman"-style proposals or extended comments on other proposals. Still, depending on how much detail you want to provide in your contacts-management application, and how concerned you are with meshing your approach with the larger world of standards, it might be worth taking a look at.
Like Jabber's vCard in XML approach, the vCard in XML/RDF proposal (which I'll henceforth refer to simply as vCard/RDF) embeds vCard-type information in a larger XML document. The wrapper here, though, isn't an application-specific one (like Jabber's IM protocol). Instead, it's a general-purpose Resource Description Framework (RDF) document. RDF is a full-blown W3C Recommendation; its purpose is to encode metadata about Internet resources. In vCard/RDF's case, the resource in question is the vCard itself.
What might you want to know about a vCard, other than the contact information which it includes? At the very least, you might want to know what (or rather, who) a given vCard is about. My vCard might tell you how to get in touch with me by various means: postal and e-mail addresses, phone numbers, and so on. But that contact information doesn't lay out for you everything you might need to know about me; in short, it doesn't describe me.
vCard/RDF attacks this problem by combining, in a given document, information in the RDF namespace with information in the vCard namespace. Here's an example, taken from the vCard/RDF Note (the RDF-namespace elements and attributes are those carrying the rdf: prefix):
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#" >
  <rdf:Description
      rdf:about="http://qqqfoo.com/staff/corky">
    <vCard:FN>Corky Crystal</vCard:FN>
    <vCard:N
        rdf:parseType="Resource">
      <vCard:Family>Crystal</vCard:Family>
      <vCard:Given>Corky</vCard:Given>
    </vCard:N>
    <vCard:EMAIL
        rdf:parseType="Resource">
      <rdf:value>corky@qqqfoo.com</rdf:value>
      <rdf:type
          rdf:resource="http://www.w3.org/2001/vcard-rdf/3.0#internet"/>
    </vCard:EMAIL>
    <vCard:ORG
        rdf:parseType="Resource">
      <vCard:Orgname>qqqfoo.com Pty Ltd</vCard:Orgname>
      <vCard:Orgunit>
        <rdf:seq>
          <rdf:li>Commercialisation Division</rdf:li>
          <rdf:li>Engineering Office</rdf:li>
          <rdf:li>Java Unit</rdf:li>
        </rdf:seq>
      </vCard:Orgunit>
    </vCard:ORG>
  </rdf:Description>
</rdf:RDF>
In general, most of the RDF markup is used to describe constraints on how the contact information is structured or what sort of resource a particular datum is. (For instance, the three rdf:li elements are to be used in the order shown when referring to "Corky Crystal's" work unit; this constraint is imposed by making those elements children of an rdf:seq element.) Aside from that markup, however, note in particular the rdf:Description element:
rdf:Description's scope. While the simple rdf:RDF element does perfunctory duty as the document's true root, rdf:Description might be considered its heart and soul. The rdf:about attribute points to a resource outside this document which really tells you about Corky -- not how to get in touch with Corky, but who Corky is. (Of course, an application which cares only about contacting Corky would be free to ignore this information. But it's great to have it available, and mixing the vCard markup with RDF is what makes that availability possible.)
It's also interesting to compare this vCard/RDF sample with the Jabber vCard above. Even without considering the namespace prefixes, the vCard/RDF Note doesn't seem to be consistently tied to the element names from the earlier standard: EMAIL is EMAIL in both, but Jabber's FAMILY becomes vCard/RDF's Family.
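For an application that treats a vCard/RDF document as plain XML, pulling out the rdf:about value alongside the contact details might look something like the following Python sketch. This is an assumption-laden illustration: it uses a trimmed-down copy of the sample above, the function name describe and its dictionary keys are hypothetical, and it reads the markup with an ordinary XML parser rather than a real RDF processor (which would hand you triples instead).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD_NS = "http://www.w3.org/2001/vcard-rdf/3.0#"
NS = {"rdf": RDF_NS, "vCard": VCARD_NS}

# A shortened copy of the vCard/RDF sample (the ORG branch is omitted).
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
    <vCard:FN>Corky Crystal</vCard:FN>
    <vCard:N rdf:parseType="Resource">
      <vCard:Family>Crystal</vCard:Family>
      <vCard:Given>Corky</vCard:Given>
    </vCard:N>
    <vCard:EMAIL rdf:parseType="Resource">
      <rdf:value>corky@qqqfoo.com</rdf:value>
    </vCard:EMAIL>
  </rdf:Description>
</rdf:RDF>"""


def describe(xml_text):
    """Hypothetical helper: one dict per rdf:Description element."""
    root = ET.fromstring(xml_text)
    people = []
    for desc in root.findall("rdf:Description", NS):
        people.append({
            # Attributes in a namespace are keyed as '{uri}localname'.
            "about": desc.get(f"{{{RDF_NS}}}about"),
            "full_name": desc.findtext("vCard:FN", namespaces=NS),
            "family": desc.findtext("vCard:N/vCard:Family", namespaces=NS),
            "given": desc.findtext("vCard:N/vCard:Given", namespaces=NS),
            "email": desc.findtext("vCard:EMAIL/rdf:value", namespaces=NS),
        })
    return people


if __name__ == "__main__":
    for person in describe(SAMPLE):
        print(person)
```

The mixed-case element names (Family, Given) have to be matched exactly; XML names are case-sensitive, so code written against Jabber's FAMILY won't find vCard/RDF's Family.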
A commercial alternative
In researching this column, I came across an existing commercial contact-management package which touts XML-readiness as a feature. The application is called GoldMine, from FrontRange Solutions. (I don't claim, of course, that this is the only such package. If you know of others, feel free to use the "Comment on this Article" link below.)
While GoldMine isn't just a contact manager, managing contacts seems to be at the heart of the other things the product does. The last several versions have offered an import from/export to XML feature, specifically for transferring contact data between GoldMine itself and other applications or data sources. All that's required for importing to GoldMine is that the data conform to the expected structure. (Exported data presumably conforms to the structure without further user involvement.)
The structure in question is codified in an XML Schema document. You will probably search the FrontRange Web site in vain for this Schema -- I certainly did -- but I was able to obtain a copy of it through the generosity of FrontRange's marketing organization. While you asked specifically for a DTD, it's worth taking a look at the GoldMine Schema for insights into how a commercially successful product solves the problem (including, not insignificantly, how to handle multiple contacts in the scope of a large-scale application).
Tying it together
So the instinct implied in your question was right: you're nowhere near the first to consider using XML as a structured-data format for contact information. But you might consider broadening the question's scope a bit, by imagining something a bit more elaborate than a "closed-shop" contact-management system: how might you build a tool for translating contact information from one of these standards (or any others you can find) to any one of the others?
The obvious platform for such a tool is XSLT. I'm about out of space in this month's column to detail every issue you'd want to (ahem) address, should you decide to tackle this bigger project. Still, here are a few points to consider:
EMAIL element is an EMAIL element.)
The important thing, I think, is not to confine your imagination to the relatively static context of "an XML document" -- even a bunch of XML documents. As always with XML, the most important questions are not those dealing with the data as such, but those dealing with what to do with the data once it's in XML form -- not only what the data might be, but what it might just come to mean.
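To make the translation idea a little more concrete, here is a rough sketch of one direction only, Jabber vCard to vCard/RDF. Be warned that it is written in Python rather than the XSLT suggested above, purely so that it can stand alone; the same element-by-element mapping could be expressed as XSLT templates. The function name jabber_to_rdf, the about_uri parameter, and the example.org address are all hypothetical, and the mapping covers only the FN, N, and EMAIL fields that appear in both samples in this column.

```python
import xml.etree.ElementTree as ET

JABBER_NS = "vcard-temp"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VCARD_NS = "http://www.w3.org/2001/vcard-rdf/3.0#"

JABBER_SAMPLE = """<vCard xmlns='vcard-temp'>
  <FN>JosephUser</FN>
  <N><GIVEN>Joseph</GIVEN><FAMILY>User</FAMILY><MIDDLE/></N>
  <EMAIL><INTERNET/><PREF/><USERID>joseph@notareal.org</USERID></EMAIL>
</vCard>"""


def q(ns, tag):
    """Build ElementTree's '{namespace}tag' form of a qualified name."""
    return f"{{{ns}}}{tag}"


def jabber_to_rdf(jabber_xml, about_uri):
    """Hypothetical translator covering FN, N and EMAIL only.

    about_uri must be supplied by the caller: the Jabber vCard has
    nothing that corresponds to rdf:about.
    """
    src = ET.fromstring(jabber_xml)
    ns = {"j": JABBER_NS}

    # Ask the serializer for the familiar prefixes instead of ns0, ns1.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("vCard", VCARD_NS)

    root = ET.Element(q(RDF_NS, "RDF"))
    desc = ET.SubElement(root, q(RDF_NS, "Description"),
                         {q(RDF_NS, "about"): about_uri})
    ET.SubElement(desc, q(VCARD_NS, "FN")).text = src.findtext("j:FN", "", ns)

    # FAMILY/GIVEN become Family/Given; empty parts are dropped.
    name = ET.SubElement(desc, q(VCARD_NS, "N"),
                         {q(RDF_NS, "parseType"): "Resource"})
    for old, new in (("FAMILY", "Family"), ("GIVEN", "Given")):
        value = src.findtext(f"j:N/j:{old}", "", ns)
        if value:
            ET.SubElement(name, q(VCARD_NS, new)).text = value

    # Jabber's EMAIL/USERID maps to vCard/RDF's EMAIL/rdf:value.
    userid = src.findtext("j:EMAIL/j:USERID", "", ns)
    if userid:
        email = ET.SubElement(desc, q(VCARD_NS, "EMAIL"),
                              {q(RDF_NS, "parseType"): "Resource"})
        ET.SubElement(email, q(RDF_NS, "value")).text = userid

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(jabber_to_rdf(JABBER_SAMPLE, "http://example.org/staff/joe"))
```

Even this toy version exposes the interesting design questions: what to do with fields that exist on only one side (NICKNAME, JABBERID), and where the rdf:about resource is supposed to come from when the source format never had one.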
Comment on this Article
2004-04-27 11:42:14 intiyac
I am part of a project in which we will need to develop an RDF vocabulary for scientific experts. Although the contact information is very useful, it does not quite address the complexities of defining an XML standard for describing experts (which would include institution, academic background, current position, expertise, etc.). I wonder if such work exists, either for scientists or for professionals?
2004-05-26 15:13:27 cjhickel1
Here is a link to an OWL schema for the researchers of the DARPA/DAML organization: http://www.daml.org/researchers.owl
OWL is RDF and XML compliant. It is a start. The DAML.org site may have other useful information and leads to satisfy your project needs. Good Luck.
2004-04-02 21:30:38 Brian Watt
I would appreciate the opportunity to have a look at how the experts map contacts.
Will the link to the Goldmine xsd schema be fixed anytime soon, or should I throw out my bookmark to this page?
2004-04-05 06:35:26 John Simpson
Thanks for catching that (ditto to Robert Leif, the earlier poster, who mentioned it). I've reported it to my XML.com editors and also to help@oreillynet.com; it should be taken care of within the next day or two. I'll check periodically myself as well -- if it's not here on XML.com by then, I'll post a link to the schema on my own Web site.
JES
2004-04-19 04:38:19 TGlas
The link is still broken, and I'd really like to access the file.
The problem is a difficult one for me, and this is one of the most potentially helpful XML.com articles I've come across. However, I do not want to limit my solution to a DTD.
Please let us know where/how to get this sample, or how you'd change your answer if a schema had been requested in the first place.
2004-04-19 05:35:21 John Simpson
Send your e-mail address to me (simpson "at" polaris.net) and I'll return you a copy of the Goldmine schema. (I'm not sure what's going on with the article link -- it is supposedly being fixed.)
JES
2004-04-01 12:44:06 Robert Leif
The link is broken. RDF should be eliminated.
Robert C. Leif,
rleif@rleif.com
2004-04-01 08:58:41 Harry Criswell
I've only read about FOAF, but doesn't FOAF overlap the vCard/RDF "about" section? How many ways do we need to represent a person's / contact's attributes? Hopefully the 4 different XML structures mentioned, and others that may exist, can be rolled into one "standard" that covers all the same attributes of a contact.
It's just like the versions of RSS, which were developed separately and so have very different structures. It would be easier to support one definitive standard that followed a controlled path of enhancement over succeeding versions than several different versions developed without regard for the existing format / structure of the others.



