From RDF to Topic Maps and back again
If you want to share not just content, but your knowledge of a particular domain, there are two standards that can provide the means: RDF and Topic Maps. The two developed independently of each other, but the first survey of how to map knowledge in one standard to the other has now been published. It'll be the first step towards a standardised RDF - Topic Map interoperability guideline.
Though ISO's Topic Map standard and the W3C's RDF were developed for broadly similar purposes and at the same time, there are some significant differences between them. Where RDF is meant to be a building block in machine processable descriptions of anything, from blog feeds to web service interfaces, Topic Maps are specifically designed to guide humans through the 'infoglut'. In their makers' soundbites, RDF "allows anyone to say anything about anything" where Topic Maps will be the "GPS of the information universe".
There has been a predictable bit of grumbling on either side of the fence, once it became clear how close both efforts were. Still, the fact that the present survey is published by the W3C, but headed by the leader of the ISO Topic Map group, already indicates that there's a concerted push to reconcile the two approaches.
One look at the raw (XML) representation of both standards shows the differences in depth. Topic Maps are complex-looking webs of relatively readily understood elements: topics, associations between topics, occurrences of the topic and so on. Raw RDF is more like a large number of very simple, uniform statements, with three fairly abstract parts each, depending on the vocabulary used.
Not that the good looks of naked source should make any difference to the casual user, even if the interoperability of both standards should.
Achieving that interoperability proves to be rather more involved than may, at first, be expected. The abstract, universal and low level nature of RDF would suggest that it should be fairly straightforward to capture the meaning of any given Topic Map.
In one sense, this is indeed the case. There are several ways to automatically map the semantics of one Topic Map to a bunch of RDF statements, and, preferably, back again. Trouble is, such an object level mapping results in a very large pile of rather unnatural looking RDF.
The reason for that strangeness is that such a mapping only happens at the object level: it's a 'blind' translation of everything that is stated in a specific topic map to something that validates as proper RDF. The resulting RDF ignores any generalities in how Topic Maps structure information, and ignores how people normally structure information in RDF as well. Result: the RDF is practically unusable in ordinary RDF processing tools and query languages. Similar issues arise when going from RDF to Topic Maps at that level.
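To give a feel for why object level output looks so odd, here is a minimal sketch in Python. It is not taken from the survey or from any real tool: the `http://example.org/tm#` vocabulary, the `object_level_triples` function and the Puccini example data are all made up for illustration. It translates a single association 'blindly', by turning every Topic Map construct into a node of its own.

```python
from itertools import count

# Hypothetical vocabulary for the illustration; not a published namespace.
TM = "http://example.org/tm#"


def object_level_triples(association):
    """Blindly translate one association into generic RDF-style triples.

    Every construct (the association, each role) becomes its own node,
    described with generic 'topic map' properties.
    """
    ids = count(1)
    assoc_node = f"_:a{next(ids)}"
    triples = [
        (assoc_node, "rdf:type", TM + "Association"),
        (assoc_node, TM + "type", association["type"]),
    ]
    for role_type, player in association["roles"]:
        role_node = f"_:r{next(ids)}"
        triples += [
            (assoc_node, TM + "role", role_node),
            (role_node, TM + "roleType", role_type),
            (role_node, TM + "player", player),
        ]
    return triples


# Made-up example data: "Tosca was composed by Puccini".
composed_by = {
    "type": "ex:composed-by",
    "roles": [("ex:composer", "ex:puccini"), ("ex:work", "ex:tosca")],
}

blind = object_level_triples(composed_by)
for triple in blind:
    print(triple)

# What someone modelling directly in RDF would more likely write:
natural = [("ex:tosca", "ex:composed-by", "ex:puccini")]
print(len(blind), "triples versus", len(natural))
```

The blind version is valid, and it can be turned back into the original association, but a query for everything composed by Puccini now has to walk through the anonymous association and role nodes rather than match a single statement.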
The other approach is to come up with a higher level, more semantic mapping between the two standards. This approach focusses on how you would generally model the semantics of either standard in the other, in such a way that the result is in line with each standard's approach to modelling information generally.
That is also possible, and has already been implemented in tools such as the Ontopia Topic Map engine. Trouble there is that it is hard to both automate the process and not lose meaning every time a model is mapped from a Topic Map to RDF and back again. The reasons for that difficulty are enlightening beyond the narrow question of interoperability between the two approaches.
Identifiers and relations
The very existence of an interoperability issue between two supposedly universal ways of describing things in a machine readable way indicates that the technology will be influenced, to at least some extent, by how its makers understand the world. Less philosophically, and more practically: both will allow you to say almost whatever you wish about whatever is important to you and your community and exchange that information. But both have to make subtle assumptions about the world which will colour your descriptions.
This is particularly apparent in the vexed question of identifiers. This is fundamental to semantic interoperability, since, at the minimum, I must be able to determine that what you're talking about is the same thing as what I'm talking about. If you put a nice description of "Einstein" on the web, then I need a way to figure out that that refers to what I think "Einstein" is.
RDF simply says that an RDF statement must use a URI for that purpose. What that URI looks like, and whether it points to something directly in the online world is essentially up to the implementor. Topic Maps, however, make a subtle but important distinction between subject identifiers and subject locators. The first is just a means of determining equivalence; the second does that, but actually points to an online resource too. Put very bluntly, "Einstein" is a dead person, which is not the same thing as a webpage about him, not even a generally agreed online resource that is supposed to stand for him. Topic Maps force you to recognise the distinction, but RDF does not.
As a result, a decision has to be made whenever such identifiers are mapped from one to the other. There is not necessarily any information in an RDF fragment to indicate whether the URIs in it are resolvable to anything, nor is there an agreed way to preserve the Topic Map distinction between locators and pure identifiers when going the other way.
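The loss can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the `Topic` class, the two helper functions, the `locator_hints` parameter and the example.org URIs are hypothetical, and no real Topic Map or RDF library is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # URIs that merely identify the subject (e.g. a person).
    subject_identifiers: set = field(default_factory=set)
    # URIs of online resources that ARE the subject (e.g. a web page).
    subject_locators: set = field(default_factory=set)


def topic_to_rdf_uris(topic):
    """Going to RDF: both kinds of URI end up as plain URIs."""
    return topic.subject_identifiers | topic.subject_locators


def uris_to_topic(name, uris, locator_hints=frozenset()):
    """Going back: only outside information says which URIs are locators."""
    locators = {uri for uri in uris if uri in locator_hints}
    return Topic(name, set(uris) - locators, locators)


# Hypothetical example URI, for illustration only.
page = Topic(
    "Web page about Einstein",
    subject_locators={"http://example.org/pages/einstein.html"},
)

uris = topic_to_rdf_uris(page)
naive = uris_to_topic(page.name, uris)
hinted = uris_to_topic(page.name, uris, locator_hints=page.subject_locators)

print(naive == page)   # the locator came back as a mere identifier
print(hinted == page)  # the extra hint restores the distinction
```

Without the hint, the round trip quietly turns the web page into something that is only identified by its address, which is exactly the confusion between Einstein and a page about Einstein that Topic Maps set out to avoid.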
A similarly deep issue is that of relations. RDF is built on directional binary relations, which means that it models a resource as a property of another resource, but not necessarily the other way round. A grossly simplified RDFish statement could be:
Wilbert likes apple
Given the right identifiers and vocabulary, that would state that I'm fond of that particular fruit. But it doesn't state directly that apples are liked by me. There are mechanisms that allow you to infer that fact, but it's not directly stated. A Topic Map with a 'likes' association between the topics 'Wilbert' and 'apple' states both things, guaranteed.
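The difference between the two can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data structures, the role names 'liker' and 'liked', and the `isLikedBy` property are invented here, and the inverse table merely stands in for the kind of inference mechanism mentioned above (OWL's inverse property declarations are one such mechanism).

```python
# RDF-style: one directed statement, subject -> property -> object.
triples = {("Wilbert", "likes", "apple")}

# Topic Map-style: one association in which both topics play a named role.
association = {
    "type": "likes",
    "roles": {"liker": "Wilbert", "liked": "apple"},
}


def involves(assoc, topic):
    """An association can be reached from any of its role players."""
    return topic in assoc["roles"].values()


# Hypothetical inverse declaration, kept outside the statements themselves.
INVERSES = {"likes": "isLikedBy"}


def with_inverses(statements, inverses):
    """Add the reverse statement for every property with a declared inverse."""
    inferred = {
        (obj, inverses[prop], subj)
        for (subj, prop, obj) in statements
        if prop in inverses
    }
    return statements | inferred


print(("apple", "isLikedBy", "Wilbert") in triples)
print(("apple", "isLikedBy", "Wilbert") in with_inverses(triples, INVERSES))
print(involves(association, "Wilbert"), involves(association, "apple"))
```

The reverse statement only appears in the RDF-style data once the inverse declaration has been applied, whereas the association can be found from either topic without any further help.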
From a logical perspective, the Topic Map association makes perfect sense, and it also allows software to draw all sorts of interesting and pertinent links for its users in a straightforward way. Whether humans always think that logically is a different matter, however. Metaphors, for example, can be expressed directly in RDF: "man is a wolf" is perfectly doable. Not so in Topic Maps, because a metaphor is not necessarily intended to work the other way: "wolf is a man" is a different proposition [1].
Solving the puzzle
There are a number of other, slightly more tractable differences between RDF and Topic Maps. What they all have in common, though, is that mapping them to useful constructs in the other approach, without loss of meaning, requires information that is external to either the standards or the documents themselves.
In other words, a human needs to look at a particular set of Topic Maps or RDF documents, think about it for a bit, and then write some extra information somewhere before a machine can successfully map from one set to the other. Not a problem per se, just one that needs some agreement.
The W3C's Survey mentions a couple of ad hoc solutions that people have tried, but is curiously brief about one quite obvious solution: an ontology in the W3C's OWL. That language offers a couple of levels of increasingly sophisticated and powerful definitions of exactly how an instance of one thing is related to another. An RDF-to-Topic Map ontology ought, therefore, to be able to spell out exactly how, say, Topic Map associations relate to equivalent statements in RDF. It would also leave instances of translated documents relatively straightforward: a Topic Map identifier could be treated as a regular URI in its RDF equivalent, until some programme or human is sufficiently bothered to actually go out and consult the ontology for the finer distinction (e.g. when translating the RDF back to Topic Maps).
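What such externally held mapping information might buy you can be sketched in Python. To be clear about the assumptions: the `MAPPING` table below is a plain dictionary standing in for what an ontology could state, not OWL itself, and all the `ex:` names and both functions are hypothetical.

```python
# Hypothetical mapping rules, written down once by a human and kept
# outside both the RDF documents and the Topic Maps.
MAPPING = {
    "ex:likes": {
        "association": "ex:liking",
        "subject_role": "ex:liker",
        "object_role": "ex:liked",
    },
}


def triple_to_association(triple, mapping):
    """Turn an RDF-style statement into a typed association with roles."""
    subj, prop, obj = triple
    rule = mapping[prop]
    return {
        "type": rule["association"],
        "roles": {rule["subject_role"]: subj, rule["object_role"]: obj},
    }


def association_to_triple(assoc, mapping):
    """Turn the association back into the single statement it came from."""
    for prop, rule in mapping.items():
        if rule["association"] == assoc["type"]:
            roles = assoc["roles"]
            return (roles[rule["subject_role"]], prop, roles[rule["object_role"]])
    raise KeyError(f"no mapping rule for association type {assoc['type']!r}")


statement = ("ex:wilbert", "ex:likes", "ex:apple")
assoc = triple_to_association(statement, MAPPING)
round_trip = association_to_triple(assoc, MAPPING)

print(assoc)
print(round_trip == statement)
```

Both directions produce something that looks natural in the target standard, and nothing is lost on the way back, but only because somebody wrote the rule for 'likes' down first.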
Since the W3C survey is just a working draft, and since an OWL-based solution would require some thought if it is to be applied to Topic Maps, it is likely that the solution will be more fleshed out in the near future. More generally, it is pretty obvious from both sides of this fence that there is a real commitment to making RDF - Topic Map interoperability work.
The latest version of A Survey of RDF/Topic Maps Interoperability Proposals is available from the W3C. They also have the RDF Primer and a handy OWL FAQ for more information about those languages.
Ontopia hosts The TAO of Topic Maps: Finding the Way in the Age of Infoglut. They also have an online and downloadable version of their Omnigator Topic Map and RDF browser.
[1] Whether you can actually process metaphors in a machine is a different and truly vast question, but at least you can state them in RDF.