W3C releases major Semantic Web building blocks
The ambitious, five-year-old vision of the Semantic Web is a major step closer following the release of the Resource Description Framework (RDF) and the Web Ontology Language (OWL) as full World Wide Web Consortium (W3C) recommendations. We assess what it is and what it might mean for e-learning.
Almost every e-learning specification or standard rests on a stack of lower-level specifications, mostly from the W3C. These range from the most visible, (X)HTML, the language in which this web page reached your browser, to the most basic, eXtensible Markup Language (XML), the language in which the majority of e-learning specifications are expressed. What is possible with W3C technologies, therefore, constrains and enables what is possible with e-learning to a pretty considerable extent.
For some years now, the Next Big Thing on the web has, as far as the W3C is concerned, been the Semantic Web. It is a set of interrelated technologies designed to enable an 'intelligent' web: one which is not just a huge, unstructured pile of text and multimedia for humans to admire, but one where both humans and machines can search and interact with resources in informed ways.
HTML, XML, RDF and OWL
RDF has been the cornerstone of the Semantic Web vision, because it allows almost anything to be described. The things described are primarily things on the web (such as learning objects), but an RDF 'triple' can also point to objects in the real world.
To understand it properly, it is handy to have a look at the main technologies that make up the web today. HTML was there at the beginning, and it continues to serve its one purpose (displaying formatted text with hyperlinks) reasonably well. But that is all it does: the set of machine-understandable things it can say (the datamodel) is fixed and limited to displaying text, embedded pictures and links. The way it expresses what it can say (its syntax) is also fixed.
To overcome these limitations, XML was invented. It provides a means by which any datamodel can be expressed. Just define what can and needs to be said, and put that in a template: either a Document Type Definition (DTD) or a set of XML Schema Definitions (XSDs). As long as the template is followed in any document that claims to stick to a particular DTD or XSD, programmes can roughly predict what will be in it, and thereby what to do with it. Hence things like IMS Content Packaging: it contains a manifest in XML that captures the structure of a learning object's resources and expresses it in a way that a machine can understand and act on, provided the machine 'knows' in advance what an IMS Content Package looks like.
But what about those cases where it will be difficult or impossible to predict what might be in a document? As XML applications such as Content Packaging proliferate, machines can't know about all XML formats. Even different versions of the same XML application can cause a headache. Hence an increasing demand for something that requires a machine to know only very general principles, but still provides meaningful structure.
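To make the 'knows in advance' point concrete, here is a minimal sketch in Python. The manifest below is a simplified, namespace-free stand-in invented for illustration; its element and attribute names are assumptions and not the actual IMS Content Packaging schema. The hypothetical function list_resources only works because the structure is agreed beforehand.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a manifest. The element names are illustrative
# assumptions, not the real IMS Content Packaging schema.
MANIFEST = """
<manifest identifier="course-1">
  <resources>
    <resource identifier="r1" href="intro.html"/>
    <resource identifier="r2" href="quiz.html"/>
  </resources>
</manifest>
"""

# A document in some other XML format the program has never been told about.
UNKNOWN_FORMAT = """
<package>
  <item id="r1" location="intro.html"/>
</package>
"""


def list_resources(xml_text):
    """Return (identifier, href) pairs, relying on the agreed structure."""
    root = ET.fromstring(xml_text.strip())
    return [
        (res.get("identifier"), res.get("href"))
        for res in root.findall("./resources/resource")
    ]


known = list_resources(MANIFEST)          # the two resources are found
unknown = list_resources(UNKNOWN_FORMAT)  # nothing is found
```

The second call finds nothing: the information is there, but the code has no way of recognising it without being taught the new format.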
RDF and its uses
Step forward, RDF. It has a simple, natural language-like subject, predicate and object syntax, but that is still enough to make meaningful statements. Such as: this web page (subject) has a creation date (predicate) of 13 February 2004 (object). Add the ability to combine several such triples in interesting ways, and the ability of triples to say something about other triples (e.g. that the above triple has an author: John Smith), and we're getting somewhere.
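The subject, predicate and object idea can be sketched with nothing more than Python tuples. This is a toy model and not an RDF library or RDF syntax: the example.org address, the predicate names and the helper objects_of are all assumptions made up for illustration.

```python
from collections import namedtuple

# A toy triple: just three slots, as in the prose above.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# "This web page has a creation date of 13 February 2004."
page_created = Triple("http://example.org/page", "creationDate", "2004-02-13")

# A statement about a statement: the triple above has an author.
attribution = Triple(page_created, "author", "John Smith")

graph = {page_created, attribution}


def objects_of(graph, subject, predicate):
    """Return every object recorded for a given subject and predicate."""
    return {
        t.object
        for t in graph
        if t.subject == subject and t.predicate == predicate
    }


creation_dates = objects_of(graph, "http://example.org/page", "creationDate")
statement_authors = objects_of(graph, page_created, "author")
```

Because a triple can itself sit in the subject slot, the same lookup function answers questions about the page and about the statement made about the page.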
Take, for example, the IEEE Learning Object Metadata (LOM) standard. It defines what can be said about learning objects, and is usually expressed in XML. But the IEEE Learning Technology Standards Committee (LTSC), led by Mikael Nilsson, is also busy working on an RDF version ('binding').
This enables such things as extending what a LOM record says as and when it is needed. That is, if you find that, after cataloging thirty thousand learning objects, you need to say something about the thirty thousand and first that is not easily accommodated in the LOM as it is, you can still add it without worrying too much about mucking up the systems that you've already got.
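A rough sketch of why such an extension need not upset existing systems, again using plain tuples and no actual LOM or RDF tooling. The identifier lo:30001, the predicate names and the function index_record are hypothetical: an existing system that only looks for the predicates it knows simply skips the new one.

```python
# Predicates the existing (hypothetical) catalogue system understands.
KNOWN_PREDICATES = {"title", "language"}

# The record as originally catalogued.
record = [
    ("lo:30001", "title", "Introduction to RDF"),
    ("lo:30001", "language", "en"),
]

# The same record with one extra statement the original scheme never foresaw.
extended_record = record + [("lo:30001", "labSafetyLevel", "2")]


def index_record(triples, known=KNOWN_PREDICATES):
    """Build the old system's view of a record, ignoring unknown predicates."""
    return {pred: obj for _subj, pred, obj in triples if pred in known}


old_view = index_record(record)
new_view = index_record(extended_record)
```

Both calls produce the same result for the old system, while a newer tool is free to read the additional statement.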
Also, because of the ability of RDF to say something about other RDF documents, it becomes easy to say something about an existing record or extend it. To realise how valuable that is, just think back to the days when people still believed that sticking a few keywords in the HTML of a web page ('metatags') was good enough: every webmaster and his dog stuck in whatever they thought would drive most traffic to the page. Never mind consistency or accurate description. The ability to let others either extend or contradict a metadata record to reflect what they think is useful or accurate sidesteps that issue by allowing users to decide whom to trust.
But even RDF has some limitations. Though it forces people to stick to a kind of predictable meta-datamodel (the subject, predicate, object business), a lot depends on specific vocabularies that fill in the three slots. And those vocabularies will necessarily be specific to a community. What I mean by 'creation date' or 'author' might be subtly different from what you mean.
OWL addresses this by going one step further than RDF on its own: you can make defining statements with it. That is, you can make a machine understandable statement about what a vocabulary item means. For example, that an "author" is a kind of "human". And, in a separate statement, that John Smith is a "human". That sounds like the bleeding obvious to humans, but it can help machines reason about things they otherwise haven't got a clue about.
To take the author example, if my OWL vocabulary's statement asserts that "authors" are "humans", and nothing else, a machine can conclude that searching for "Royal Institute of Technology" in the author field of any of my metadata records is not going to be of any use, since someone else has probably asserted that "Royal Institute of Technology" is a kind of "group of people", and not a "human". Most intriguingly, it becomes possible to assert equivalences between different vocabularies as well: my vocabulary may have an "affiliation" item. If somebody has an OWL statement that expresses the relation between "author" and "affiliation" in the right way, the machine can still search my records on authors from the Royal Institute of Technology.
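The author example can be mimicked in a few lines of Python. This is a toy illustration of the reasoning, not OWL and not a real reasoner: the dictionaries, the function could_be and in particular the rule that "human" and "group of people" exclude each other are assumptions added to make the example work.

```python
# "An author is a kind of human."
SUBCLASS_OF = {"author": "human"}

# Assumption for this sketch: nothing is both a human and a group of people.
DISJOINT = {frozenset({"human", "group of people"})}

# What various parties have asserted about individuals.
TYPE_OF = {
    "John Smith": "human",
    "Royal Institute of Technology": "group of people",
}


def could_be(individual, cls):
    """Return False only when the known facts rule the membership out."""
    required = SUBCLASS_OF.get(cls, cls)  # e.g. author -> human
    actual = TYPE_OF.get(individual)
    if actual is None:
        return True  # nothing known, so nothing is ruled out
    return frozenset({actual, required}) not in DISJOINT


smith_ok = could_be("John Smith", "author")
institute_ok = could_be("Royal Institute of Technology", "author")
```

With these facts, a search tool could skip the author field when asked about the Royal Institute of Technology, while still treating John Smith as a plausible author.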
Despite the W3C's best efforts, most of this RDF and OWL goodness is still some way off, though. True, the RDF Site Summary (RSS) specification is in very widespread use across the web, but other applications are still gathering steam.
One of the challenges there is one familiar to any metadata venture: who is going to make the metadata, and why. Publishers have a motive, but, as the metatags example shows, may not always be the best metadata creators. There are librarians and other information professionals as well, but there's only so many to go 'round. Much of the hope, then, lies in tools that capture as much as possible automatically, when a resource is created.
Another challenge that is specific to RDF is that it is just so damn hard to understand. The principles are easy enough, but the practice, the full theory and the range of related concepts (RDF Schema, RDF Semantics etc.) are not for the faint-hearted. The positive spin is that it might just need more applications for people to get a clue. Explaining the concepts of the web to someone who's never seen it would be pretty difficult too, after all.
Lastly, there may be some danger of the technologies getting in each other's way. OWL and RDF (plus related specifications) are largely complementary and all are W3C specs, but there is some overlap. On a human level, the tone of some of the comparisons of OWL to RDF in the OWL Web Ontology Language Use Cases and Requirements document is quite revealing. The OWL people don't think much of RDF, it seems.
For all the details of the new OWL and RDF specs, there's an extensive press release from the W3C.
For more about the RDF binding of the LOM, see the knowledge management group of the centre for user oriented IT design at the Swedish Royal Institute of Technology.