January 17, 2005
Course Definitions in XML/RDF: First Steps
I've started work recently on mapping requirements for course definitions into an RDF vocabulary at the behest of a new working group on course information. The group is being chaired by Mark Stubbs of Manchester Metropolitan University, with quite a few colleges and universities either expressing an interest or attending the first meeting in Manchester the other week. The process is still informal, and I've come up with some initial proposals.
For a kickoff, the actual requirements are coming from the grassroots here - these are people with real issues of how to manage information about the curriculum, share it with partners, and advertise courses. It's a real problem, and many of them have already had to develop their own data models and schemas to fill the gap while we've all been off developing standards for "learning objects" :-)
Luckily, a lot of the information in the UK is already fairly constrained by the requirements of UCAS (our national University application service) and the reporting agencies such as QAA, which means the source data has already converged quite a bit to fit through this "funnel".
Looking at the requirements I've been drawn towards a very simple approach based on a combination of RDF and Dublin Core, rather like RSS v1.0. RDF is useful for mapping properties to resources, and aggregating multiple documents into a single store (for example, where the pieces of a course definition originate separately with faculty, administration, and marketing units, and have to be joined up into one "prospectus"). Dublin Core is generally very useful for describing most aspects of a resource. The critical piece at the moment is the vocabulary of high-level terms that the descriptions are attached to, such as prerequisites, course aims etc.
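To make the aggregation point a bit more concrete, here is a rough sketch in plain Python of how statements about one course from separate units might be merged into a single store. It doesn't use a real RDF library; the triples are just tuples, and the course URI, the `COURSE` namespace, and the sample values are all made up for illustration.

```python
# A toy illustration of RDF-style aggregation using plain (subject, predicate, object) tuples.
# The COURSE namespace and the course URI are hypothetical, not part of any agreed vocabulary.
DC = "http://purl.org/dc/elements/1.1/"
COURSE = "http://example.org/course-vocab#"
course = "http://example.ac.uk/courses/bsc-computing"

# Each unit publishes its own statements about the same course URI.
faculty = {
    (course, DC + "title", "BSc Computing"),
    (course, COURSE + "level", "undergraduate"),
}
administration = {
    (course, DC + "title", "BSc Computing"),  # same statement as faculty
    (course, COURSE + "duration", "4 years FTE"),
}
marketing = {
    (course, DC + "description", "A hypothetical course description."),
}


def merge(*graphs):
    """Union the graphs; identical triples collapse into one."""
    store = set()
    for graph in graphs:
        store |= graph
    return store


def describe(store, subject):
    """Collect every property value the store holds for one subject."""
    properties = {}
    for s, p, o in store:
        if s == subject:
            properties.setdefault(p, set()).add(o)
    return properties


store = merge(faculty, administration, marketing)
prospectus = describe(store, course)
```

Because all three sources use the same URI for the course, the merged store can answer "what do we know about this course?" without any unit needing to know about the others.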
I did consider IMS Enterprise, but the data model isn't quite right, and I don't think the SOAP protocol model is appropriate to the problem.
My idea at the moment is to create an RDF vocabulary so that anyone can subscribe to course definitions via a URL, pretty much the same as using OAI or RSS. Eventually it should be possible to discover learning opportunities in a similar way to resources. I've read through 5 different schemas so far, including a spec from Norway called Course Description Metadata (CDM) which was very interesting in terms of domain coverage, but I think went a little too deep into the details, especially on syntax - I think practical interoperability can happen with something simpler.
Of course, not everyone likes RDF, so it may be necessary to do the whole thing in XML, which would work too, albeit a bit less elegantly, requiring a whole load of extra element definitions to fill gaps.
So far this is what I've got:
- Class: Course
- type (e.g. programme, module)
- level (level of study)
- dc:title (course title)
- dc:description (course description)
- image (an image, possibly for marketing)
- dc:subject (keywords for searching)
- begin (when it starts)
- end (when it ends)
- duration (how long, e.g. "4 years FTE")
- dc:relation (link to related information, such as course homepage, reading list, or RSS feed)
- awarded-by (the institution issuing awards for the course)
- taught-by (an institution teaching the course)
- contact (a person to contact about the course)
- attendance mode
- topic (i.e. a part of the syllabus)
- review (outcome of quality reviews etc)
- location (where the course meets)
- expenses (fees, grants etc)
- dc:hasPart (other courses that are part of this course)
- dc:isPartOf (other courses of which this is a part)
All of the properties from "awarded-by" onwards are containers for DC metadata or references to external files (e.g. vCard). They are all multiple-optional. Eventually there may need to be specialised vocabularies for some of these properties, for example to describe qualifications and credits. However, just being able to exchange titles and descriptions of these things will be a big step forward, and a solid basis on which to add in other models as the need arises.
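As a very rough idea of what a description using this list might look like on the wire, the sketch below builds a small RDF/XML document with Python's standard `xml.etree.ElementTree`. The `crs` namespace URI, the course URI, and the sample values are placeholders I've invented for the example; nothing here is an agreed schema. It shows one container property holding DC metadata ("taught-by") and one pointing at an external file ("contact").

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
CRS = "http://example.org/course-vocab#"  # hypothetical namespace for the course terms

ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("crs", CRS)


def q(namespace, local):
    """Build an ElementTree qualified name of the form {namespace}local."""
    return "{%s}%s" % (namespace, local)


def build_course(uri, title, description, taught_by_title, contact_uri):
    """Return an rdf:RDF element describing a single course."""
    root = ET.Element(q(RDF, "RDF"))
    course = ET.SubElement(root, q(CRS, "Course"), {q(RDF, "about"): uri})
    ET.SubElement(course, q(DC, "title")).text = title
    ET.SubElement(course, q(DC, "description")).text = description

    # "taught-by" as a container for DC metadata about the institution.
    taught_by = ET.SubElement(course, q(CRS, "taught-by"))
    institution = ET.SubElement(taught_by, q(RDF, "Description"))
    ET.SubElement(institution, q(DC, "title")).text = taught_by_title

    # "contact" as a reference to an external file, e.g. a vCard.
    ET.SubElement(course, q(CRS, "contact"), {q(RDF, "resource"): contact_uri})
    return root


root = build_course(
    uri="http://example.ac.uk/courses/bsc-computing",
    title="BSc Computing",
    description="A hypothetical course description.",
    taught_by_title="Example University",
    contact_uri="http://example.ac.uk/people/admissions.vcf",
)
xml_text = ET.tostring(root, encoding="unicode")
```

Printing `xml_text` gives a single `rdf:RDF` document that could be served from a URL for others to subscribe to, in the same spirit as an RSS 1.0 feed.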
There are a lot of properties in the list, more than I'd prefer really - I don't like big vocabularies of data elements - but I can't see any contenders for relegation, especially as all the example data I've looked at covers pretty much the whole lot. At least the majority of any data will be DC, so the semantics are mostly within the domain rather than straying into other areas covered by existing specs.
If you're interested in this work, drop by the CETIS Enterprise SIG, who are hosting the effort, or drop me an email.