Introduction: why we need AMG, first version, and redesign

Why we need Automatic Metadata Generation (AMG)

One of the main concerns in learning technology research is the problem of acquiring the critical mass to establish real reuse. There are several aspects to the solution of this problem: some projects focus on the creation of content and how this can be made easier, faster or cheaper. Other projects focus on interoperability aspects. The aspect we focus on is the creation of metadata. Without appropriate metadata no learning content will be really reusable because it will be difficult or impossible to identify and retrieve it. Although a lot of progress has been achieved in the field of metadata, there are still problems in the actual creation of it:

  • the problem of achieving critical mass,
  • the problem of learning objects having only a very limited set of metadata associated with them.

A possible solution to these problems is the automatic creation of learning object metadata. In this way, users do not have to bother with the metadata if they do not want to. This can be compared with search engines on the web, which index web pages in the background without any intervention by the creator or the host of the site.



First version of our AMG framework

In the first phase of our work, we wanted to quickly create a more or less working implementation of a system that can do automatic metadata generation. We built a general framework for this task and tried it on a number of case studies. The results are reported in a number of publications, e.g. at WWW2005 and Edmedia2005. However, this first version suffered from a number of limitations, among them that we were limited to one application profile of LOM and that the developed web services were not really interoperable between platforms.



Rationale behind our work on automatic metadata generation and redesign of the first version

The limitations mentioned above had to be dealt with, because interoperability and cooperation between metadata generation systems are an important goal for us. One of the things we aim at is the realisation of a system that one could call "federated AMG", which is illustrated in the figure below. The figure shows several of what we call SAmgI installations, i.e. systems that do "some form of metadata generation" (see below for an explanation of the name SAmgI). Examples are a Java web service running on Tomcat, a .NET web service running on a .NET server, or a (closed) Learning Management System that does some metadata generation for its contents.
The idea is that a client can call each of those installations and let each do a part of the metadata generation job. The results of those systems can then be combined into one global metadata instance. In the future one could imagine a "Federated AMG engine" that does this job of contacting several installations and combining their results.
This can be compared with the way federated search services work: a client sends a query to a federated search engine, which is then responsible for forwarding the query to several targets and for combining their results. An advantage of such a federated engine is that it becomes the engine's responsibility to deal with installations that, e.g., use a different query language or a different results format. This way, the client is relieved of this burden.

Federated AMG: conceptual design
Figure: The idea of cooperating systems, that each do part of the metadata generation job,
and that are combined to generate a final metadata instance for learning objects.
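The combination step described above can be illustrated with a minimal sketch. It is not part of the SAmgI specification: the function name `federated_generate`, the flat field names, the stand-in installations, and the "first non-empty value wins" merge policy are all assumptions made for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical types: an installation is modelled as a function that takes
# the URL of a learning object and returns partial metadata as a flat dict.
Metadata = Dict[str, str]
Generator = Callable[[str], Metadata]


def federated_generate(url: str, generators: List[Generator]) -> Metadata:
    """Ask every installation for partial metadata and merge the results.

    Assumed merge policy: the first non-empty value for a field wins,
    so installations listed earlier take precedence.
    """
    merged: Metadata = {}
    for generate in generators:
        try:
            partial = generate(url)
        except Exception:
            # A failing or unreachable installation should not abort the job.
            continue
        for field, value in partial.items():
            if value and field not in merged:
                merged[field] = value
    return merged


# Stand-ins for real installations (e.g. a Java or .NET web service, an LMS).
def file_based(url: str) -> Metadata:
    return {"technical.format": "application/pdf", "general.title": ""}


def lms_based(url: str) -> Metadata:
    return {"general.title": "Intro to Metadata", "technical.format": "text/html"}


def broken(url: str) -> Metadata:
    raise ConnectionError("installation unreachable")


result = federated_generate(
    "http://example.org/doc.pdf", [file_based, broken, lms_based]
)
```

In this sketch the client only sees `federated_generate`. Differences between installations, such as another results format, would be hidden inside each stand-in function, much as a federated search engine hides the differences between its targets.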

Because we want to allow people to create their own Automatic Metadata Generation implementations that can all be used together, one of the crucial points is to specify the interface of AMG installations to the outside world. We have given this aspect a great deal of attention, resulting in a specification that we call the Simple AMG Interface, i.e. SAmgI. The name was chosen by analogy with the Simple Query Interface, i.e. SQI, which tries to define a unified interface for searching. A more in-depth overview of the architecture and the available methods is given on a separate page.







Try it out!

Old version:

The following allows you to test some case studies of the framework. The "File" tab allows you to upload a file; the result will be metadata for the file as such, without any context. The "General URL" tab does the same for online documents, so you can put a link to an online document there.


Local file (browse the file system; file sizes < 50 kB; MS Office files tend to work better in '97 format):
Online file (give URL):
Teachnet URL (note that optimization for Scoilnet URLs is currently disabled due to changes in the site structure of Scoilnet.ie):
leMill URL:


New version (beta):

We just released a new version of the SAmgI framework. This is a major redesign. The new SAmgI is available to test.