 Automatic Metadata Generation 

MELT created an enrichment portal for use by the project's experienced indexers. The SAmgI automatic metadata generator (developed at KULeuven) was used in the portal to provide additional data on top of the metadata already present.

SAmgI in the MELT enrichment portal

In MELT, two separate case studies demonstrated the possibilities of automated metadata generation. One was for Scoilnet, the Irish national portal, where metadata was automatically generated with SAmgI for more than 10,000 learning objects. The other was LeMill, an online community for finding, authoring and sharing learning resources, where metadata was generated automatically for 650 learning objects.

These cases demonstrated that automatic metadata generation works best when object-based generation and context-based generation complement one another well.

  • If, for instance, the object-based generator cannot generate sufficient content, the resulting metadata will most likely not describe the learning material well. This can be due to a lack of metadata generators for a certain MIME type or to limitations of the generators themselves. For instance, a minimal amount of text is required to detect the language or to generate keywords and descriptions.
  • On the other hand, if only limited context-based information is available, it becomes less likely that features that are typically hard to extract from the content, such as duration, difficulty or intended user role, will be generated for the learning object.
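The interplay between the two kinds of generators can be sketched as follows. This is a minimal illustration under stated assumptions, not SAmgI's actual implementation: the function names, the field names, the 20-word threshold and the naive keyword heuristic are all hypothetical and chosen only to show how context-based values can fill the gaps left by an object-based generator that has too little text to work with.

```python
from collections import Counter
from typing import Dict

# Hypothetical threshold: with fewer words than this, text analysis is skipped.
MIN_WORDS_FOR_TEXT_ANALYSIS = 20


def object_based_metadata(text: str) -> Dict[str, str]:
    """Derive fields from the content itself (illustrative heuristic only)."""
    words = text.split()
    if len(words) < MIN_WORDS_FOR_TEXT_ANALYSIS:
        return {}  # too little text to generate anything reliable
    # Naive keyword guess: the most frequent words longer than five characters.
    counts = Counter(w.lower().strip(".,;:!?") for w in words if len(w) > 5)
    keywords = [w for w, _ in counts.most_common(3) if w]
    return {"keywords": ", ".join(keywords)} if keywords else {}


def context_based_metadata(context: Dict[str, str]) -> Dict[str, str]:
    """Pick up fields that are hard to extract from the content itself."""
    wanted = ("duration", "difficulty", "intended_user_role")
    return {key: context[key] for key in wanted if key in context}


def merge_metadata(object_md: Dict[str, str],
                   context_md: Dict[str, str]) -> Dict[str, str]:
    """Combine both sources; context values only fill fields still missing.

    Giving object-based values precedence is an arbitrary choice made here.
    """
    merged = dict(object_md)
    for key, value in context_md.items():
        merged.setdefault(key, value)
    return merged
```

For a very short learning object, `object_based_metadata` returns an empty dictionary, so the merged record contains only what the context supplies. Conversely, with an empty context the record contains keywords but no duration or difficulty.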

In general, context-based annotation works well if the learning objects are offered within a certain structure. This can be a file hierarchy, a structured web site (see the LeMill case) or a connection to an LMS. For example, if information about rights, where present at all, always occurs in the footer or header of a web page, it is easier to extract than when it appears at a random location. Such consistency cannot be taken for granted, given the diverse nature of learning objects and the habits of their authors. This can be problematic for referatories like Scoilnet that typically link to many different content providers.
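The rights example can be made concrete with a short sketch. It assumes, purely for illustration, that a provider places its rights statement inside an HTML `<footer>` element and that the statement contains a marker such as "copyright" or "licen"; the class and function names are hypothetical and do not describe how SAmgI works. Only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser
from typing import List, Optional


class FooterTextParser(HTMLParser):
    """Collects the text that appears inside <footer> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._footer_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "footer":
            self._footer_depth += 1

    def handle_endtag(self, tag):
        if tag == "footer" and self._footer_depth > 0:
            self._footer_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only while we are inside a footer.
        if self._footer_depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_rights(html: str) -> Optional[str]:
    """Return the first footer text that looks like a rights statement."""
    parser = FooterTextParser()
    parser.feed(html)
    parser.close()
    for chunk in parser.chunks:
        lowered = chunk.lower()
        if "copyright" in lowered or "licen" in lowered or "©" in chunk:
            return chunk
    return None  # nothing found at the expected location
```

A rule like this is only useful for pages that follow the assumed layout. For a referatory that links to many providers, each with its own conventions, the function would return `None` for many pages even when a rights statement exists elsewhere in the document.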