Using social tags to improve indexing
During the second half of the project, experienced indexers analysed and discussed the social tags/folksonomies created by teachers and reported their findings in a Final Report on Phase II. The main findings were as follows.
- In general, expert indexers were positive towards tags and judged them to be as suitable (i.e. clear and unambiguous) as indexing keywords, but they were sometimes reluctant to include them in the original Learning Object Metadata (LOM).
- Expert indexers said that they did not always understand the meaning of a tag, either because it a) was in a language that they did not understand or b) was so specific to the subject/topic domain that outsiders could not understand it. Some tags, however, were deemed to add useful information in multiple languages regardless of the indexer's language skills; these were called “travel well” tags.
- About 11% of user-generated, distinct tags were terms that already existed in the LRE Multilingual Thesaurus; however, the original indexer had failed to add them as descriptors. We dubbed these “thesaurus tags”.
- A number of end-user tags (about a third of the total) were very similar to the terms in the Thesaurus but differed slightly in spelling or writing convention (e.g. English vs. English language). Expert indexers indicated that they would be prepared to adopt these tags, particularly in cases where the original indexing was poor.
- Some tags indicate that the thesaurus terms are not always sufficient to describe the topics of a given curriculum subject. Such tags could become candidates for future inclusion in the thesaurus.
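The three tag categories described above (exact thesaurus matches, near matches that differ in spelling or writing convention, and candidates for future inclusion) can be illustrated with a minimal Python sketch. It is not the procedure used in the project: the function names, the word-overlap rule, the similarity threshold of 0.85 and the sample terms are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(term: str) -> str:
    """Lower-case a term, treat hyphens as spaces and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def classify_tag(tag: str, thesaurus_terms, threshold: float = 0.85) -> str:
    """Classify a user tag against a list of thesaurus terms.

    Returns "thesaurus tag" for an exact (normalized) match,
    "near match" for a small spelling or convention difference,
    and "candidate" otherwise. The threshold is an assumed value.
    """
    norm_tag = normalize(tag)
    if not norm_tag:
        return "candidate"
    norm_terms = {normalize(t) for t in thesaurus_terms}

    if norm_tag in norm_terms:
        return "thesaurus tag"

    tag_words = set(norm_tag.split())
    for term in norm_terms:
        term_words = set(term.split())
        # Convention difference: one term's words contain the other's,
        # e.g. "English" vs. "English language".
        if tag_words <= term_words or term_words <= tag_words:
            return "near match"
        # Spelling difference: high character-level similarity.
        if SequenceMatcher(None, norm_tag, term).ratio() >= threshold:
            return "near match"

    return "candidate"


# Hypothetical sample terms, not taken from the LRE Multilingual Thesaurus.
sample_terms = ["English language", "mathematics", "photosynthesis"]
for sample_tag in ["Mathematics", "English", "mathematic", "volcano"]:
    print(sample_tag, "->", classify_tag(sample_tag, sample_terms))
```

A real comparison would also have to handle multiple languages and diacritics, which this sketch ignores.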
Thesaurus tags proved especially interesting. Following more extensive analysis of all multilingual tags, we found that “thesaurus tags” had a higher average rate of application, indicating their popularity among teachers. A “thesaurus tag” was applied by teachers 11.8 times on average, whereas a normal tag was applied only 2.5 times. Owing to this popularity, “thesaurus tags” constituted 30.6% of all the tags in the MELT system.
In future, these “thesaurus tags” could provide a bridge between tags in multiple languages and other indexing terms, supporting better language identification of tags as well as better retrieval. “Thesaurus tags” could also help to identify “travel well” tags, thanks to their identical spelling in many different languages.
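To show how figures of this kind can be derived, the following Python sketch computes the average number of applications per distinct tag and the share of all tag applications that fall on thesaurus tags. The function name and the sample data are invented for illustration; they are not the MELT data and do not reproduce the percentages reported here.

```python
from collections import Counter


def tag_usage_stats(applications, thesaurus_tags):
    """Summarise tag usage from a flat list of tag applications.

    `applications` holds one entry per time a tag was applied;
    `thesaurus_tags` is the set of tags that also exist in the thesaurus.
    """
    counts = Counter(applications)
    in_thesaurus = [n for tag, n in counts.items() if tag in thesaurus_tags]
    other = [n for tag, n in counts.items() if tag not in thesaurus_tags]
    total = sum(counts.values())

    return {
        # Mean applications per distinct thesaurus tag.
        "avg_thesaurus": sum(in_thesaurus) / len(in_thesaurus) if in_thesaurus else 0.0,
        # Mean applications per distinct non-thesaurus tag.
        "avg_other": sum(other) / len(other) if other else 0.0,
        # Fraction of all tag applications that used a thesaurus tag.
        "thesaurus_share": sum(in_thesaurus) / total if total else 0.0,
    }


# Invented sample log: two thesaurus tags and three other tags.
sample_log = ["maths"] * 6 + ["energy"] * 4 + ["fun"] * 2 + ["my class"] + ["ks3"] * 3
stats = tag_usage_stats(sample_log, {"maths", "energy"})
print(stats)
```

Because popular tags are applied many times, a small proportion of distinct tags can account for a much larger proportion of all applications, which is the effect described above.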