Supporting Learning Object Repository by automatic extraction of metadata
Learning object repositories are electronic databases that deliver material on the web, allowing instructors to share and reuse educational units and students to access and use them. The best way to support these interactions is good indexing. Each content item needs a machine-understandable description that states the requirements and limits for its proper use and improves search and delivery. These descriptions are stored in the metadata. Filling in the metadata is a tedious and time-consuming activity, but it is very important because it can influence the choice of the best material to deliver. This paper describes a possible methodological approach to automating this activity by extracting metadata directly from the files that make up the learning object itself. In the literature, there are many methods that automatically characterize the technological aspects of the content, but very few of them provide information about its pedagogical features. The proposed approach draws together information theory, learning models, statistical analysis and ad hoc heuristics to extract a wide set of metadata fields. The results of a first experiment are encouraging and suggest that this approach could support learning object repositories and other platforms that need to manage large content stores and huge numbers of users with varied personal characteristics, interaction devices and goals, as in MOOCs.