The following are the key innovations of the On-To-Knowledge project with respect to the state of the art:

  • going beyond keyword-based search
  • enabling automated information extraction
  • exploiting ontologies
  • supporting information maintenance

Beyond keyword-based search

There are numerous approaches to information retrieval, text extraction, and agent-based information access. However, nearly all of them work at the keyword level. It is well known from information retrieval that keyword-based information access has fundamental limitations in both precision and recall.
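The precision and recall limitation can be illustrated with a minimal sketch. The toy corpus, the relevance judgments, and the function names below are assumptions made up for this example and are not part of the project. The user is looking for documents about the animal "jaguar"; a keyword match returns an irrelevant document about the car (lower precision) and misses a relevant document that uses the scientific name (lower recall).

```python
# Toy corpus and relevance judgments, invented purely for illustration.
DOCS = {
    "d1": "the jaguar is a large cat native to the americas",
    "d2": "jaguar unveiled a new sports car this year",
    "d3": "panthera onca hunts mainly at night",
}
RELEVANT = {"d1", "d3"}  # documents that are actually about the animal


def keyword_search(query, docs):
    """Return ids of documents that contain every query term as a whole word."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.split() for term in terms)
    }


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = keyword_search("jaguar", DOCS)
precision, recall = precision_recall(retrieved, RELEVANT)
print(sorted(retrieved), precision, recall)
```

Here the homonym (the car maker) costs precision and the synonym (the scientific name) costs recall; neither problem can be fixed by matching strings alone.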

Enable automated information extraction

Another main limitation of these approaches is that they usually deliver raw documents (in the case of Web search engines, URLs pointing to documents). Extracting the required answer then takes human effort: the user must browse and read the delivered documents until the information is found. This burdens the human user and severely hampers automated information extraction by agents.

On-To-Knowledge will provide a query answering mechanism for unstructured, weakly structured and formalized documents. Besides query answering facilities (used by humans or software agents), On-To-Knowledge will provide means for creating user-specific views on information documents, for maintaining information content, and for automatically generating new documents from existing ones.

Exploit ontologies

We will use ontologies to mediate information access and will provide an integrated tool environment that covers acquisition, maintenance, and access to online information based on ontologies. To our knowledge, no such project exists yet.

Ontologies can provide more complex definitions (ranging as far as logical axioms) than is possible with the thesauri used in information retrieval. They are our key asset in automating query answering, maintenance, and document generation. The integration of ontologies and automated information retrieval (IR) approaches (as support for ontology generation) is investigated.

While some approaches from IR exist that deal with text analysis, the novelty here is the way in which such techniques are integrated into ontology creation, maintenance, comparison and visualization.
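As a rough illustration of how an ontology can mediate information access, the sketch below answers a query by reasoning over a class hierarchy instead of matching strings. The mini-ontology, the document annotations, and the function names are hypothetical and chosen only for this example; they do not describe the On-To-Knowledge tools. A real ontology could also carry relations and logical axioms, which this sketch leaves out.

```python
# Hypothetical mini-ontology: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Jaguar": {"BigCat"},
    "Lion": {"BigCat"},
    "BigCat": {"Mammal"},
    "SportsCar": {"Vehicle"},
}

# Hypothetical annotations: documents tagged with ontology classes.
ANNOTATIONS = {
    "d1": {"Jaguar"},
    "d2": {"SportsCar"},
    "d3": {"Lion"},
}


def superclasses(cls):
    """Return cls together with all of its transitive superclasses."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def answer(query_class):
    """Return documents annotated with query_class or any of its subclasses."""
    return sorted(
        doc_id
        for doc_id, classes in ANNOTATIONS.items()
        if any(query_class in superclasses(c) for c in classes)
    )


print(answer("Mammal"))   # documents about any kind of mammal
print(answer("Vehicle"))  # documents about any kind of vehicle
```

A query for "Mammal" finds the documents about jaguars and lions even though neither is tagged with that term directly, and the sports car document is kept apart because it is annotated with a different class.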

Support for information maintenance

The issue of maintenance, as mentioned in the proposal, clearly goes beyond existing work in information retrieval. We will provide tool support enabling automatic maintenance of, and view definitions on, this knowledge. That is, we will provide systematic support for information providers, which is essential in a knowledge management environment.