APCCM
2009

Extracting and Modeling the Semantic Information Content of Web Documents to Support Semantic Document Retrieval

Existing HTML mark-up indicates only the structure and layout of documents, not their semantics. As a result, web documents are difficult for computer applications to process, retrieve and explore semantically. Existing information extraction systems are mainly concerned with extracting important keywords or key phrases that represent the content of documents; the semantic aspects of such keywords have not been explored extensively. In this paper we propose an approach to assist in extracting and modeling the semantic information content of web documents using natural language analysis techniques and a domain-specific ontology. With the user's participation, the tool gradually extracts and constructs a semantic document model, which is represented as XML. The semantic models representing each document are then integrated to form a global semantic model. Such a model provides users with a global knowledge model of a domain.
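The pipeline outlined in the abstract (ontology-guided extraction, a per-document XML model, then integration into a global model) can be illustrated with a minimal Python sketch. This is not the authors' tool: the toy ontology, the XML element names (`document`, `concept`, `instance`, `globalModel`) and the function names are all assumptions made for illustration. Plain keyword lookup stands in for natural language analysis, and the user-participation step is omitted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy ontology: surface term -> concept class (not from the paper).
ONTOLOGY = {
    "professor": "Person",
    "lecturer": "Person",
    "course": "Course",
    "faculty": "Organisation",
}


def extract_concepts(text):
    """Return sorted (term, concept) pairs for ontology terms found in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted((w, ONTOLOGY[w]) for w in words if w in ONTOLOGY)


def build_document_model(doc_id, text):
    """Build an XML element listing the concepts found in one document."""
    doc = ET.Element("document", id=doc_id)
    for term, concept in extract_concepts(text):
        ET.SubElement(doc, "concept", type=concept, term=term)
    return doc


def integrate(models):
    """Merge per-document models into one global model grouped by concept type."""
    root = ET.Element("globalModel")
    groups = {}
    for doc in models:
        for c in doc.findall("concept"):
            ctype = c.get("type")
            if ctype not in groups:
                groups[ctype] = ET.SubElement(root, "concept", type=ctype)
            ET.SubElement(groups[ctype], "instance",
                          term=c.get("term"), doc=doc.get("id"))
    return root


# Hypothetical sample documents, used only to exercise the sketch.
docs = {
    "d1": "The professor teaches a course in the faculty.",
    "d2": "Each lecturer supervises one course.",
}
models = [build_document_model(doc_id, text) for doc_id, text in docs.items()]
global_model = integrate(models)
print(ET.tostring(global_model, encoding="unicode"))
```

Grouping instances by concept type is only one possible integration strategy; the abstract does not say how the global model is actually organised.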
Shahrul Azman Noah, Lailatulqadri Zakaria, Arifah Che Alhadi
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2009
Where APCCM
Authors Shahrul Azman Noah, Lailatulqadri Zakaria, Arifah Che Alhadi