Building a web thesaurus from web link structure

Thesauri are widely used in many applications, including information retrieval, natural language processing, and question answering. In this paper, we propose a novel approach to automatically constructing a domain-specific thesaurus from the Web using link structure information. The proposed approach can identify new terms and reflect the latest relationships between terms as the Web evolves. First, a set of high-quality, representative websites for a specific domain is selected. Then, after navigational links are filtered out, link analysis is applied to each website to obtain its content structure. Finally, the thesaurus is constructed by merging the content structures of the selected websites. Experimental results on automatic query expansion based on our constructed thesaurus show a 20% improvement in search precision over the baseline.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - search process, re...
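The pipeline described in the abstract (filter navigational links, derive each site's content structure, merge the structures, then expand queries) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the link-analysis or merging details, so the sketch assumes each site's content structure is already available as (parent term, child term, is_navigational) triples, weights a term pair by the number of sites that contain it, and expands a query with child terms supported by at least `min_sites` sites. All function names, the `min_sites` parameter, and the toy camera data are hypothetical.

```python
from collections import defaultdict


def filter_navigational(links):
    """Drop links flagged as navigational (the flag is an assumed input)."""
    return [(parent, child) for parent, child, is_nav in links if not is_nav]


def merge_content_structures(sites):
    """Merge per-site parent -> child term edges into one weighted thesaurus.

    Assumption: the weight of a (parent, child) pair is the number of
    sites whose content structure contains that edge.
    """
    thesaurus = defaultdict(lambda: defaultdict(int))
    for links in sites:
        # Use a set so an edge repeated within one site counts once.
        for parent, child in set(filter_navigational(links)):
            thesaurus[parent][child] += 1
    return thesaurus


def expand_query(terms, thesaurus, min_sites=2):
    """Append child terms that appear under a query term on >= min_sites sites."""
    expanded = list(terms)
    for term in terms:
        related = thesaurus.get(term, {})
        for child, count in sorted(related.items()):
            if count >= min_sites and child not in expanded:
                expanded.append(child)
    return expanded


# Toy data (hypothetical): two sites from a "camera" domain.
site_a = [
    ("camera", "digital camera", False),
    ("camera", "home", True),  # navigational link, filtered out
    ("camera", "lens", False),
]
site_b = [
    ("camera", "digital camera", False),
    ("camera", "tripod", False),
]

thesaurus = merge_content_structures([site_a, site_b])
expanded = expand_query(["camera"], thesaurus)
print(expanded)
```

Requiring agreement across several sites is one simple way to keep a single site's idiosyncratic hierarchy out of the merged thesaurus; a real system would need an actual link-analysis step to produce the input triples.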

Zheng Chen, Shengping Liu, Liu Wenyin, Geguang Pu, Wei-Ying Ma

Content Structure | Domain-specific Thesaurus | SIGIR 2003 | Thesaurus

» Wikipedia Mining for an Association Web Thesaurus Construction

» Using Content-Based and Link-Based Analysis in Building Vertical Search Engines

» Processing link structures and linkbases in the web's open world linking

» Linking and Building Ontologies of Linked Data

» Thesaurus Extension Using Web Search Engines

» Automatic annotation of multilingual text collections with a conceptual thesaurus

» ProThes: thesaurus-based metasearch engine for a specific application domain

» Collaboratively Building Structured Knowledge with DBin From delicious Tags to an RDFS Fol...

Added	05 Jul 2010
Updated	05 Jul 2010
Type	Conference
Year	2003
Where	SIGIR
Authors	Zheng Chen, Shengping Liu, Liu Wenyin, Geguang Pu, Wei-Ying Ma
