Sciweavers

ICTAI
1999
IEEE

A New Study on Using HTML Structures to Improve Retrieval

Locating useful information effectively on the World Wide Web (WWW) is of wide interest. This paper presents new results on a methodology that uses the structures and hyperlinks of HTML documents to improve the effectiveness of retrieving HTML documents. The methodology partitions the occurrences of terms in a document collection into classes according to the tags in which a particular term appears (such as Title, H1-H6, and Anchor). The rationale is that terms appearing in different structures of a document may have different significance in identifying the document. The weighting schemes of traditional information retrieval were extended to include class importance values. We implemented a genetic algorithm to determine a "best so far" combination of class importance factors. Our experiments indicate that this technique can improve retrieval effectiveness by 39.6% or higher.
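The class-based weighting described in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: the class names, the tag-to-class mapping, the importance values, and the linear combination of per-class term frequencies are all assumptions made for illustration, and anchor text is credited to the page that contains it as a simplification.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Hypothetical tag-to-class mapping; the abstract names only Title, H1-H6, and Anchor.
TAG_CLASS = {"title": "title", "a": "anchor",
             **{f"h{i}": "header" for i in range(1, 7)}}

# Hypothetical class importance values (illustrative, not results from the paper).
CLASS_IMPORTANCE = {"title": 4.0, "header": 3.0, "anchor": 2.0, "plain": 1.0}


class ClassTermCounter(HTMLParser):
    """Counts term occurrences per class, using the innermost open known tag."""

    def __init__(self):
        super().__init__()
        self.stack = []                      # currently open class-bearing tags
        self.counts = defaultdict(Counter)   # class name -> term -> frequency

    def handle_starttag(self, tag, attrs):
        if tag in TAG_CLASS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Text outside any known tag falls into the "plain" class.
        cls = TAG_CLASS[self.stack[-1]] if self.stack else "plain"
        self.counts[cls].update(data.lower().split())


def weighted_tf(html, importance=CLASS_IMPORTANCE):
    """Return term -> sum over classes of (class importance * frequency in class)."""
    parser = ClassTermCounter()
    parser.feed(html)
    parser.close()
    scores = Counter()
    for cls, terms in parser.counts.items():
        for term, freq in terms.items():
            scores[term] += importance[cls] * freq
    return dict(scores)


if __name__ == "__main__":
    page = ("<title>web retrieval</title><h1>retrieval</h1>"
            "<p>web search <a href='x'>retrieval</a></p>")
    print(weighted_tf(page))
```

In this sketch a term in the title counts four times as much as the same term in plain text. A search procedure such as the genetic algorithm mentioned in the abstract would vary the values in `CLASS_IMPORTANCE` and keep the combination that gives the best retrieval effectiveness on a test collection.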

Michal Cutler, H. Deng, S. Maniccam, Weiyi Meng

Artificial Intelligence | Class Importance | HTML Documents | ICTAI 1999 | World Wide Web

Related Content

» Using the Structure of HTML Documents to Improve Retrieval

» Title extraction from bodies of HTML documents and its application to web page retrieval

» Evaluation of Alignment Methods for HTML Parallel Text

» Deriving link-context from HTML tag tree

» Towards a Better Understanding of Web Resources and Server Responses for Improved Caching

» Web page title extraction and its application

» Improving design-pattern identification: a new approach and an exploratory study

» Studies of Radical Model for Retrieval of Cursive Chinese Handwritten Annotations

» Exploiting Thread Structures to Improve Smoothing of Language Models for Forum Post Retrie...

Post Info

Added	03 Aug 2010
Updated	03 Aug 2010
Type	Conference
Year	1999
Where	ICTAI
Authors	Michal Cutler, H. Deng, S. Maniccam, Weiyi Meng
