Sciweavers

43 search results - page 1 / 9
» A Lightweight and Efficient Tool for Cleaning Web Pages
LREC
2008
108 views, Education
14 years 18 days ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert
DEXA
2006
Springer
197 views, Database
14 years 1 month ago
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife
W4A
2006
ACM
14 years 5 months ago
Transforming web pages to become standard-compliant through reverse engineering
Developing Web pages following established standards can make the information more accessible, their rendering more efficient, and their processing by computer applications easier...
Benfeng Chen, Vincent Y. Shen
CAISE
2005
Springer
14 years 1 month ago
CATO - A Lightweight Ontology Alignment Tool
Ontologies are becoming increasingly common in the World Wide Web as the building block for a future Semantic Web. In this Web, ontologies will be responsible for making the semant...
Karin Koogan Breitman, Carolina Howard Felicí...
AINA
2009
IEEE
14 years 6 months ago
CUTER: An Efficient Useful Text Extraction Mechanism
In this paper we present CUTER, a system that processes HTML pages in order to extract the useful text from them. The mechanism is focalized on HTML pages that include news articl...
George Adam, Christos Bouras, Vassilis Poulopoulos