Sciweavers

ACL
2009
13 years 5 months ago
Employing Topic Models for Pattern-based Semantic Class Discovery
A semantic class is a collection of items (words or phrases) that share a semantic peer or sibling relationship. This paper studies the employment of topic models to automatical...
Huibin Zhang, Mingjie Zhu, Shuming Shi, Ji-Rong We...
CHI
1996
ACM
13 years 11 months ago
Silk from a Sow's Ear: Extracting Usable Structures from the Web
In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly f...
Peter Pirolli, James E. Pitkow, Ramana Rao
AINA
2009
IEEE
14 years 2 months ago
CUTER: An Efficient Useful Text Extraction Mechanism
In this paper we present CUTER, a system that processes HTML pages in order to extract the useful text from them. The mechanism is focused on HTML pages that include news articl...
George Adam, Christos Bouras, Vassilis Poulopoulos
HPDC
2010
IEEE
13 years 8 months ago
ParaText: scalable text modeling and analysis
Automated analysis of unstructured text documents (e.g., web pages, newswire articles, research publications, business reports) is a key capability for solving important problems ...
Daniel M. Dunlavy, Timothy M. Shead, Eric T. Stant...
WISE
2002
Springer
14 years 11 days ago
Applying the Site Information to the Information Retrieval from the Web
In recent years, several information retrieval methods using information about Web-links, such as HITS and Trawling, have been developed. In order to analyze the Web-links dividing i...
Yasuhito Asano, Hiroshi Imai, Masashi Toyoda, Masa...