Sciweavers

» Classification of Web Documents Using a Graph Model
WWW
2002
ACM
14 years 8 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
MLDM
2007
Springer
14 years 1 month ago
PE-PUC: A Graph Based PU-Learning Approach for Text Classification
This paper presents a novel solution for the problem of building a text classifier using positive documents (P) and unlabeled documents (U). Here, the unlabeled documents are mixed w...
Shuang Yu, Chunping Li
KES
2006
Springer
13 years 7 months ago
Integrated Document Browsing and Data Acquisition for Building Large Ontologies
Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...
LAWEB
2006
IEEE
14 years 1 month ago
Where and How Duplicates Occur in the Web
In this paper we study duplicates on the Web, using collections containing documents of all sites under the .cl domain that represent accurate and representative subsets of the We...
Álvaro R. Pereira Jr., Ricardo A. Baeza-Yat...
WWW
2001
ACM
14 years 8 months ago
On integrating catalogs
We address the problem of integrating documents from different sources into a master catalog. This problem is pervasive in web marketplaces and portals. Current technology for aut...
Rakesh Agrawal, Ramakrishnan Srikant