Sciweavers

708 search results - page 41 / 142
» Identifying Content Blocks from Web Documents
LAWEB
2006
IEEE
14 years 2 months ago
Analysis of Web Search Engine Clicked Documents
In this paper we process and analyze web search engine query and click data from the perspective of the documents (URL’s) selected. We initially define possible document categor...
David F. Nettleton, Liliana Calderón-Benavi...
WWW
2003
ACM
14 years 9 months ago
The XML web: a first study
Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML w...
Laurent Mignet, Denilson Barbosa, Pierangelo Veltr...
WSDM
2010
ACM
14 years 3 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
IADIS
2003
13 years 10 months ago
Information Management and Interoperability Strategies: The Case for Digital Identifiers
The move from the current compartmentalised systems into an interoperable environment is the central challenge facing digital development this decade. In the quest for a semantic ...
Robin Wilson
CAISE
2004
Springer
14 years 2 months ago
Facing Document-Provider Heterogeneity in Knowledge Portals
Knowledge portals aim at facilitating the location, sharing and dissemination of information by sitting ontologies at the core of the system. For heterogeneous environments where c...
Jon Iturrioz, Oscar Díaz, Sergio Ferná...