Sciweavers

318 search results - page 51 / 64
» Mining data records in Web pages
WSDM
2010
ACM
204 views, Data Mining
14 years 3 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
PVLDB
2010
161 views
13 years 6 months ago
Annotating and Searching Web Tables Using Entities, Types and Relationships
Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational...
Girija Limaye, Sunita Sarawagi, Soumen Chakrabarti
KDD
1997
ACM
169 views, Data Mining
14 years 18 days ago
Learning to Extract Text-Based Information from the World Wide Web
There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...
Stephen Soderland
KDD
2004
ACM
145 views, Data Mining
14 years 1 months ago
A graph-theoretic approach to extract storylines from search results
We present a graph-theoretic approach to discover storylines from search results. Storylines are windows that offer glimpses into interesting themes latent among the top search re...
Ravi Kumar, Uma Mahadevan, D. Sivakumar
WSDM
2009
ACM
125 views, Data Mining
14 years 3 months ago
Less is more: sampling the neighborhood graph makes SALSA better and faster
In this paper, we attempt to improve the effectiveness and the efficiency of query-dependent link-based ranking algorithms such as HITS, MAX and SALSA. All these ranking algorith...
Marc Najork, Sreenivas Gollapudi, Rina Panigrahy