Sciweavers

1541 search results
» Extracting Web Data Using Instance-Based Learning
WWW
2010
ACM
14 years 4 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
KDD
2008
ACM
Data Mining
14 years 9 months ago
Information extraction from Wikipedia: moving down the long tail
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Fei Wu, Raphael Hoffmann, Daniel S. Weld
ESWA
2008
13 years 9 months ago
Web taxonomy integration with hierarchical shrinkage algorithm and fine-grained relations
We address the problem of integrating web taxonomies from different real Internet applications. Integrating web taxonomies is to transfer instances from a source to target taxonom...
Chia-Wei Wu, Richard Tzong-Han Tsai, Cheng-Wei Lee...
ICDM
2008
IEEE
Data Mining
14 years 3 months ago
Unsupervised Face Annotation by Mining the Web
Searching for images of people is an essential task for image and video search engines. However, current search engines have limited capabilities for this task since they rely on ...
Duy-Dinh Le, Shin'ichi Satoh
KDD
2009
ACM
Data Mining
14 years 9 months ago
Named entity mining from click-through data using weakly supervised latent Dirichlet allocation
This paper addresses Named Entity Mining (NEM), in which we mine knowledge about named entities such as movies, games, and books from a huge amount of data. NEM is potentially use...
Gu Xu, Shuang-Hong Yang, Hang Li