Sciweavers

368 search results for "Template-Based Information Mining from HTML Documents"
WWW
2006
ACM
14 years 8 months ago
Robust web content extraction
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
CIKM
2008
Springer
13 years 9 months ago
A system for finding biological entities that satisfy certain conditions from texts
Finding biological entities (such as genes or proteins) that satisfy certain conditions from texts is an important and challenging task in biomedical information retrieval and tex...
Wei Zhou, Clement T. Yu, Weiyi Meng
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
DASFAA
2005
IEEE
14 years 1 months ago
Mining Positive and Negative Association Rules from XML Query Patterns for Caching
Recently, several approaches that mine frequent XML query patterns and cache their results have been proposed to improve query response time. However, frequent XML query patterns m...
Ling Chen 0002, Sourav S. Bhowmick, Liang-Tien Chi...
WWW
2009
ACM
14 years 8 months ago
Mining multilingual topics from wikipedia
In this paper, we try to leverage a large-scale and multilingual knowledge base, Wikipedia, to help effectively analyze and organize Web information written in different languages...
Xiaochuan Ni, Jian-Tao Sun, Jian Hu, Zheng Chen