Sciweavers

563 search results - page 40 / 113
» Crawling the web for structured documents
TREC
2000
15 years 5 months ago
Information Space Based on HTML Structure
The main goal for the Information Space system for TREC9 was early precision. To facilitate this, an emphasis was placed on seeking matches from only the TITLE, H1, H2 and H3 tags...
Gregory B. Newby
ITCC
2005
IEEE
15 years 9 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
SIGMOD
2009
ACM
16 years 4 months ago
Hermes: a travel through semantics on the data web
The Web as a global information space is developing from a Web of documents to a Web of data. This development opens new ways for addressing complex information needs. Search is n...
Haofen Wang, Thomas Penin, Kaifeng Xu, Junquan Che...
OHS
2001
Springer
15 years 8 months ago
Revisiting and Versioning in Virtual Special Reports
Adaptation/personalization is one of the main issues for web applications and requires large repositories. Creating adaptive web applications from these repositories requires to hav...
Sébastien Iksal, Serge Garlatti
DOCENG
2007
ACM
15 years 8 months ago
Structure and content analysis for HTML medical articles: a hidden Markov model approach
We describe ongoing research on segmenting and labeling HTML medical journal articles. In contrast to existing approaches in which HTML tags usually serve as strong indicators, we...
Jie Zou, Daniel X. Le, George R. Thoma