Sciweavers

» Information Extraction from HTML: Application of a General M...
WWW
2010
ACM
14 years 3 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
ICDE
2005
IEEE
14 years 2 months ago
ProtChew: Automatic Extraction of Protein Names from Biomedical Literature
With the increasing amount of biomedical literature, there is a need for automatic extraction of information to support biomedical researchers. Due to incomplete biomedical inform...
Amund Tveit, Rune Sætre, Astrid Lægrei...
WWW
2007
ACM
14 years 9 months ago
Robust web page segmentation for mobile terminal using content-distances and page layout information
The demand for browsing information from general Web pages using a mobile phone is increasing. However, since the majority of Web pages on the Internet are optimized for browsing f...
Gen Hattori, Keiichiro Hoashi, Kazunori Matsumoto,...
CIKM
2011
Springer
12 years 8 months ago
Mining entity translations from comparable corpora: a holistic graph mapping approach
This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe...
Jinhan Kim, Long Jiang, Seung-won Hwang, Young-In ...
FTDB
2008
13 years 8 months ago
Information Extraction
The automatic extraction of information from unstructured sources has opened up new avenues for querying, organizing, and analyzing data by drawing upon the clean semantics of str...
Sunita Sarawagi