Sciweavers

» Standardization of Speech Corpus
WWW
2010
ACM
14 years 3 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
PAKDD
2009
ACM
112 views, Data Mining
14 years 3 months ago
Romanization of Thai Proper Names Based on Popularity of Usages
The lack of standards for Romanization of Thai proper names makes searching activity a challenging task. This is particularly important when searching for people-related documents ...
Akegapon Tangverapong, Atiwong Suchato, Proadpran ...
VL
2009
IEEE
128 views, Visual Languages
14 years 3 months ago
Improving API documentation using API usage information
Jadeite is a new Javadoc-like API documentation system that takes advantage of multiple users’ aggregate experience to reduce difficulties that programmers have learning new API...
Jeffrey Stylos, Andrew Faulring, Zizhuang Yang, Br...
IWANN
2009
Springer
14 years 3 months ago
Identification of Chemical Entities in Patent Documents
Biomedical literature is an important source of information for chemical compounds. However, different representations and nomenclatures for chemical entities exist, which makes th...
Tiago Grego, Piotr Pezik, Francisco M. Couto, Diet...
VLDB
2007
ACM
132 views, Database
14 years 2 months ago
EntityRank: Searching Entities Directly and Holistically
As the Web has evolved into a data-rich repository, with the standard “page view,” current search engines are becoming increasingly inadequate for a wide range of query tasks....
Tao Cheng, Xifeng Yan, Kevin Chen-Chuan Chang