Sciweavers

363 search results - page 31 / 73
» A Redundancy-Based Method for Relation Instantiation from th...
Sort
View
115
Voted
WWW
2006
ACM
16 years 6 months ago
Robust web content extraction
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
165
Voted
DIS
2007
Springer
16 years 9 days ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
163
Voted
EMNLP
2009
15 years 3 months ago
Hypernym Discovery Based on Distributional Similarity and Hierarchical Structures
This paper presents a new method of developing a large-scale hyponymy relation database by combining Wikipedia and other Web documents. We attach new words to the hyponymy databas...
Ichiro Yamada, Kentaro Torisawa, Jun'ichi Kazama, ...
WWW
2006
ACM
16 years 6 months ago
Logical structure based semantic relationship extraction from semi-structured documents
Addressed in this paper is the issue of semantic relationship extraction from semi-structured documents. Many research efforts have been made so far on the semantic information ex...
Kuo Zhang, Gang Wu, Juan-Zi Li
AIME
2003
Springer
15 years 11 months ago
Learning Derived Words from Medical Corpora
Abstract. Morphological knowledge (inflection, derivation, compounds) is useful for medical language processing. Some is available for medical English in the UMLS Specialist Lexic...
Pierre Zweigenbaum, Natalia Grabar