Sciweavers

1541 search results - page 92 / 309
» Extracting Web Data Using Instance-Based Learning
IJCNLP
2004
Springer
14 years 3 months ago
Combining Labeled and Unlabeled Data for Learning Cross-Document Structural Relationships
Multi-document discourse analysis has emerged with the potential of improving various NLP applications. Based on the newly proposed Cross-document Structure Theory (CST), this pap...
Zhu Zhang, Dragomir R. Radev
WEBDB
1999
Springer
14 years 2 months ago
Web Ecology: Recycling HTML Pages as XML Documents Using W4F
In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...
Arnaud Sahuguet, Fabien Azavant
WECWIS
2003
IEEE
14 years 3 months ago
Page Digest for Large-Scale Web Services
The rapid growth of the World Wide Web and the Internet has fueled interest in Web services and the Semantic Web, which are quickly becoming important parts of modern electronic c...
Daniel Rocco, David Buttler, Ling Liu
NLPRS
2001
Springer
14 years 2 months ago
Automatic Corpus-Based Extraction of Chinese Legal Terms
This paper reports on a study involving the automatic extraction of Chinese legal terms. We used a word segmented corpus of Chinese court judgments to extract salient legal expres...
Oi Yee Kwong, Benjamin K. Tsou
WWW
2003
ACM
14 years 10 months ago
Mining the peanut gallery: opinion extraction and semantic classification of product reviews
The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, ...
Kushal Dave, Steve Lawrence, David M. Pennock