Sciweavers

483 search results - page 60 / 97
» Sampling the Web as Training Data for Text Classification
WSDM
2010
ACM
16 years 1 day ago
GeoFolk: Latent spatial semantics in Web 2.0 social media
We describe an approach for multi-modal characterization of social media by combining text features (e.g. tags as a prominent example of short, unstructured text labels) with spat...
Sergej Sizov
LREC
2010
15 years 4 months ago
The Problems of Language Identification within Hugely Multilingual Data Sets
As the data for more and more languages is finding its way into digital form, with an increasing amount of this data being posted to the Web, it has become possible to collect lan...
Fei Xia, Carrie Lewis, William D. Lewis
ACL
1998
15 years 4 months ago
A Connectionist Architecture for Learning to Parse
We present a connectionist architecture and demonstrate that it can learn syntactic parsing from a corpus of parsed text. The architecture can represent syntactic constituents, an...
James Henderson, Peter Lane
EMNLP
2004
15 years 4 months ago
Instance-Based Question Answering: A Data-Driven Approach
Anticipating the availability of large question-answer datasets, we propose a principled, data-driven Instance-Based approach to Question Answering. Most question answering systems ...
Lucian Vlad Lita, Jaime G. Carbonell
IAT
2009
IEEE
15 years 9 months ago
An Intelligent Agent That Autonomously Learns How to Translate
We describe the design of an autonomous agent that can teach itself how to translate from a foreign language, by first assembling its own training set, then using it to improve...
Marco Turchi, Tijl De Bie, Nello Cristianini