Sciweavers

415 search results - page 12 / 83
» Finding nuggets in documents: A machine learning approach
CIKM
2005
Springer
14 years 3 months ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...
EMNLP
2007
13 years 11 months ago
Learning to Find English to Chinese Transliterations on the Web
We present a method for learning to find English to Chinese transliterations on the Web. In our approach, proper nouns are expanded into new queries aimed at maximizing the probab...
Jian-Cheng Wu, Jason S. Chang
ICML
2006
IEEE
14 years 11 months ago
Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
Charles Elkan
ECIR
2006
Springer
13 years 11 months ago
Automatic Document Organization in a P2P Environment
This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
Stefan Siersdorfer, Sergej Sizov
ERCIMDL
2010
Springer
13 years 7 months ago
SciPlore Xtract: Extracting Titles from Scientific PDF Documents by Analyzing Style Information (Font Size)
Extracting titles from a PDF's full text is an important task in information retrieval to identify PDFs. Existing approaches apply complicated and expensive (in terms of calculating...
Jöran Beel, Bela Gipp, Ammar Shaker, Nick Fri...