Sciweavers

466 search results - page 10 / 94
» Scalable Feature Extraction from Noisy Documents
IJCNLP
2005
Springer
14 years 28 days ago
Aligning Needles in a Haystack: Paraphrase Acquisition Across the Web
This paper presents a lightweight method for unsupervised extraction of paraphrases from arbitrary textual Web documents. The method differs from previous approaches to paraphrase...
Marius Pasca, Péter Dienes
ILP
2007
Springer
14 years 1 month ago
Using ILP to Construct Features for Information Extraction from Semi-structured Text
Machine-generated documents containing semi-structured text are rapidly forming the bulk of data being stored in an organisation. Given a feature-based representation of such data,...
Ganesh Ramakrishnan, Sachindra Joshi, Sreeram Bala...
RULEML
2004
Springer
14 years 24 days ago
Rule Learning for Feature Values Extraction from HTML Product Information Sheets
The Web is now a huge information repository with a rich semantic structure that, however, is primarily addressed to human understanding rather than automated processing by a compu...
Costin Badica, Amelia Badica
ICDAR
2003
IEEE
14 years 22 days ago
Numerical Sequence Extraction in Handwritten Incoming Mail Documents
In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the nu...
Guillaume Koch, Laurent Heutte, Thierry Paquet
ICMLA
2009
13 years 5 months ago
Knowledge Transfer for Feature Generation in Document Classification
One important problem in machine learning is how to extract knowledge from prior experience, then transfer and apply this knowledge in new learning tasks. To address this problem, ...
Jian Zhang, Shobhit S. Shakya