Sciweavers

368 search results - page 19 / 74
» Template-Based Information Mining from HTML Documents
ELPUB
2000
ACM
13 years 11 months ago
XML: More Than an E-Publishing Language
XML is an SGML-based language designed for the interchange of documents with more flexible and powerful features than those provided by HTML. It can be considered as an intermedia...
Jaime Delgado, Ramon Martí, Xavier Perramon
CASCON
2006
13 years 8 months ago
Exploring a new space of features for document classification: figure clustering
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
Nawei Chen, Hagit Shatkay, Dorothea Blostein
DEXAW
2008
IEEE
13 years 9 months ago
Mining Topological Relations from the Web
Topological relations between geographic regions are of interest in many applications. When the exact boundaries of regions are not available, such relations can be established by...
Steven Schockaert, Philip D. Smart, Alia I. Abdelm...
GFKL
2005
Springer
14 years 27 days ago
A Hybrid Machine Learning Approach for Information Extraction from Free Text
We present a hybrid machine learning approach for information extraction from unstructured documents by integrating a learned classifier based on the Maximum Entropy Mod...
Günter Neumann
TEX
2004
Springer
14 years 21 days ago
Managing TeX Resources with XML Topic Maps
For many years the Polish TeX Users Group newsletter has been published online on the GUST web site. The repository now contains valuable information on TeX, METAFONT, electronic d...
Tomasz Przechlewski