Sciweavers

» Extracting Partial Structures from HTML Documents
ICDAR
2011
IEEE
12 years 8 months ago
Localization of Digit Strings in Farsi/Arabic Document Images Using Structural Features and Syntactical Analysis
This paper presents a new method for localization of digit strings with a specific syntax in Farsi/Arabic document images. First, some features are extracted from all connected...
Ali Abedi, Karim Faez
WWW
2007
ACM
14 years 9 months ago
Extraction and search of chemical formulae in text documents on the web
Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets ar...
Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee...
IJCNLP
2005
Springer
14 years 2 months ago
Automatic Partial Parsing Rule Acquisition Using Decision Tree Induction
Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties of pa...
Myung-Seok Choi, Chul Su Lim, Key-Sun Choi
ANTSW
2004
Springer
14 years 1 months ago
How to Use Ants for Hierarchical Clustering
We present in this paper a new model for document hierarchical clustering, which is inspired by the self-assembly behavior of real ants. We have simulated the way ants...
Hanene Azzag, Christiane Guinot, Gilles Venturini
ACL
2010
13 years 6 months ago
Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
We show how web mark-up can be used to improve unsupervised dependency parsing. Starting from raw bracketings of four common HTML tags (anchors, bold, italics and underlines), we ...
Valentin I. Spitkovsky, Daniel Jurafsky, Hiyan Als...