Sciweavers

160 search results - page 16 / 32
» Exploiting structural information for semi-structured docume...
RIAO
2007
13 years 9 months ago
From Layout to Semantic: a Reranking Model for Mapping Web Documents to Mediated XML Representations
Many documents on the Web are formatted in a weakly structured format. Because of their weak semantics and because of the heterogeneity of their formats, the information conveyed by...
Guillaume Wisniewski, Patrick Gallinari
ICDE
2005
IEEE
121 views, Database
14 years 8 months ago
Top-Down Specialization for Information and Privacy Preservation
Releasing person-specific data in its most specific state poses a threat to individual privacy. This paper presents a practical and efficient algorithm for determining a generaliz...
Benjamin C. M. Fung, Ke Wang, Philip S. Yu
ACL
2006
13 years 9 months ago
An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition
This paper shows that a simple two-stage approach to handle non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local depen...
Vijay Krishnan, Christopher D. Manning
KDD
2008
ACM
147 views, Data Mining
14 years 7 months ago
Extracting shared subspace for multi-label classification
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye
CHI
2008
ACM
14 years 7 months ago
Paperproof: a paper-digital proof-editing system
Recent approaches for linking paper and digital information or services tend to be based on a one-time publishing of digital information where changes to the printed document beco...
Nadir Weibel, Adriana Ispas, Beat Signer, Moira C....