Sciweavers

708 search results - page 84 / 142
» Identifying Content Blocks from Web Documents
PRIS
2004
13 years 9 months ago
Learning Text Extraction Rules, without Ignoring Stop Words
Information Extraction (IE) from text/web documents has become an important application area of AI. As the number of web sites and documents has grown dramatically, the users need...
João Cordeiro, Pavel Brazdil
WWW
2002
ACM
14 years 8 months ago
A machine learning based approach for table detection on the web
Table is a commonly used presentation scheme, especially for describing relational information. However, table understanding remains an open problem. In this paper, we consider th...
Yalin Wang, Jianying Hu
SIGMOD
2004
ACM
14 years 7 months ago
When one Sample is not Enough: Improving Text Database Selection Using Shrinkage
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
Panagiotis G. Ipeirotis, Luis Gravano
SIGDOC
2005
ACM
14 years 1 month ago
Co-generation of text and graphics
To reduce potential discrepancies between textual and graphical content in documentation, it is possible to produce both text and graphics from a single common source. One approac...
David G. Novick, Brian Lowe
ICDE
2007
IEEE
13 years 11 months ago
A UML Profile for Core Components and their Transformation to XSD
In business-to-business e-commerce, traditional electronic data interchange (EDI) approaches such as UN/EDIFACT have been superseded by approaches like web services and ebXML. Nev...
Christian Huemer, Philipp Liegl