Sciweavers

119 search results - page 10 / 24
» Learning to Extract Text-Based Information from the World Wi...
COOPIS
1997
IEEE
14 years 26 days ago
Semi-Automatic Wrapper Generation for Internet Information Sources
To simplify the task of obtaining information from the vast number of information sources that are available on the World Wide Web (WWW), we are building tools to build informatio...
Naveen Ashish, Craig A. Knoblock
JAIR
2008
13 years 8 months ago
Creating Relational Data from Unstructured and Ungrammatical Data Sources
In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the...
Matthew Michelson, Craig A. Knoblock
CEAS
2006
Springer
14 years 11 days ago
Introducing the Webb Spam Corpus: Using Email Spam to Identify Web Spam Automatically
Just as email spam has negatively impacted the user messaging experience, the rise of Web spam is threatening to severely degrade the quality of information on the World Wide Web....
Steve Webb, James Caverlee, Calton Pu
ACL
2006
13 years 10 months ago
URES: an Unsupervised Web Relation Extraction System
Most information extraction systems either use hand written extraction patterns or use a machine learning algorithm that is trained on a manually annotated corpus. Both of these a...
Binyamin Rosenfeld, Ronen Feldman
IICS
2003
Springer
14 years 1 month ago
Aggregation Transformation of XML Schemas to Object-Relational Databases
As XML has become an emerging standard for information exchange on the World Wide Web, it has gained attention in database communities to extract information from XML seen as a dat...
Nathalia Devina Widjaya, David Taniar, J. Wenny Ra...