Sciweavers

WWW
2005
ACM
14 years 8 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
IR
2000
13 years 7 months ago
Automating the Construction of Internet Portals with Machine Learning
Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsear...
Andrew McCallum, Kamal Nigam, Jason Rennie, Kristi...
CORIA
2011
12 years 11 months ago
Mining the Web for lists of Named Entities
Named entities play an important role in Information Extraction. They represent unitary namable information within text. In this work, we focus on groups of named entities of the s...
Arlind Kopliku, Mohand Boughanem, Karen Pinel-Sauv...
ECAI
2004
Springer
14 years 1 month ago
Automatic Recognition of Famous Artists by Machine
The paper addresses the question whether it is possible for a machine to learn to distinguish and recognise famous musicians (concert pianists), based on their style of playing. We...
Gerhard Widmer, Patrick Zanon
CIKM
2005
Springer
14 years 1 month ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...