Sciweavers

» Information Extraction from HTML: Application of a General M...
WWW
2005
ACM
16 years 3 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
IR
2000
15 years 2 months ago
Automating the Construction of Internet Portals with Machine Learning
Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsear...
Andrew McCallum, Kamal Nigam, Jason Rennie, Kristi...
CORIA
2011
14 years 6 months ago
Mining the Web for lists of Named Entities
Named entities play an important role in Information Extraction. They represent unitary namable information within text. In this work, we focus on groups of named entities of the s...
Arlind Kopliku, Mohand Boughanem, Karen Pinel-Sauv...
ECAI
2004
Springer
15 years 8 months ago
Automatic Recognition of Famous Artists by Machine
The paper addresses the question whether it is possible for a machine to learn to distinguish and recognise famous musicians (concert pianists), based on their style of playing. We...
Gerhard Widmer, Patrick Zanon
CIKM
2005
Springer
15 years 8 months ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...