Sciweavers

240 search results - page 5 / 48
» Learning to Extract Content from News Webpages
ICWS
2009
IEEE
14 years 5 months ago
Deactivation of Unwelcomed Deep Web Extraction Services through Random Injection
Websites serve content both through Web Services as well as through user-viewable webpages. While the consumers of web-services are typically ‘machines’, webpages are meant fo...
Varun Bhagwan, Tyrone Grandison
ICMCS
2006
IEEE
14 years 2 months ago
The Semantic Pathfinder for Generic News Video Indexing
This paper presents the semantic pathfinder architecture for generic indexing of video archives. The pathfinder automatically extracts semantic concepts from video based on the ...
Cees G. M. Snoek, Marcel Worring, Jan-Mark Geusebr...
WWW
2010
ACM
14 years 3 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
WWW
2006
ACM
14 years 9 months ago
Robust web content extraction
We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative ...
Marek Kowalkiewicz, Maria E. Orlowska, Tomasz Kacz...
AIEDU
2008
13 years 8 months ago
Automatic Extraction of Pedagogic Metadata from Learning Content
Annotating learning material with metadata allows easy reusability by different learning/tutoring systems. Several metadata standards have been developed to represent learning obje...
Devshri Roy, Sudeshna Sarkar, Sujoy Ghose