Sciweavers

WWW
2005
ACM
14 years 8 months ago
Automatically learning document taxonomies for hierarchical classification
While several hierarchical classification methods have been applied to web content, such techniques invariably rely on a pre-defined taxonomy of documents. We propose a new techni...
Kunal Punera, Suju Rajan, Joydeep Ghosh
LREC
2008
13 years 9 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
WWW
2005
ACM
14 years 8 months ago
Extracting semantic structure of web documents using content and visual information
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Rupesh R. Mehta, Pabitra Mitra, Harish Karnick
DOCENG
2006
ACM
14 years 1 month ago
NEWPAR: an automatic feature selection and weighting schema for category ranking
Category ranking provides a way to classify plain text documents into a pre-determined set of categories. This work proposes to have a look at typical document collections and ana...
Fernando Ruiz-Rico, José Luis Vicedo Gonz...
AIEDU
2008
13 years 7 months ago
Automatic Extraction of Pedagogic Metadata from Learning Content
Annotating learning material with metadata allows easy reusability by different learning/tutoring systems. Several metadata standards have been developed to represent learning obje...
Devshri Roy, Sudeshna Sarkar, Sujoy Ghose