MAICS
2004

Intelligent Content Based Title and Author Name Extraction from Formatted Documents

This paper describes the development of several algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. The algorithms are designed not to rely on the specific stylistic dictates of any document formatting standard. Rather, they rely on a combination of overt and subtle cues that together form a generalized, common convention for placing this information in a document so that readers can easily find it.
Type Conference
Year 2004
Where MAICS
Authors Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sahouri, Michel Abraham