Abstract. We describe a text summarization system that moves beyond standard approaches by combining linguistic and statistical analysis and by employing text-sort...
We describe a biographical multidocument summarizer that condenses information about people described in the news. The summarizer uses corpus statistics along with linguistic kno...
Barry Schiffman, Inderjeet Mani, Kristian J. Conce...
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
Complex documents stored in a flat or partially marked-up file format require layout-sensitive preprocessing before any natural language processing can be carried out on their tex...
We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...
Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...