Sciweavers

311 search results - page 16 / 63
SIGIR
2005
ACM
14 years 1 month ago
Controlling overlap in content-oriented XML retrieval
The direct application of standard ranking techniques to retrieve individual elements from a collection of XML documents often produces a result set in which the top ranks are dom...
Charles L. A. Clarke
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
IMCSIT
2010
13 years 4 months ago
Using Self Organizing Map to Cluster Arabic Crime Documents
This paper presents a system that combines two text mining techniques: information extraction and clustering. A rule-based approach is used to perform the information extraction tas...
Meshrif Alruily, Aladdin Ayesh, Abdulsamad Al-Marg...
WWW
2010
ACM
14 years 2 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
FLAIRS
2007
13 years 10 months ago
Indexing Documents by Discourse and Semantic Contents from Automatic Annotations of Texts
The basic aim of the model proposed here is to automatically build semantic metatext structure for texts that would allow us to search and extract discourse and semantic informati...
Brahim Djioua, Jean-Pierre Desclés