This paper describes the retrieval approach proposed by the SIG/EVI group of the IRIT research centre in INEX’2004 evaluation. The approach uses a voting method coupled with some...
Abstract: The thematic text segmentation task consists in identifying the most important thematic breaks in a document in order to cut it into homogeneous passages. We propose in t...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...
Although text categorization is a burgeoning area of IR research, readily available test collections in this field are surprisingly scarce. We describe a methodology and system (...
Document clustering is a very hard task in Automatic Text Processing since it requires to extract regular patterns from a document collection without a priori knowledge on the cat...