CL Research's question-answering system (DIMAP-QA) for TREC-10 only slightly extends its semantic relation triple (logical form) technology in which documents are fully parse...
Accessing the structured content of PDF document is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we first present different methods...
"Short-text clustering" is a very important research field due to the current tendency for people to use very short documents, e.g. blogs, text-messaging and others. In s...
Recent research has had some success using the length of time a user displays a document in their web browser as implicit feedback for document preference. However, most studies h...
: Text classification, document clustering and similar document analysis tasks are currently the subject of significant global research, since such areas underpin web intelligence,...