We describe our participation in the 2009 CLEF-IP task, which was targeted at priorart search for topic patent documents. Our system retrieved patent documents based on a standard...
This paper presents work on document retrieval for Italian carried out at ITC-irst. Two different approaches to information retrieval were investigated, one based on the Okapi wei...
Topic modeling has been a key problem for document analysis. One of the canonical approaches for topic modeling is Probabilistic Latent Semantic Indexing, which maximizes the join...
Deng Cai, Qiaozhu Mei, Jiawei Han, Chengxiang Zhai
In this work we propose a novel approach to anomaly detection in streaming communication data. We first build a stochastic model for the system based on temporal communication pa...
Most current sentence alignment approaches adopt sentence length and cognate as the alignment features; and they are mostly trained and tested in the documents with the same style...