Clustering the results of a search helps the user to overview the information returned. In this paper, we regard the clustering task as indexing the search results. Here, an index...
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for...
We apply statistical machine translation (SMT) tools to generate novel paraphrases of input sentences in the same language. The system is trained on large volumes of sentence pair...
This article compares eight different diversity methods: 3 based on visual information, 1 based on date information, 3 adapted to each topic based on location and visual informatio...
Sabrina Tollari, Philippe Mulhem, Marin Ferecatu, ...
News articles about the same event published over time have properties that challenge NLP and IR applications. A cluster of such texts typically exhibits instances of paraphrase a...