News articles about the same event published over time have properties that challenge NLP and IR applications. A cluster of such texts typically exhibits instances of paraphrase a...
Browsing and finding pictures in large-scale and heterogeneous collections is an important issue, most particularly for online photo sharing applications. Since such services know...
A labeled sequence data set related to a certain biological property is often biased and, therefore, does not completely capture its diversity in nature. To reduce this sampling b...
Extracting information from data, often also called data analysis, is an important scienti c task. Statistical approaches, which use methods from probability theory and numerical a...
Bernd Fischer 0002, Johann Schumann, Thomas Pressb...
Constrained clustering has been well-studied for algorithms like K-means and hierarchical agglomerative clustering. However, how to encode constraints into spectral clustering rem...