Sciweavers

SDM
2012
SIAM
12 years 1 month ago
Simplex Distributions for Embedding Data Matrices over Time
Early stress recognition is of great relevance in precision plant protection. Pre-symptomatic water stress detection is of particular interest, ultimately helping to meet the chal...
Kristian Kersting, Mirwaes Wahabzada, Christoph R...
ICDAR
2011
IEEE
12 years 11 months ago
Word Retrieval in Historical Document Using Character-Primitives
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
Partha Pratim Roy, Jean-Yves Ramel, Nicolas Ragot
ICDAR
2011
IEEE
12 years 11 months ago
Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method
In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where p...
Marçal Rusiñol, David Aldavert, Rica...
AAAI
2011
12 years 11 months ago
Exploiting Phase Transition in Latent Networks for Clustering
In this paper, we model the pairwise similarities of a set of documents as a weighted network with a single cutoff parameter. Such a network can be thought of as an ensemble of unwe...
Vahed Qazvinian, Dragomir R. Radev
EMNLP
2010
13 years 9 months ago
Staying Informed: Supervised and Semi-Supervised Multi-View Topical Analysis of Ideological Perspective
With the proliferation of user-generated articles over the web, it becomes imperative to develop automated methods that are aware of the ideological bias implicit in a document co...
Amr Ahmed, Eric P. Xing
JUCS
2008
13 years 11 months ago
A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives
Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsi...
Josep Lladós, Dimosthenis Karatzas, Joan Ma...
CORR
2006
Springer
13 years 11 months ago
Navigating multilingual news collections using automatically extracted information
We are presenting a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that...
Ralf Steinberger, Bruno Pouliquen, Camelia Ignat
AVI
2000
14 years 24 days ago
A Modular Approach for Exploring the Semantic Structure of Technical Document Collections
The identification and analysis of an enterprise's knowledge available in a documented form is a key element of knowledge management. Visual methods which allow easy access t...
Andreas Becks, Stefan Sklorz, Matthias Jarke
ACL
2006
14 years 25 days ago
Are These Documents Written from Different Perspectives? A Test of Different Perspectives Based on Statistical Distribution Dive
In this paper we investigate how to automatically determine if two document collections are written from different perspectives. By perspectives we mean a point of view, for examp...
Wei-Hao Lin, Alexander G. Hauptmann
SDM
2007
SIAM
14 years 26 days ago
Topic Models over Text Streams: A Study of Batch and Online Unsupervised Learning
Topic modeling techniques have widespread use in text data mining applications. Some applications use batch models, which perform clustering on the document collection in aggregat...
Arindam Banerjee, Sugato Basu