In this paper, we investigate the impact of data size on a Word Sense Disambiguation (WSD) task. We question the assumption that the knowledge acquisition bottleneck, which is known...
This paper proposes a methodology for the creation of specialized data sets for Textual Entailment, made of monothematic Text-Hypothesis pairs (i.e. pairs in which only one lingui...
Luisa Bentivogli, Elena Cabrio, Ido Dagan, Danilo ...
We propose an efficient sampling based outlier detection method for large high-dimensional data. Our method consists of two phases. In the first phase, we combine a "sampling...
Timothy de Vries, Sanjay Chawla, Pei Sun, Gia Vinh...
We define a class of algorithms for constructing coresets of (geometric) data sets, and show that algorithms in this class can be dynamized efficiently in the insertion-only (data ...
In this study, we examine the use of graph ordering algorithms for visual analysis of data sets using visual similarity matrices. Visual similarity matrices display the relationsh...
Christopher Mueller, Benjamin Martin, Andrew Lumsd...
Animation is frequently utilized to visually depict change in time-varying data sets. For this task, it is a natural fit. Yet explicit animation is rarely employed for static data....
James Shearer, Michael Ogawa, Kwan-Liu Ma, Toby Ko...
This paper presents a concept hierarchy-based approach to privacy preserving data collection for data mining called the P-level model. The P-level model allows data providers to d...
The ability to distinguish, differentiate and contrast between different data sets is a key objective in data mining. Such ability can assist domain experts to understand their d...
It is not always clear how best to represent integrated data sets, and which application and database features allow a scientist to take best advantage of data coming from various ...
Joanna Jakubowska, Ela Hunt, John McClure, Matthew...
Geographic information systems (GIS) must support large georeferenced data sets. Due to the size of these data sets, finding exact answers to spatial queries can be very time consum...
Wan D. Bae, Shayma Alkobaisi, Scott T. Leutenegger