The aim of this paper is to investigate the multiple attribute decision making problems with linguistic information, in which the information about attribute weights is incomplete...
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
Relational autocorrelation is ubiquitous in relational domains. This observed correlation between class labels of linked instances in a network (e.g., two friends are more likely ...
Because of the heterogeneous nature of multiple data sources, data integration is often one of the most challenging tasks of today's information systems. While the existing l...
Zhengrui Jiang, Sumit Sarkar, Prabuddha De, Debabr...
Web content mining opens up the possibility to use data presented in web pages for the discovery of interesting and useful patterns. Our web mining tool, FBL (Filtered Bayesian Le...
Classical retrieval models support content-oriented searching for documents using a set of words as data model. However, in hypertext and database applications we want to consider...
In real-life data, in general, many attribute values are missing. Therefore, rule induction requires preprocessing, where missing attribute values are replaced by appropriate val...
Jerzy W. Grzymala-Busse, Witold J. Grzymala-Busse,...
Clustering is an important data mining problem. Most of the earlier work on clustering focussed on numeric attributes which have a natural ordering on their attribute values. Rece...
Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishn...
In this paper, we propose a method for extracting bibliographic attributes from reference strings captured using Optical Character Recognition (OCR) and an extended hidden Markov ...
We consider the problem of dynamically indexing temporal observations about a collection of objects, each observation consisting of a key identifying the object, a list of attribu...