We have been working on two different KDD systems for scientific data. One system involves comparative genomics, where the database contains more than 60,000 plant gene and protei...
With the increase in information on the World Wide Web, it has become difficult to quickly find desired information without using multiple queries or using a topic-specific search ...
Sofus A. Macskassy, Arunava Banerjee, Brian D. Dav...
We consider the problem of aggregation for uncertain and imprecise data. For such data, we define aggregation operators and use them to provide information on properties and patte...
This paper describes our work in learning online models that forecast real-valued variables in a high-dimensional space. A 3GB database was collected by sampling 421 real-valued s...
There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...