Database systems have been vital for all forms of data processing for a long time. In recent years, the amount of processed data has been growing dramatically, even in sm...
Document clustering plays an important role in data mining systems. Recently, a flocking-based document clustering algorithm has been proposed to solve the problem through simulat...
Yongpeng Zhang, Frank Mueller, Xiaohui Cui, Thomas...
Automatically generated HTML, as produced by WYSIWYG programs, typically contains much repetitive and unnecessary markup. This paper identifies aspects of such HTML that may be al...
Stream computing research is moving from terascale to petascale levels. It aims to rapidly analyze data as it streams in from many sources and make decisions with high speed and a...
Ankur Narang, Vikas Agarwal, Monu Kedia, Vijay K. ...
Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...