Sciweavers

445 search results - page 71 / 89
» Distributed hierarchical document clustering
CLUSTER
2008
IEEE
14 years 1 month ago
Towards an understanding of the performance of MPI-IO in Lustre file systems
Lustre is becoming an increasingly important file system for large-scale computing clusters. The problem, however, is that many data-intensive applications use MPI-IO for their ...
Jeremy Logan, Phillip M. Dickens
EWCBR
2006
Springer
13 years 11 months ago
Unsupervised Feature Selection for Text Data
Feature selection for unsupervised tasks is particularly challenging, especially when dealing with text data. The increase in online documents and email communication creates a nee...
Nirmalie Wiratunga, Robert Lothian, Stewart Massie
WWW
2005
ACM
14 years 8 months ago
Disambiguating Web appearances of people in a social network
Say you are looking for information about a particular person. A search engine returns many pages for that person's name but which pages are about the person you care about, ...
Ron Bekkerman, Andrew McCallum
BMCBI
2006
13 years 7 months ago
Statistical modeling of biomedical corpora: mining the Caenorhabditis Genetic Center Bibliography for genes related to life span
Background: The statistical modeling of biomedical corpora could yield integrated, coarse-to-fine views of biological phenomena that complement discoveries made from analysis of m...
David M. Blei, K. Franks, Michael I. Jordan, I. Sa...
DASFAA
2009
IEEE
13 years 8 months ago
Detecting Aggregate Incongruities in XML
The problem of identifying deviating patterns in XML repositories has important applications in data cleaning, fraud detection, and stock market analysis. Current methods determine...
Wynne Hsu, Qiangfeng Peter Lau, Mong-Li Lee