This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Participatory texture documentation (PTD) is a geospatial data collection process in which a group of users (dedicated individuals and/or general public) with camera-equipped mobi...
Farnoush Banaei Kashani, Houtan Shirani-Mehr, Bei ...
We approached the problem of classifying papers for the TREC 2004 Genomics Track triage task as a four step process: feature generation, feature selection, classifier training, an...
Aaron M. Cohen, Ravi Teja Bhupatiraju, William R. ...
In this paper, we show that stylistic text features can be exploited to determine an anonymous author's native language with high accuracy. Specifically, we first use automat...
Semantic similarity between words or phrases is frequently used to find matching correlations between search queries and documents when straightforward matching of terms fails. Th...