Sciweavers

483 search results - page 62 / 97
» Sampling the Web as Training Data for Text Classification
ECAI
2006
Springer
15 years 6 months ago
Automatic Term Categorization by Extracting Knowledge from the Web
This paper addresses the problem of categorizing terms or lexical entities into a predefined set of semantic domains exploiting the knowledge available on-line in the Web. The prop...
Leonardo Rigutini, Ernesto Di Iorio, Marco Ernande...
IEEEMM
2007
15 years 2 months ago
Learning Microarray Gene Expression Data by Hybrid Discriminant Analysis
Microarray technology offers a high-throughput means to study expression networks and gene regulatory networks in cells. The intrinsic nature of high dimensionality and small s...
Yijuan Lu, Qi Tian, Maribel Sanchez, Jennifer L. N...
WWW
2005
ACM
16 years 3 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
WWW
2010
ACM
15 years 9 months ago
Large-scale bot detection for search engines
In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
KDD
2007
ACM
16 years 3 months ago
Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis
To unravel the concept structure and dynamics of the bioinformatics field, we analyze a set of 7401 publications from the Web of Science and MEDLINE databases, publication years 1...
Bart De Moor, Frizo A. L. Janssens, Wolfgang Glä...