Sciweavers

» Sampling the Web as Training Data for Text Classification
AAAI
2008
15 years 5 months ago
Transfer Learning via Dimensionality Reduction
Transfer learning addresses the problem of how to utilize plenty of labeled data in a source domain to solve related but different problems in a target domain, even when the train...
Sinno Jialin Pan, James T. Kwok, Qiang Yang
WWW
2004
ACM
16 years 3 months ago
Web taxonomy integration using support vector machines
We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only currently pervasive on the web, but also important to the eme...
Dell Zhang, Wee Sun Lee
WWW
2008
ACM
16 years 3 months ago
Mining for personal name aliases on the web
We propose a novel approach to find aliases of a given name from the web. We exploit a set of known names and their aliases as training data and extract lexical patterns that conv...
Danushka Bollegala, Taiki Honma, Yutaka Matsuo, Mi...
ERCIMDL
2005
Springer
15 years 8 months ago
Focused Crawling Using Latent Semantic Indexing - An Application for Vertical Search Engines
Vertical search engines and web portals are gaining ground over the general-purpose engines due to their limited size and their high precision for the domain they cover. The number...
George Almpanidis, Constantine Kotropoulos, Ioanni...
ICMLA
2008
15 years 4 months ago
Highly Scalable SVM Modeling with Random Granulation for Spam Sender Detection
Spam sender detection based on email subject data is a complex large-scale text mining task. The dataset consists of email subject lines and the corresponding IP address of the em...
Yuchun Tang, Yuanchen He, Sven Krasser