Sciweavers

794 search results - page 151 / 159
» On the Information Content of Semi-Structured Databases
KDD
2009
ACM
229 views, Data Mining
14 years 8 months ago
Relational learning via latent social dimensions
Social media such as blogs, Facebook, Flickr, etc., presents data in a network format rather than classical IID distribution. To address the interdependency among data instances, ...
Lei Tang, Huan Liu
KDD
2006
ACM
179 views, Data Mining
14 years 8 months ago
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee
SIGMOD
2008
ACM
142 views, Database
14 years 8 months ago
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
Approximate queries on a collection of strings are important in many applications such as record linkage, spell checking, and Web search, where inconsistencies and errors exist in...
Xiaochun Yang, Bin Wang, Chen Li
DASFAA
2007
IEEE
240 views, Database
14 years 2 months ago
A Comparative Study of Ontology Based Term Similarity Measures on PubMed Document Clustering
Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic simila...
Xiaodan Zhang, Liping Jing, Xiaohua Hu, Michael K....
DASFAA
2005
IEEE
106 views, Database
14 years 1 month ago
Real Datasets for File-Sharing Peer-to-Peer Systems
The fundamental drawback of unstructured peer-to-peer (P2P) networks is the flooding-based query processing protocol that seriously limits their scalability. As a result, a signifi...
Shen-Tat Goh, Panos Kalnis, Spiridon Bakiras, Kian...