This paper proposes the integration of semantic information drawn from a web application’s domain knowledge into all phases of the web usage mining process (preprocessing, patte...
The problem of automatically classifying the gender of a blog author has important applications in many commercial domains. Existing systems mainly use features such as words, wor...
This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation...
Existing sequence mining algorithms mostly focus on mining for subsequences. However, a large class of applications, such as biological DNA and protein motif mining, require effici...
Logistic Regression (LR) has been widely used in statistics for many years, and has received extensive study in machine learning community recently due to its close relations to S...
Jian Zhang, Rong Jin, Yiming Yang, Alexander G. Ha...