Sciweavers

542 search results - page 18 / 109
» Learning author-topic models from text corpora
ICML
2008
IEEE
14 years 8 months ago
Semi-supervised learning of compact document representations with deep networks
Finding good representations of text documents is crucial in information retrieval and classification systems. Today the most popular document representation is based on a vector ...
Marc'Aurelio Ranzato, Martin Szummer
IPM
2008
13 years 7 months ago
Author identification: Using text sampling to handle the class imbalance problem
Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorizatio...
Efstathios Stamatatos
ICML
2006
IEEE
14 years 8 months ago
Pachinko allocation: DAG-structured mixture models of topic correlations
Latent Dirichlet allocation (LDA) and other related topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, LDA does not ca...
Wei Li, Andrew McCallum
IJCNN
2007
IEEE
14 years 1 month ago
Text Representations for Text Categorization: A Case Study in Biomedical Domain
In vector space model (VSM), textual documents are represented as vectors in the term space. Therefore, there are two issues in this representation, i.e. (1) what should a term...
Man Lan, Chew Lim Tan, Jian Su, Hwee-Boon Low
JAIR
2007
13 years 7 months ago
Topic and Role Discovery in Social Networks with Experiments on Enron and Academic Email
Previous work in social network analysis (SNA) has modeled the existence of links from one entity to another, but not the attributes such as language content or topics on those li...
Andrew McCallum, Xuerui Wang, Andrés Corrad...