Although documents have hundreds of thousands of unique words, only a small number of words are significantly useful for intelligent services. For this reason, feature extraction ...
We present a mixture model based approach for learning individualized behavior models for the Web users. We investigate the use of maximum entropy and Markov mixture models for ge...
Environmental research and knowledge discovery both require extensive use of data stored in various sources and created in different ways for diverse purposes. We describe a new m...
Zhiyuan Chen, Aryya Gangopadhyay, George Karabatis...
—Discovering program behaviors and functionalities can ease program comprehension and verification. Existing program analysis approaches have used text mining algorithms to infer...
Given a string s, the Parikh vector of s, denoted p(s), counts the multiplicity of each character in s. Searching for a match of Parikh vector q (a “jumbled string”) in the tex...
Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, ...