Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domai...
The problem of Named Entity Generation is expressed as a conditional probability model over a structured domain. By defining a factor-graph model over the mentions of a text, we o...
We describe a compression model for semistructured documents, called the Structural Contexts Model (SCM), which takes advantage of the context information usually implicit in the stru...
Most text mining methods are based on representing documents using a vector space model, commonly known as a bag-of-words model, where each document is modeled as a linear vector r...
Rowena Chau, Ah Chung Tsoi, Markus Hagenbuchner, V...
In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), which uses rich syntactic information in predicting the next word based on words ...