Sciweavers

ACML
2009
Springer

Injecting Structured Data to Generative Topic Model in Enterprise Settings

Enterprises have steadily accumulated both structured and unstructured data as computing resources have improved. However, previous research on enterprise data mining often treats these two kinds of data independently and overlooks their mutual benefits. We explore an approach for incorporating a common type of structured data (i.e., the organigram) into a generative topic model. Our approach, the Partially Observed Topic model (POT), considers not only the unstructured words but also the structured information in its generative process. Because the structured data is integrated implicitly, the topic mixture over each document is partially observed during the Gibbs sampling procedure. This allows POT to learn topics in a targeted, directed manner, which makes it easy to tune and suitable for end-use applications. We evaluate the proposed model on a real-world dataset and show improved expressiveness over traditional LDA. In the task of document classification, POT also demonstrates more...
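The abstract does not give POT's sampling equations, so the following is only a minimal sketch of the general idea of "partially observed" topics, not the authors' algorithm. It runs standard collapsed Gibbs sampling for LDA, but restricts each document to a set of permitted topics. The function name, the `allowed` encoding (for example, topics derived from an author's position in an organigram), and all hyperparameter values are assumptions made for illustration.

```python
import random


def gibbs_partially_observed(docs, allowed, n_topics, vocab_size,
                             alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA with per-document topic masks.

    docs:    list of documents, each a list of integer word ids.
    allowed: list of sets; allowed[d] holds the topic ids document d may
             use. This encoding of the structured data is hypothetical.
    Returns (doc_topic_counts, topic_word_counts).
    """
    rng = random.Random(seed)
    ndk = [[0] * n_topics for _ in docs]                 # doc-topic counts
    nkw = [[0] * vocab_size for _ in range(n_topics)]    # topic-word counts
    nk = [0] * n_topics                                  # tokens per topic
    z = []                                               # topic of each token

    # Initialise every token with a random topic from its document's mask.
    for d, doc in enumerate(docs):
        choices = sorted(allowed[d])
        zd = []
        for w in doc:
            k = rng.choice(choices)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(zd)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            choices = sorted(allowed[d])
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Standard LDA conditional, evaluated only on allowed topics.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta)
                    / (nk[t] + vocab_size * beta)
                    for t in choices
                ]
                k = rng.choices(choices, weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1
    return ndk, nkw
```

In this sketch, a document whose mask contains a single topic is fully observed, a document whose mask contains every topic behaves as in ordinary LDA, and anything in between is partially observed.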
Han Xiao, Xiaojie Wang, Chao Du
Added 25 May 2010
Updated 25 May 2010
Type Conference
Year 2009
Where ACML
Authors Han Xiao, Xiaojie Wang, Chao Du