Recent innovations have resulted in a plethora of social applications on the Web, such as blogs, social networks, and community photo and video sharing applications. Such applicat...
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
Background: A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs require...
Lukas A. Mueller, Adri A. Mills, Beth Skwarecki, R...
Ambiguous person names are a problem in many forms of written text, including that which is found on the Web. In this paper we explore the use of unsupervised clustering techniques...
Vocabulary restrictions in large vocabulary continuous speech recognition (LVCSR) systems mean that out-of-vocabulary (OOV) words are lost in the output. However, OOV words tend t...
Carolina Parada, Abhinav Sethy, Mark Dredze, Frede...