In this paper, we describe CALM, a method for building statistical language models for the Web. CALM addresses several challenges unique to Web content. First, CALM...
This report explains our plagiarism detection method, which uses a fuzzy semantic-based string similarity approach. The algorithm was developed in four main stages. First is pre-proce...
This study borrowed sequence analysis techniques from the genetic sciences and applied them to a similar problem in email filtering and web searching. Genre identification is the ...
BreakingStory is an interactive system for visualizing change in online news. The system regularly collects the text from the front pages of international daily news web sites. It...
Jean Anne Fitzpatrick, James Reffell, Moryma Aydel...
Background: Cellular processes require the interaction of many proteins across several cellular compartments. Determining the collective network of such interactions is an importa...
Harold J. Drabkin, Christopher Hollenbeck, David P...