We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In co...
With the emergence of XML as the de facto standard to exchange and disseminate information, the problem of regulating access to XML documents has attracted a considerable attentio...
This paper focuses on the creation of a first order predicate calculus based regulation compliance-assistance system built upon an XML framework. Two areas of research that suppor...
We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, wit...
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, C...
In this paper, we adapt a statistical learning approach, inspired by automated topic segmentation techniques in speech-recognized documents to the challenging protein segmentation ...
Betty Yee Man Cheng, Jaime G. Carbonell, Judith Kl...