One may need to build a statistical parser for a new language, using only a very small labeled treebank together with raw text. We argue that bootstrapping a parser is most promis...
Large-volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
With the rapid growth of PDF documents in digital libraries, recognizing the document structure and detecting specific document components are useful for document storage, classifica...
Background: The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the in...
Zheng Yin, Xiaobo Zhou, Chris Bakal, Fuhai Li, You...
Background: Alternative splicing (AS) is a process which generates several distinct mRNA isoforms from the same gene by splicing different portions out of the precursor transcript...
Johannes Eichner, Georg Zeller, Sascha Laubinger, ...