We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore co...
Stanislav Angelov, Boulos Harb, Sampath Kannan, Sa...
This paper describes a speed-up of string pattern matching by rearranging states in the Aho-Corasick pattern matching machine, which is a kind of finite automaton. We realized speed-up of st...
T. Nishimura, Shuichi Fukamachi, Takeshi Shinohara
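For readers unfamiliar with the Aho-Corasick pattern matching machine named in the abstract above, the following is a minimal sketch of the textbook construction: a trie of the patterns, plus failure links, scanned as a finite automaton. It does not reproduce the state rearrangement the paper proposes, whose details are not given in this listing. The names `build_machine` and `find_all` are illustrative assumptions, not taken from the paper.

```python
from collections import deque


def build_machine(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[s] maps a character to the next state; state 0 is the root
    fail = [0]    # fail[s] is the state to fall back to on a mismatch
    out = [[]]    # out[s] lists the patterns recognized on reaching state s

    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)

    # Phase 2: compute failure links breadth-first, so that shallower
    # states are always finished before the states that depend on them.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit the patterns that end at the fallback state.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def find_all(text, machine):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = machine
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


machine = build_machine(["he", "she", "his", "hers"])
print(find_all("ushers", machine))
```

In this sketch states are numbered in the order they are created during trie insertion. A rearrangement scheme such as the one the abstract describes would presumably renumber or reorder these states, but that step is not shown here.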
The problem of question/answering (Q/A) is to find answers to open-domain questions by searching large collections of documents. Unlike information retrieval systems, very common ...
Mihai Surdeanu, Dan I. Moldovan, Sanda M. Harabagi...
Local tag structures have become frequent through Web 2.0: Users "tag" their data without specifying the underlying semantics. Every user annotates items in an individual...