While empirical evaluations are a common research method in some areas of Artificial Intelligence (AI), others still neglect this approach. This article outlines both the opportun...
We propose a novel way to induce a random field from an energy function on discrete labels. It amounts to locally injecting noise into the energy potentials, followed by finding t...
Training a good text detector requires a large amount of labeled data, which can be very expensive to obtain. Cotraining has been shown to be a powerful semi-supervised learning t...
Duplicate URLs cause serious problems across the whole search engine pipeline, from crawling and indexing to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
This paper introduces a novel method, GAIS, for detecting interleaved sequential patterns from databases. A case where the data is of low quality and contains errors is considered. Pat...