Recently, a flash memory has become a major database storage in building portable information devices because of its non-volatile, shock-resistant, power-economic nature, and fast...
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
In recent years, there has been a prevalence of search engines being employed to find useful information in the Web as they efficiently explore hyperlinks between web pages which ...
Zhenglu Yang, Lin Li, Botao Wang, Masaru Kitsurega...
This paper describes an intelligent agent to facilitate bitext mining from the Web via automatic discovery of URL pairing patterns (or keys) for retrieving parallel web pages. The...
When we describe a Web page informally, we often use phrases like it looks like a newspaper site", there are several unordered lists" or it's just a collection of li...
Isabel F. Cruz, Slava Borisov, Michael A. Marks, T...