Improving the precision of information retrieval has been a challenging issue on the Chinese Web. As exemplified by Chinese recipes on the Web, it is not easy or natural for people to us...
Time series motif discovery is an important problem with applications in a variety of areas that range from telecommunications to medicine. Several algorithms have been proposed t...
Personalization systems based on the analysis of users' surfing behavior involve three phases: data collection, pattern discovery, and recommendation. Due to the dimension of log file...
This paper presents a novel algorithm to cluster emails according to their contents and the sentence styles of their subject lines. In our algorithm, natural language processing t...
With the explosion of social media, scalability has become a key challenge. Two main aspects of the problem arise: 1) data volume: how to manage and analyze huge data...
Ching-Yung Lin, Jimeng Sun, Nan Cao, Shixia Liu, S...