We propose a model for user purchase behavior in online stores that provide recommendation services. We model the purchase probability given recommendations for each user based on...
We investigate the effect of search engine brand (i.e., the identifying name or logo that distinguishes a product from its competitors) on evaluation of system performance. This r...
The Biased Minimax Probability Machine (BMPM) constructs a classifier which deals with the imbalanced learning tasks. In this paper, we propose a Second Order Cone Programming (SO...
It is now feasible to view media at home as easily as text-based pages were viewed when the World Wide Web (WWW) first emerged. This development has supported media sharing and se...
Automated email-based password reestablishment (EBPR) is an efficient, cost-effective means to deal with forgotten passwords. In this technique, email providers authenticate users ...
PageRank is known to be an efficient metric for computing general document importance in the Web. While commonly used as a one-size-fits-all measure, the ability to produce topica...
Due to resource constraints, Web archiving systems and search engines usually have difficulties keeping the entire local repository synchronized with the Web. We advance the state...
Qingzhao Tan, Ziming Zhuang, Prasenjit Mitra, C. L...
Many text documents on the Web are not originally created but forwarded or copied from other source documents. The phenomenon of document forwarding or transmission between variou...
Enterprise and web data processing and content aggregation systems often require extensive use of human-reviewed data (e.g. for training and monitoring machine learning-based appl...
Qi Su, Dmitry Pavlov, Jyh-Herng Chow, Wendell C. B...
Previous studies have highlighted the high arrival rate of new content on the web. We study the extent to which this new content can be efficiently discovered by a crawler. Our st...
Anirban Dasgupta, Arpita Ghosh, Ravi Kumar, Christ...