The increasing complexity of enterprise databases and the prevalent lack of documentation incur significant cost in both understanding and integrating the databases. Existing solu...
Named entity recognition studies the problem of locating and classifying parts of free text into a set of predefined categories. Although extensive research has focused on the de...
A major obstacle that decreases the performance of text classifiers is the extremely high dimensionality of text data. To reduce the dimension, a number of approaches based on rou...
Following the advent of the Internet technology and the rapid growth of its applications, users have spent long periods of time browsing through the ocean of information found in ...
Web search logs contain extremely sensitive data, as evidenced by the recent AOL incident. However, storing and analyzing search logs can be very useful for many purposes (i.e. in...