A substantial subset of web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents...
We propose a Web recommendation system based on a maximum entropy model. Under the maximum entropy principle, we can combine multiple levels of knowledge about users’ navigation...
The Web is a distributed network of information sources where the individual sources are autonomously created and maintained. Consequently, syntactic and semantic heterogeneity of ...
Classification algorithms and document representation approaches are two key elements for a successful document classification system. In the past, much work has been conducted to...
When a user is served with a ranked list of relevant documents by the standard document search engines, his search task is usually not over. He has to go through the entire docume...