Sciweavers

1815 search results - page 66 / 363
» From Web Pages to Web Communities
Sort
View
WWW
2006
ACM
16 years 5 months ago
WebKhoj: Indian language IR from multiple character encodings
Today web search engines provide the easiest way to reach information on the web. In this scenario, more than 95% of Indian language content on the web is not searchable due to mu...
Prasad Pingali, Jagadeesh Jagarlamudi, Vasudeva Va...
KDD
2002
ACM
293views Data Mining» more  KDD 2002»
16 years 5 months ago
Automatic Categorization of Web Pages and User Clustering with Mixtures of Hidden Markov Models
We propose mixtures of hidden Markov models for modelling clickstreams of web surfers. Hence, the page categorization is learned from the data without the need for a (possibly cumb...
Alexander Ypma, Tom Heskes
WWW
2007
ACM
16 years 5 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
WWW
2007
ACM
16 years 5 months ago
Homepage live: automatic block tracing for web personalization
The emergence of personalized homepage services, e.g. personalized Google Homepage and Microsoft Windows Live, has enabled Web users to select Web contents of interest and to aggr...
Jie Han, Dingyi Han, Chenxi Lin, Hua-Jun Zeng, Zhe...
ITCC
2005
IEEE
15 years 10 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang