Since current search engines employ link-based ranking algorithms as an important tool to rank sites, Web spammers are making a significant effort to man...
One of the biggest challenges in speaker recognition is dealing with speaker-emotion variability. The basic problem is how to train the emotion GMMs of the speakers from their neu...
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
A ubiquitous city (U-city) is one where everything is interconnected and information is shared instantaneously. In a U-city, people can access a variety of web data in ...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
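As a rough illustration of the two fingerprinting ideas named in the abstract above, the following minimal sketch builds word-level shingles compared by Jaccard similarity, and a simhash-style bit fingerprint compared by Hamming distance. It is not the method of the paper being summarized. The function names, the choice of 3-word shingles, the 64-bit fingerprint width, and the use of MD5 as the token hash are all assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if the text is empty).
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hash64(token):
    """Hypothetical helper: 64-bit integer hash from the first 8 MD5 bytes."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(features, bits=64):
    """Simhash-style fingerprint: per-bit majority vote over feature hashes."""
    counts = [0] * bits
    for feature in features:
        h = hash64(feature)
        for i in range(bits):
            # Vote +1 if bit i of the feature hash is set, otherwise -1.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(x, y):
    """Number of bit positions in which two fingerprints differ."""
    return bin(x ^ y).count("1")
```

In a sketch like this, two documents would be flagged as near-duplicates when the Jaccard similarity of their shingle sets is high, or when the Hamming distance between their fingerprints falls below a chosen threshold. Both thresholds are application-specific and are not specified here.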