WWW
2010
ACM
14 years 2 months ago
Predicting positive and negative links in online social networks
We study online social networks in which relationships can be either positive (indicating relations such as friendship) or negative (indicating relations such as opposition or ant...
Jure Leskovec, Daniel P. Huttenlocher, Jon M. Klei...
WWW
2010
ACM
14 years 2 months ago
Not so creepy crawler: easy crawler generation with standard xml queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
WWW
2010
ACM
14 years 2 months ago
Using landing pages for sponsored search ad selection
We explore the use of the landing page content in sponsored search ad selection. Specifically, we compare the use of the ad’s intrinsic content to augmenting the ad with the wh...
Yejin Choi, Marcus Fontoura, Evgeniy Gabrilovich, ...
WWW
2010
ACM
14 years 2 months ago
Collaborative location and activity recommendations with GPS history data
With the increasing popularity of location-based services, such as tour guide and location-based social network, we now have accumulated many location data on the Web. In this pap...
Vincent Wenchen Zheng, Yu Zheng, Xing Xie, Qiang Y...
WWW
2010
ACM
14 years 2 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
WWW
2010
ACM
14 years 2 months ago
Regular expressions considered harmful in client-side XSS filters
Cross-site scripting flaws have now surpassed buffer overflows as the world’s most common publicly-reported security vulnerability. In recent years, browser vendors and resea...
Daniel Bates, Adam Barth, Collin Jackson
WWW
2010
ACM
14 years 2 months ago
The social honeypot project: protecting online communities from spammers
We present the conceptual framework of the Social Honeypot Project for uncovering social spammers who target online communities and initial empirical results from Twitter and MySp...
Kyumin Lee, James Caverlee, Steve Webb
WWW
2010
ACM
14 years 2 months ago
SourceRank: relevance and trust assessment for deep web sources based on inter-source agreement
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
Raju Balakrishnan, Subbarao Kambhampati
WWW
2010
ACM
14 years 2 months ago
RESTler: crawling RESTful services
Service descriptions allow designers to document, understand, and use services, creating new useful and complex services with aggregated business value. Unlike RPC-based services,...
Rosa Alarcón, Erik Wilde
WWW
2010
ACM
14 years 2 months ago
Visual structure-based web page clustering and retrieval
Clustering and retrieval of web pages dominantly relies on analyzing either the content of individual web pages or the link structure between them. Some literature also suggests t...
Paul Bohunsky, Wolfgang Gatterbauer