Work stealing for interactive services to meet target latency


Download www.cse.wustl.edu

Interactive web services increasingly drive critical business workloads such as search, advertising, games, shopping, and finance. Whereas the optimization of parallel programs and distributed server systems has historically focused on average latency and throughput, the primary metric for interactive applications is instead consistent responsiveness, i.e., minimizing the number of requests that miss a target latency. This paper is the first to show how to generalize work stealing, which is traditionally used to minimize the makespan of a single parallel job, to optimize for a target latency in interactive services with multiple parallel requests. We design a new adaptive work-stealing policy, called tail-control, that reduces the number of requests that miss a target latency. It uses instantaneous request progress, system load, and a target latency to choose when to parallelize requests with stealing, when to admit new requests, and when to limit parallelism of large requests. We implement ...
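The three decisions named in the abstract (steal, admit, limit parallelism) can be illustrated with a toy decision rule. This is only a minimal sketch under stated assumptions, not the policy from the paper: the `Request` type, the `idle_worker_action` function, the `large_fraction` threshold, and the definition of "heavily loaded" are all hypothetical names and choices introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical view of a running request (not from the paper)."""
    name: str
    elapsed_ms: float  # progress proxy: time spent executing so far


def idle_worker_action(running, queued, target_ms, total_workers,
                       large_fraction=0.5):
    """Decide what an idle worker does next.

    Returns ("steal", request), ("admit", None), or ("idle", None).
    Assumption: the system counts as heavily loaded when running plus
    queued requests reach the worker count, and a request counts as
    "large" once it has used `large_fraction` of the target latency.
    """
    heavily_loaded = len(running) + queued >= total_workers
    threshold_ms = large_fraction * target_ms

    # Under heavy load, stop adding parallelism to large requests so
    # they cannot crowd out the many short ones.
    candidates = [r for r in running
                  if not (heavily_loaded and r.elapsed_ms >= threshold_ms)]

    if candidates:
        # Help the eligible request that has been running longest.
        victim = max(candidates, key=lambda r: r.elapsed_ms)
        return ("steal", victim)
    if queued > 0:
        # Nothing eligible to steal from: admit a waiting request.
        return ("admit", None)
    return ("idle", None)
```

In this sketch, a lightly loaded system parallelizes every request freely, while a heavily loaded one leaves long-running requests with the workers they already have and spends idle capacity on short or newly admitted requests.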

Jing Li, Kunal Agrawal, Sameh Elnikety, Yuxiong He


Distributed And Parallel Computing | PPOPP 2016 |



Post Info

Added	09 Apr 2016
Updated	09 Apr 2016
Type	Journal
Year	2016
Where	PPOPP
Authors	Jing Li, Kunal Agrawal, Sameh Elnikety, Yuxiong He, I-Ting Angelina Lee, Chenyang Lu, Kathryn S. McKinley


Sciweavers