Sciweavers

567 search results - page 48 / 114
» Regularized Policy Iteration
CCS
2005
ACM
14 years 1 month ago
Preventing attribute information leakage in automated trust negotiation
Automated trust negotiation is an approach which establishes trust between strangers through the bilateral, iterative disclosure of digital credentials. Sensitive credentials are ...
Keith Irwin, Ting Yu
CORR
2006
Springer
13 years 7 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
KDD
2009
ACM
14 years 8 months ago
A generalized Co-HITS algorithm and its application to bipartite graphs
Recently many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and ...
Hongbo Deng, Michael R. Lyu, Irwin King
ISLPED
1999
ACM
13 years 12 months ago
Stochastic modeling of a power-managed system: construction and optimization
The goal of a dynamic power management policy is to reduce the power consumption of an electronic system by putting system components into different states, each representing ce...
Qinru Qiu, Qing Wu, Massoud Pedram
ECAI
2006
Springer
13 years 11 months ago
Strategic Foresighted Learning in Competitive Multi-Agent Games
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-agent games. We make the observation that in a competitive setting with adaptive...
Pieter Jan 't Hoen, Sander M. Bohte, Han La Poutr&e...