Sciweavers

147 search results - page 10 / 30
» Policy Gradient in Continuous Time
ICRA
2008
IEEE
15 years 9 months ago
Compliant manipulation for peg-in-hole: Is passive compliance a key to learn contact motion?
We examine the usefulness of passive compliance in a manipulator that learns contact motion. Based on the observation that humans outperform robots at contact motion, we foll...
Seung-kook Yun
IJCNN
2000
IEEE
15 years 7 months ago
The Inefficiency of Batch Training for Large Training Sets
Multilayer perceptrons are often trained using error backpropagation (BP). BP training can be done in either a batch or continuous manner. Claims have frequently been made that bat...
D. Randall Wilson, Tony R. Martinez
MST
2011
14 years 10 months ago
Performance of Scheduling Policies in Adversarial Networks with Non-synchronized Clocks
In this paper we generalize the Continuous Adversarial Queuing Theory (CAQT) model [5] by considering the possibility that the router clocks in the network are not synchronized. W...
Antonio Fernández Anta, José Luis L...
MM
1994
ACM
15 years 7 months ago
Scheduling Policies for an On-Demand Video Server with Batching
In an on-demand video server environment, clients make requests for movies to a centralized video server. Due to the stringent response time requirements, continuous delivery of a...
Asit Dan, Dinkar Sitaram, Perwez Shahabuddin
UAI
2000
15 years 4 months ago
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Andrew Y. Ng, Michael I. Jordan