Sciweavers

147 search results - page 10 / 30
» Policy Gradient in Continuous Time
ICRA
2008
IEEE
14 years 4 months ago
Compliant manipulation for peg-in-hole: Is passive compliance a key to learn contact motion?
We examine the usefulness of passive compliance in a manipulator that learns contact motion. Based on the observation that humans outperform robots at contact motion, we foll...
Seung-kook Yun
IJCNN
2000
IEEE
14 years 2 months ago
The Inefficiency of Batch Training for Large Training Sets
Multilayer perceptrons are often trained using error backpropagation (BP). BP training can be done in either a batch or continuous manner. Claims have frequently been made that bat...
D. Randall Wilson, Tony R. Martinez
MST
2011
13 years 4 months ago
Performance of Scheduling Policies in Adversarial Networks with Non-synchronized Clocks
In this paper we generalize the Continuous Adversarial Queuing Theory (CAQT) model [5] by considering the possibility that the router clocks in the network are not synchronized. W...
Antonio Fernández Anta, José Luis L&...
MM
1994
ACM
14 years 1 month ago
Scheduling Policies for an On-Demand Video Server with Batching
In an on-demand video server environment, clients make requests for movies to a centralized video server. Due to the stringent response time requirements, continuous delivery of a...
Asit Dan, Dinkar Sitaram, Perwez Shahabuddin
UAI
2000
13 years 11 months ago
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Andrew Y. Ng, Michael I. Jordan