Diff of /README.md [000000] .. [687a25]

a b/README.md
# -NIPS-2017-Learning-to-Run
Reinforcement learning environments with musculoskeletal models   
https://www.crowdai.org/challenges/nips-2017-learning-to-run  
  
CANDIDATE ALGORITHMS:  
Deep Deterministic Policy Gradient---DDPG(https://arxiv.org/abs/1509.02971)  
Recurrent Deterministic Policy Gradient---RDPG(https://arxiv.org/pdf/1512.04455.pdf)  
Trust  
MOTIVATION:  
In this case, we are manipulating the muscles rather than the velocities of the body parts. That is, if I understood the problem correctly, the actions change the acceleration, so the process should be treated as a second-order Markov chain. LSTMs and other recurrent networks might be able to capture the resulting long- and short-term dependencies.  
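
The second-order argument above can be illustrated with a toy example. The sketch below is not the musculoskeletal environment: it assumes a hypothetical 1-D point mass whose action is an acceleration and whose observation is position only. The function names `step` and `predict_next_position` are made up for illustration. It shows that the same position and action can lead to different next positions, and that the last two positions are enough to predict the next one.

```python
def step(position, velocity, acceleration, dt=0.01):
    """Advance a 1-D point mass by one step of semi-implicit Euler integration."""
    velocity = velocity + acceleration * dt
    position = position + velocity * dt
    return position, velocity


def predict_next_position(prev_position, position, acceleration, dt=0.01):
    """Predict the next position from the last two positions and the action.

    The velocity is recovered by a finite difference, so two consecutive
    observations carry the information that a single one lacks.
    """
    velocity = (position - prev_position) / dt
    next_position, _ = step(position, velocity, acceleration, dt)
    return next_position


# Two histories that reach the same position (0.0) with different velocities.
dt = 0.01
action = 1.0  # assumed acceleration command, identical in both cases
slow_next = predict_next_position(-0.01, 0.0, action, dt)
fast_next = predict_next_position(-0.05, 0.0, action, dt)

# Same current observation and action, different outcome: position alone
# is not a Markov state, but the pair (previous, current) is.
print(slow_next, fast_next)
```

A recurrent policy can in principle learn this kind of summary of past observations on its own, which is the reason RDPG is listed as a candidate.
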
  
FOLDERS AND FILES:  
DDPG: The standard implementation of DDPG in TensorFlow, following the pseudocode in the paper.  
RDPG: The recurrent version of DDPG in TensorFlow (there are complications that have not been resolved).
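
For orientation, two update rules from the DDPG pseudocode can be sketched in plain Python. This is a minimal illustration and is not taken from the code in the DDPG folder: the function names `td_target` and `soft_update` are hypothetical, parameters are shown as flat lists of floats rather than TensorFlow variables, and the default values of `gamma` and `tau` are only illustrative.

```python
def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the next state and the target
    actor's action; terminal transitions do not bootstrap.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.001):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]


# Example with made-up numbers.
y = td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
print(y, new_target)
```

The slowly moving target networks keep the critic's regression target from shifting too quickly, which is the stabilising idea the paper relies on.
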
  
DEPENDENCIES:
TensorFlow
numpy
math