Diff of /README.md [000000] .. [687a25]

--- a
+++ b/README.md
@@ -0,0 +1,20 @@
+# -NIPS-2017-Learning-to-Run
+Reinforcement learning environments with musculoskeletal models   
+https://www.crowdai.org/challenges/nips-2017-learning-to-run  
+  
+CANDIDATE ALGORITHMS:  
+
+Deep Deterministic Policy Gradient (DDPG): https://arxiv.org/abs/1509.02971  
+Recurrent Deterministic Policy Gradient (RDPG): https://arxiv.org/pdf/1512.04455.pdf  
+Trust  
+MOTIVATION:  
+In this task we control muscle activations rather than the velocities of the body parts. If I understood the problem correctly, this means the actions change acceleration, so the dynamics should be treated as a second-order Markov chain. An LSTM or another recurrent network might be able to capture the resulting long- and short-term dependencies.  
+  
+FOLDERS AND FILES:  
+DDPG: The standard implementation of DDPG in TensorFlow, following the pseudocode in the paper.  
+RDPG: The recurrent version of DDPG in TensorFlow (there are complications that have not been resolved).  
+  
+DEPENDENCIES:  
+TensorFlow  
+NumPy  
+math (Python standard library)  
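
The MOTIVATION note in the README argues that controlling acceleration makes the problem second-order. A minimal sketch of that idea follows, assuming a toy 1-D point mass rather than the actual musculoskeletal environment. The names `simulate`, `stack_frames`, and `dt` are hypothetical and do not come from the repository. The sketch shows that a single position observation hides the velocity, while two stacked consecutive observations are enough to recover it. Frame stacking is shown here only as a simple alternative to the recurrent approach the README proposes.

```python
import numpy as np


def simulate(accels, dt=0.1):
    """Integrate a 1-D point mass whose control input is acceleration (toy model)."""
    pos, vel = 0.0, 0.0
    positions = [pos]
    for a in accels:
        vel += a * dt    # the action changes the velocity...
        pos += vel * dt  # ...and the velocity changes the position
        positions.append(pos)
    return np.array(positions)


def stack_frames(observations, k=2):
    """Return rows [o_{t-k+1}, ..., o_t]; early rows are padded by repeating o_0."""
    obs = np.asarray(observations, dtype=float)
    padded = np.concatenate([np.repeat(obs[:1], k - 1, axis=0), obs])
    return np.stack([padded[i:i + len(obs)] for i in range(k)], axis=1)


if __name__ == "__main__":
    dt = 0.1
    positions = simulate([1.0, 1.0, 0.0, -1.0], dt=dt)
    stacked = stack_frames(positions, k=2)
    # A finite difference over the two stacked frames recovers the hidden velocity.
    velocity_estimate = (stacked[:, 1] - stacked[:, 0]) / dt
    print(stacked)
    print(velocity_estimate)
```

With `k=2` each row holds the previous and the current position, so a feed-forward policy sees enough to infer velocity in this toy setting. A recurrent policy such as RDPG would instead have to learn to carry that information in its hidden state.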