Diff of /docs/_docs/nips2017/evaluation.md [000000] .. [077a87]

--- a
+++ b/docs/_docs/nips2017/evaluation.md
@@ -0,0 +1,59 @@
+---
+title: Evaluation
+---
+
+Your task is to build a function `f` which takes the current state `observation` (a 41-dimensional vector) and returns the muscle excitations `action` (an 18-dimensional vector) in a way that maximizes the reward.
+
+The trial ends either if the pelvis of the model drops below `0.65` meters or if you reach `1000` iterations (corresponding to `10` seconds in the virtual environment). Your total reward is the position of the pelvis on the `x` axis after the last iteration minus a penalty for using ligament forces. Ligaments are tissues that prevent your joints from bending too much; overusing them leads to injuries, so we want to avoid it. The penalty in the total reward is equal to the sum of forces generated by ligaments over the trial, divided by `10,000,000`.
+
+After each iteration you get a reward equal to the change in the pelvis position along the `x` axis during this iteration minus the magnitude of the ligament forces used in that iteration.
+
+You can test your model on your local machine. For submission, you will need to interact with the remote environment: [crowdAI](https://www.crowdai.org/challenges/nips-2017-learning-to-run) sends you the current `observation` and you need to send back the action you take in the given state. You will be evaluated at three different levels of difficulty.
+
+<!--
+For details, please refer to [Details of the environment](#details-of-the-environment).
+
+## Submission
+
+Assuming your controller is trained and is represented as a function `my_controller(observation)` returning an `action`, you can submit it to [crowdAI](https://www.crowdai.org/challenges/nips-2017-learning-to-run) by interacting with the remote environment there:
+
+```python
+import opensim as osim
+from osim.http.client import Client
+from osim.env import RunEnv
+
+# Settings
+remote_base = "http://grader.crowdai.org:1729"
+crowdai_token = "[YOUR_CROWD_AI_TOKEN_HERE]"
+
+client = Client(remote_base)
+
+# Create environment
+observation = client.env_create(crowdai_token)
+
+# IMPLEMENTATION OF YOUR CONTROLLER
+# my_controller = ... (for example the one trained in keras_rl)
+
+while True:
+    [observation, reward, done, info] = client.env_step(my_controller(observation), True)
+    print(observation)
+    if done:
+        observation = client.env_reset()
+        if not observation:
+            break
+
+client.submit()
+```
+
+Replace `[YOUR_CROWD_AI_TOKEN_HERE]` with the token from your profile page on the [crowdai.org](http://crowdai.org/) website.
+
+Note that during the submission, the environment will get restarted. Since the environment is stochastic, you will need to submit three trials -- this way we make sure that your model is robust.
+
+## Rules
+
+In order to avoid overfitting to the training environment, the top participants (those who obtained 15.0 points or more) will be asked to resubmit their solutions in the second round of the challenge. Environments in the second round will have the same structure but **10 obstacles** and different seeds. In each submission, there will be **10 simulations**. Each participant will have a limit of **3 submissions**. The final ranking will be based on the results from the second round.
+
+Additional rules:
+* You are not allowed to use external datasets (e.g., kinematics of people walking).
+* Organizers reserve the right to modify challenge rules as required.
+-->
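
The reward and termination rules described in the added `evaluation.md` text can be sketched as follows. This is only an illustration, not the environment's actual implementation: the function names are hypothetical, and two assumptions are made that the text does not state explicitly, namely that the per-iteration ligament penalty uses the same `10,000,000` divisor as the total penalty, and that the pelvis starts at `x = 0` so that the per-iteration rewards sum to the total reward.

```python
# Illustrative sketch only; names and the per-step scaling are assumptions.
LIGAMENT_SCALE = 10_000_000.0  # divisor stated in the text for the total penalty
MIN_PELVIS_HEIGHT = 0.65       # meters; the trial ends below this height
MAX_ITERATIONS = 1000          # corresponds to 10 seconds of simulated time


def step_reward(prev_x, curr_x, ligament_force):
    """Change in pelvis x during one iteration minus the (assumed scaled) ligament penalty."""
    return (curr_x - prev_x) - ligament_force / LIGAMENT_SCALE


def total_reward(pelvis_x, ligament_forces):
    """Sum of per-iteration rewards.

    pelvis_x holds the pelvis x position before the first iteration and after
    each iteration; ligament_forces holds one force magnitude per iteration.
    """
    pairs = zip(pelvis_x, pelvis_x[1:], ligament_forces)
    return sum(step_reward(prev, curr, force) for prev, curr, force in pairs)


def trial_done(pelvis_height, iteration):
    """True once the pelvis drops below 0.65 m or 1000 iterations have elapsed."""
    return pelvis_height < MIN_PELVIS_HEIGHT or iteration >= MAX_ITERATIONS
```

Under these assumptions, `total_reward` equals the final pelvis `x` position minus the summed ligament forces divided by `10,000,000`, which matches the total reward as described in the text.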