--- a/docs/_posts/2017-10-02-osim-rl-1.4.1-released.markdown
+++ b/docs/_posts/2017-10-02-osim-rl-1.4.1-released.markdown
@@ -0,0 +1,31 @@
+---
+title: 'Version 1.4.1 released'
+date: 2017-10-02 11:56:01 -0500
+author: kidzik
+version: 1.4.1
+categories: [release]
+---
+
+After discussing the way the reward function is computed ([issue #43](https://github.com/stanfordnmbl/osim-rl/issues/43)), we decided to update the environment further. Up to version 1.3, the reward received at every step was the total distance travelled from the starting point minus the ligament forces. As a result, the total reward was the cumulative sum of those distances over all steps (or a discrete integral of position over time) minus the total sum of ligament forces.
+
+Since this reward is unconventional in reinforcement learning, we changed the reward at each step to the distance increment between two consecutive steps minus the ligament forces (see the sketch at the end of this post). As a result, the total reward is the total distance travelled minus the ligament forces.
+
+To switch to the new environment, update the `osim-rl` scripts with the following command:
+
+    pip install git+https://github.com/stanfordnmbl/osim-rl.git -U
+
+Note that this changes the order of magnitude of the total reward from ~1000 to ~10 (it is now measured in meters travelled). The change does not affect the API of observations and actions. Moreover, the two measures are strongly correlated, and a good model in the old version should perform well in the current version.
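+
+For illustration, here is a minimal sketch of the change in Python, using hypothetical names (`pelvis_x`, `ligament_penalty`) rather than the actual osim-rl internals:
+
+    def reward_v13(pelvis_x, start_x, ligament_penalty):
+        # Up to v1.3: the per-step reward was the total distance
+        # travelled so far, minus the ligament forces.
+        return (pelvis_x - start_x) - ligament_penalty
+
+    def reward_v141(pelvis_x, prev_pelvis_x, ligament_penalty):
+        # From v1.4.1: the per-step reward is the distance gained
+        # since the previous step, minus the ligament forces.
+        return (pelvis_x - prev_pelvis_x) - ligament_penalty
+
+Summing the new per-step rewards over an episode telescopes to the total distance travelled minus the accumulated ligament forces, which is why episode totals drop from roughly 1000 to roughly 10.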