Time Relation of Actions and Observations

On the real robot, we use time series for all robot data, such as actions and observations. The user does not send actions directly to the robot but appends them to the “desired actions” time series, which serves as a queue. At each time step, identified by a time index t, the action at position t is taken from the “desired actions” time series and processed. At the same time, an observation is acquired from the robot and added to the “observation” time series. This means that the effect of the desired action a_t is not yet visible in the observation y_t, as illustrated below. (a'_t corresponds to the applied action; see Desired vs Applied Action.)

[Figure: applied_action_dependency.png, illustrating that the applied action a'_t affects only observations from time index t + 1 onward]
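The following toy model illustrates this timing. It is purely illustrative, not the actual robot_interfaces implementation; all names in it are made up for this example:

    from collections import deque

    # Toy model of the stepping loop described above (illustration only).
    desired_actions = deque([1.0, 2.0])  # the "desired actions" queue
    observations = []                    # the "observation" time series
    position = 0.0                       # stand-in for the robot state

    for t in range(2):
        a_t = desired_actions.popleft()  # action at index t is processed
        observations.append(position)    # y_t is acquired at the same time,
                                         # so it does not yet reflect a_t
        position += a_t                  # the effect only shows up in y_{t+1}

    print(observations)  # [0.0, 1.0]: y_1 reflects a_0, y_0 reflects neither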

In the robot_interfaces software, the method append_desired_action() is used to append actions to the time series. It returns the time index t at which the appended action will be executed. Methods like get_observation() expect a time index as input and return the data corresponding to that time step. If the given time index refers to a point in the future, these methods block and wait until that point is reached.
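The following sketch shows this pattern. It assumes that a robot frontend object and an action have already been created; their construction depends on the concrete robot setup and is omitted here:

    # Append the action to the "desired actions" time series.  The
    # returned time index t is the step at which it will be applied.
    t = frontend.append_desired_action(action)

    # Blocks until step t is reached, then returns the observation of
    # that step.  As explained above, y_t does not yet show the effect
    # of the action applied at this step.
    observation = frontend.get_observation(t)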

As the simulation is not real-time critical, its behaviour differs in a few respects:

  • append_desired_action() will directly apply the action and step the simulation.

  • There is no actual time series. The simulation API follows the same principle as the real robot to make the transition easier; however, it is implemented with a buffer size of 1, so the getter methods only provide data for the current time step.

  • It is possible to access information from t + 1. In a typical gym environment, the observation returned by step(action) is expected to belong to the moment after the given action is executed (this corresponds to the time index t + 1). To make it easier to get started, we therefore allow access to the observation of this time index in the simulation (see the sketch below this list). Note, however, that this is not possible on the real robot!
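For example, the following sketch (under the same assumptions as above, i.e. an existing frontend and action) shows how a gym-style step can be realised in simulation:

    t = frontend.append_desired_action(action)  # also steps the simulation

    # In simulation, the state resulting from the action is already
    # available, so the observation of time index t + 1 can be accessed
    # directly.  This matches what step(action) of a typical gym
    # environment is expected to return.  On the real robot, this call
    # is not possible.
    observation = frontend.get_observation(t + 1)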

For more information on the API of the real robot, see our publication TriFinger: An Open-Source Robot for Learning Dexterity.