Cartpole

Cart-pole task adapted from: https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py

class ivy_gym.CartPole

Bases: Env

__init__()

Initialize the CartPole environment.

action_space: Space[ActType]

close()

Close the environment.

get_observation()

Get the current observation from the environment.

Returns

ret – observation array

get_reward()

Get the reward based on the current state.

Returns

ret – Reward array

get_state()

Get the current state of the environment.

Returns

ret – x, x velocity, angle, and angular velocity arrays

metadata: Dict[str, Any] = {'render.modes': ['human', 'rgb_array'], 'video.frames_per_second': 30}

observation_space: Space[ObsType]

render(mode='human')

Renders the environment. The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.

  • rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.

  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Parameters

mode – Render mode, one of [human|rgb_array], default human

Returns

ret – Rendered image.
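
The mode convention above can be sketched as a small dispatch function. This is only an illustration, not ivy_gym's renderer: render_sketch, SUPPORTED_MODES, and the blank frame are hypothetical names and data, and a nested list stands in for the numpy.ndarray a real environment would return.

```python
# Hypothetical stand-in for the render-mode convention; not ivy_gym code.
SUPPORTED_MODES = ("human", "rgb_array")  # mirrors metadata['render.modes']


def render_sketch(mode="human", height=4, width=6):
    """Dispatch on the render mode following the documented convention."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported render mode: {mode!r}")
    # A blank (all-black) RGB frame; nested lists stand in for an ndarray.
    frame = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    if mode == "human":
        # A real environment would draw `frame` to a window here.
        return None
    return frame
```

Under these assumptions, a mode outside metadata['render.modes'] (such as ansi) is rejected rather than silently ignored.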

reset()

Resets the environment to an initial state and returns the initial observation.

This method can reset the environment’s random number generator(s) if seed is an integer or if the environment has not yet initialized a random number generator. If the environment already has a random number generator and reset() is called with seed=None, the RNG should not be reset. Moreover, reset() should (in the typical use case) be called with an integer seed right after initialization and then never again.

Parameters

seed (optional int) – The seed that is used to initialize the environment’s PRNG. If the environment does not already have a PRNG and seed=None (the default option) is passed, a seed will be chosen from some source of entropy (e.g. timestamp or /dev/urandom). However, if the environment already has a PRNG and seed=None is passed, the PRNG will not be reset. If you pass an integer, the PRNG will be reset even if it already exists. Usually, you want to pass an integer right after the environment has been initialized and then never again.

options (optional dict) – Additional information to specify how the environment is reset (optional, depending on the specific environment).

Returns

observation (object) – Observation of the initial state. This will be an element of observation_space (typically a numpy array) and is analogous to the observation returned by step().

info (dictionary) – Auxiliary information complementing observation. It should be analogous to the info returned by step().
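
The seeding rule described for reset() can be illustrated with a toy class. This is a sketch under stated assumptions, not ivy_gym code: SeededEnvSketch is a hypothetical stand-in, Python's random.Random plays the role of the environment's PRNG, and the four small uniform values are placeholder observations. It also returns only the observation, without the info dictionary.

```python
import random


class SeededEnvSketch:
    """Hypothetical stand-in showing only the seeding rule for reset()."""

    def __init__(self):
        self._rng = None  # no PRNG exists until the first reset()

    def reset(self, seed=None):
        if seed is not None:
            # An integer seed always (re)creates the PRNG.
            self._rng = random.Random(seed)
        elif self._rng is None:
            # No PRNG yet and no seed: seed from system entropy.
            self._rng = random.Random()
        # Otherwise (seed=None and a PRNG exists) the PRNG is left untouched.
        return [self._rng.uniform(-0.05, 0.05) for _ in range(4)]


env = SeededEnvSketch()
first = env.reset(seed=0)                 # seed once, right after construction
second = env.reset()                      # continues the same random stream
replay = SeededEnvSketch().reset(seed=0)  # same seed reproduces `first`
```

Seeding once and then calling reset() without a seed keeps episodes varied while the whole run stays reproducible from that single integer.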

set_state(state)

Set the current state of the environment.

Parameters

state – tuple of x, x_velocity, angle, and angular_velocity

Returns

ret – observation array
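
A typical use of get_state() and set_state() is to snapshot the environment and restore it later, for example when trying out candidate actions. The class below is a hypothetical stand-in that mirrors only the documented signatures. It uses plain floats rather than arrays, and it assumes the observation is simply the raw state, which may not match ivy_gym.

```python
class StateHolderSketch:
    """Hypothetical stand-in mirroring the documented state accessors."""

    def __init__(self):
        self.x, self.x_vel, self.angle, self.angle_vel = 0.0, 0.0, 0.0, 0.0

    def get_observation(self):
        # Assumption: the observation is simply the raw state values.
        return [self.x, self.x_vel, self.angle, self.angle_vel]

    def get_state(self):
        return self.x, self.x_vel, self.angle, self.angle_vel

    def set_state(self, state):
        # Unpack the documented (x, x_velocity, angle, angular_velocity) tuple.
        self.x, self.x_vel, self.angle, self.angle_vel = state
        return self.get_observation()


env = StateHolderSketch()
snapshot = env.get_state()             # save the current state
env.set_state((1.0, 0.5, 0.1, -0.2))   # jump to a different state
obs = env.set_state(snapshot)          # restore the saved state
```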

step(action)

Parameters

action
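
This page does not describe what step(action) computes. As a rough reference, the upstream Gym cart-pole linked at the top advances the state with one explicit Euler update, and the sketch below reproduces that update in plain Python. The constants and the function name euler_step are assumptions modeled on the upstream file, not on ivy_gym, whose dynamics, action format, and return values may differ.

```python
import math

# Constants assumed from the upstream Gym cart-pole; ivy_gym may differ.
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_POLE_LENGTH = 0.5
POLE_MASS_LENGTH = MASS_POLE * HALF_POLE_LENGTH
TIME_STEP = 0.02  # seconds between state updates


def euler_step(state, force):
    """Advance (x, x_vel, angle, angle_vel) by one explicit Euler step."""
    x, x_vel, angle, angle_vel = state
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Intermediate term shared by both accelerations.
    temp = (force + POLE_MASS_LENGTH * angle_vel ** 2 * sin_a) / TOTAL_MASS
    angle_acc = (GRAVITY * sin_a - cos_a * temp) / (
        HALF_POLE_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_a ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS_LENGTH * angle_acc * cos_a / TOTAL_MASS
    # Positions use the old velocities; velocities use the new accelerations.
    return (
        x + TIME_STEP * x_vel,
        x_vel + TIME_STEP * x_acc,
        angle + TIME_STEP * angle_vel,
        angle_vel + TIME_STEP * angle_acc,
    )


# Pushing the cart to the right from rest tips the pole backwards.
pushed = euler_step((0.0, 0.0, 0.0, 0.0), force=10.0)
```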