Spaces#

Spaces define the valid format of observation and action spaces for an environment.

General Functions#

Each space implements the following functions:

gym.spaces.Space.sample(self) → gym.spaces.space.T_cov#: Randomly sample an element of this space. Can be uniform or non-uniform sampling based on boundedness of space.

gym.spaces.Space.contains(self, x) → bool#: Return boolean specifying if x is a valid member of this space.

property gym.spaces.Space.shape: Optional[Tuple[int, ...]]#: Return the shape of the space as an immutable property.

property gym.spaces.Space.dtype#: Return the data type of this space.

gym.spaces.Space.seed(self, seed: Optional[int] = None) → list#: Seed the PRNG of this space and possibly the PRNGs of subspaces.

gym.spaces.Space.to_jsonable(self, sample_n: Sequence[gym.spaces.space.T_cov]) → list#: Convert a batch of samples from this space to a JSONable data type.

gym.spaces.Space.from_jsonable(self, sample_n: list) → List[gym.spaces.space.T_cov]#: Convert a JSONable data type to a batch of samples from this space.
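
To make the interface above concrete, here is a minimal sketch of a space-like class in plain Python. It does not subclass gym.spaces.Space; the class name IntRange and its internals are hypothetical and only mirror the method names listed above.

```python
import random
from typing import List, Optional, Sequence


class IntRange:
    """Hypothetical stand-in for a space over the integers {0, ..., n-1}."""

    def __init__(self, n: int, seed: Optional[int] = None):
        self.n = n
        self._rng = random.Random(seed)

    @property
    def shape(self) -> tuple:
        # A single integer is a scalar, so the shape is empty.
        return ()

    def seed(self, seed: Optional[int] = None) -> list:
        # Re-seed the private PRNG and report the seed that was used.
        self._rng = random.Random(seed)
        return [seed]

    def sample(self) -> int:
        # Draw one element of the space.
        return self._rng.randrange(self.n)

    def contains(self, x) -> bool:
        # Membership test: x must be an int inside the range.
        return isinstance(x, int) and 0 <= x < self.n

    def to_jsonable(self, sample_n: Sequence[int]) -> list:
        # Plain ints are already JSON-serializable.
        return list(sample_n)

    def from_jsonable(self, sample_n: list) -> List[int]:
        return [int(x) for x in sample_n]


space = IntRange(4, seed=0)
batch = [space.sample() for _ in range(3)]
restored = space.from_jsonable(space.to_jsonable(batch))
```

Seeding the space twice with the same value reproduces the same sequence of samples, which is the main reason seed() exists.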

Box#

class gym.spaces.Box(low: typing.Union[typing.SupportsFloat, numpy.ndarray], high: typing.Union[typing.SupportsFloat, numpy.ndarray], shape: typing.Optional[typing.Sequence[int]] = None, dtype: typing.Type = <class 'numpy.float32'>, seed: typing.Optional[typing.Union[int, gym.utils.seeding.RandomNumberGenerator]] = None)#

A (possibly unbounded) box in \(\mathbb{R}^n\).

Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of \([a, b]\), \((-\infty, b]\), \([a, \infty)\), or \((-\infty, \infty)\).

There are two common use cases:

Identical bound for each dimension:

>>> Box(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
Box(3, 4)

Independent bound for each dimension:

>>> Box(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
Box(2,)

is_bounded(manner: str = 'both') → bool#

Checks whether the box is bounded in some sense.

Parameters: manner (str) – One of "both", "below", "above".
Returns: Whether the space is bounded in the requested manner
Raises: ValueError – If manner is not one of "both", "below", or "above"
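
The check described above can be sketched in plain Python as follows. This is an illustration of the logic, not the library's code: the function name is_bounded_sketch is hypothetical, and the bounds are passed as flat lists rather than arrays.

```python
import math
from typing import Sequence


def is_bounded_sketch(
    low: Sequence[float], high: Sequence[float], manner: str = "both"
) -> bool:
    # A side counts as bounded only if every coordinate has a finite bound.
    below = all(math.isfinite(v) for v in low)
    above = all(math.isfinite(v) for v in high)
    if manner == "both":
        return below and above
    if manner == "below":
        return below
    if manner == "above":
        return above
    raise ValueError("manner must be one of 'both', 'below', 'above'")


# One coordinate has no lower bound, so only the "above" check passes.
low = [0.0, -math.inf]
high = [1.0, 1.0]
results = {m: is_bounded_sketch(low, high, m) for m in ("both", "below", "above")}
```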

sample() → numpy.ndarray#

Generates a single random sample inside the Box.

In creating a sample of the box, each coordinate is sampled (independently) from a distribution that is chosen according to the form of the interval:

\([a, b]\) : uniform distribution
\([a, \infty)\) : shifted exponential distribution
\((-\infty, b]\) : shifted negative exponential distribution
\((-\infty, \infty)\) : normal distribution

Returns: A sampled value from the Box
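
The per-coordinate rule above can be sketched with the standard library alone. The function name sample_coordinate is hypothetical, and the unit rate of the exponential and the standard normal parameters are assumptions made for this illustration.

```python
import math
import random


def sample_coordinate(low: float, high: float, rng: random.Random) -> float:
    bounded_below = low > -math.inf
    bounded_above = high < math.inf
    if bounded_below and bounded_above:
        # [a, b]: uniform distribution
        return rng.uniform(low, high)
    if bounded_below:
        # [a, oo): exponential shifted to start at a
        return low + rng.expovariate(1.0)
    if bounded_above:
        # (-oo, b]: negative exponential shifted to end at b
        return high - rng.expovariate(1.0)
    # (-oo, oo): normal distribution
    return rng.gauss(0.0, 1.0)


rng = random.Random(0)
bounds = [(0.0, 1.0), (2.0, math.inf), (-math.inf, 5.0), (-math.inf, math.inf)]
point = [sample_coordinate(lo, hi, rng) for lo, hi in bounds]
```

Each coordinate is drawn independently, so a Box mixing bounded and unbounded dimensions simply applies the matching rule per dimension.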

Discrete#

class gym.spaces.Discrete(n: int, seed: Optional[Union[int, gym.utils.seeding.RandomNumberGenerator]] = None, start: int = 0)#

A space consisting of finitely many elements.

This class represents a finite subset of integers, more specifically a set of the form \(\{ a, a+1, \dots, a+n-1 \}\), where \(a\) is the start argument.

Example:

>>> Discrete(2)            # {0, 1}
>>> Discrete(3, start=-1)  # {-1, 0, 1}
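
The role of n and start can be illustrated with a short plain-Python sketch. The helper names below are hypothetical and do not come from the library.

```python
import random


def discrete_sample(n: int, start: int, rng: random.Random) -> int:
    # Pick an offset in {0, ..., n-1} and shift it by start.
    return start + rng.randrange(n)


def discrete_contains(x: int, n: int, start: int) -> bool:
    # Valid members are start, start+1, ..., start+n-1.
    return isinstance(x, int) and start <= x < start + n


rng = random.Random(0)
# Mirrors Discrete(3, start=-1), whose elements are {-1, 0, 1}.
draws = {discrete_sample(3, -1, rng) for _ in range(100)}
```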

MultiBinary#

class gym.spaces.MultiBinary(n: Union[numpy.ndarray, Sequence[int], int], seed: Optional[Union[int, gym.utils.seeding.RandomNumberGenerator]] = None)#

An n-shape binary space.

Elements of this space are binary arrays of a shape that is fixed during construction.

Example Usage:

>>> observation_space = MultiBinary(5)
>>> observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> observation_space = MultiBinary([3, 2])
>>> observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)

MultiDiscrete#

class gym.spaces.MultiDiscrete(nvec: typing.Union[numpy.ndarray, typing.List[int]], dtype=<class 'numpy.int64'>, seed: typing.Optional[typing.Union[int, gym.utils.seeding.RandomNumberGenerator]] = None)#

This represents the Cartesian product of arbitrary Discrete spaces.

It is useful to represent game controllers or keyboards where each key can be represented as a discrete action space.

Note

Some environment wrappers assume a value of 0 always represents the NOOP action.

For example, a Nintendo game controller can be conceptualized as 3 discrete action spaces:

Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4
Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

It can be initialized as MultiDiscrete([ 5, 2, 2 ])
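
As a rough illustration of what an element of such a space looks like, the following plain-Python sketch samples and validates actions for the controller above. The helper names are hypothetical, and plain lists stand in for arrays.

```python
import random
from typing import List, Sequence

NVEC = [5, 2, 2]  # arrow keys, button A, button B


def sample_multi_discrete(nvec: Sequence[int], rng: random.Random) -> List[int]:
    # Each component is drawn independently from {0, ..., n-1}.
    return [rng.randrange(n) for n in nvec]


def contains_multi_discrete(x: Sequence[int], nvec: Sequence[int]) -> bool:
    # Same length as nvec, and every component within its own range.
    return len(x) == len(nvec) and all(0 <= xi < n for xi, n in zip(x, nvec))


rng = random.Random(0)
action = sample_multi_discrete(NVEC, rng)
noop = [0, 0, 0]          # NOOP on every component
up_and_a = [1, 1, 0]      # UP on the arrow keys, button A pressed
```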

Dict#

class gym.spaces.Dict(spaces: Optional[Dict[str, gym.spaces.space.Space]] = None, seed: Optional[Union[dict, int, gym.utils.seeding.RandomNumberGenerator]] = None, **spaces_kwargs: gym.spaces.space.Space)#

A dictionary of Space instances.

Elements of this space are (ordered) dictionaries of elements from the constituent spaces.

Example usage:

>>> from gym.spaces import Dict, Discrete
>>> observation_space = Dict({"position": Discrete(2), "velocity": Discrete(3)})
>>> observation_space.sample()
OrderedDict([('position', 1), ('velocity', 2)])

Example usage [nested]:

>>> from gym.spaces import Box, Dict, Discrete, MultiBinary, MultiDiscrete
>>> Dict(
...     {
...         "ext_controller": MultiDiscrete([5, 2, 2]),
...         "inner_state": Dict(
...             {
...                 "charge": Discrete(100),
...                 "system_checks": MultiBinary(10),
...                 "job_status": Dict(
...                     {
...                         "task": Discrete(5),
...                         "progress": Box(low=0, high=100, shape=()),
...                     }
...                 ),
...             }
...         ),
...     }
... )

It can be convenient to use Dict spaces if you want to make complex observations or actions more human-readable. Usually, it will not be possible to use elements of this space directly in learning code. However, you can easily convert Dict observations to flat arrays by using a gym.wrappers.FlattenObservation wrapper. Similar wrappers can be implemented to deal with Dict actions.

Tuple#

class gym.spaces.Tuple(spaces: Iterable[gym.spaces.space.Space], seed: Optional[Union[int, List[int], gym.utils.seeding.RandomNumberGenerator]] = None)#

A tuple (more precisely: the Cartesian product) of Space instances.

Elements of this space are tuples of elements of the constituent spaces.

Example usage:

>>> from gym.spaces import Box, Discrete, Tuple
>>> observation_space = Tuple((Discrete(2), Box(-1, 1, shape=(2,))))
>>> observation_space.sample()
(0, array([0.03633198, 0.42370757], dtype=float32))

Utility Functions#

gym.spaces.utils.flatdim(space: gym.spaces.space.Space) → int#

Return the number of dimensions a flattened equivalent of this space would have.

Example usage:

>>> from gym.spaces import Dict, Discrete
>>> from gym.spaces.utils import flatdim
>>> space = Dict({"position": Discrete(2), "velocity": Discrete(3)})
>>> flatdim(space)
5

Parameters: space – The space whose flattened dimensionality is computed
Returns: The number of dimensions of the flattened space
Raises: NotImplementedError – if the space is not defined in gym.spaces.
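
The counting rule behind this function can be sketched without the library. The (kind, argument) tuples used to describe spaces below are a hypothetical notation invented for this illustration, as is the name flatdim_sketch.

```python
import math
from typing import Any, Tuple

Spec = Tuple[str, Any]  # hypothetical (kind, argument) description of a space


def flatdim_sketch(spec: Spec) -> int:
    kind, arg = spec
    if kind == "discrete":       # arg = n; one slot per element (one-hot)
        return arg
    if kind == "box":            # arg = shape; one slot per entry
        return math.prod(arg)
    if kind == "multibinary":    # arg = n; one slot per bit
        return arg
    if kind == "multidiscrete":  # arg = nvec; one one-hot block per component
        return sum(arg)
    if kind in ("dict", "tuple"):
        # Composite spaces add up the sizes of their subspaces.
        subs = arg.values() if kind == "dict" else arg
        return sum(flatdim_sketch(s) for s in subs)
    raise NotImplementedError(f"Unknown space kind: {kind}")


space = ("dict", {"position": ("discrete", 2), "velocity": ("discrete", 3)})
print(flatdim_sketch(space))  # 5
```

The result 5 matches the example above: two slots for position plus three for velocity.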

gym.spaces.utils.flatten_space(space: gym.spaces.space.Space) → gym.spaces.box.Box#

Flatten a space into a single Box.

This is the space-level counterpart of flatten(): it operates on the space itself rather than on a sample. The result is always a Box with flat boundaries, and the box has exactly flatdim() dimensions. Flattening a sample of the original space produces an element of the flattened space.

Example:

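>>> from gym.spaces import Box, Dict, Discrete
>>> from gym.spaces.utils import flatten, flatten_space
>>> box = Box(0.0, 1.0, shape=(3, 4, 5))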
>>> box
Box(3, 4, 5)
>>> flatten_space(box)
Box(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True

Example that flattens a discrete space:

>>> discrete = Discrete(5)
>>> flatten_space(discrete)
Box(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True

Example that recursively flattens a dict:

>>> space = Dict({"position": Discrete(2), "velocity": Box(0, 1, shape=(2, 2))})
>>> flatten_space(space)
Box(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True

Parameters: space – The space to flatten
Returns: A flattened Box
Raises: NotImplementedError – if the space is not defined in gym.spaces.

gym.spaces.utils.flatten(space: gym.spaces.space.Space[gym.spaces.utils.T], x: gym.spaces.utils.T) → numpy.ndarray#

gym.spaces.utils.flatten(space: gym.spaces.multi_binary.MultiBinary, x) → numpy.ndarray

gym.spaces.utils.flatten(space: gym.spaces.box.Box, x) → numpy.ndarray

gym.spaces.utils.flatten(space: gym.spaces.discrete.Discrete, x) → numpy.ndarray

gym.spaces.utils.flatten(space: gym.spaces.multi_discrete.MultiDiscrete, x) → numpy.ndarray

gym.spaces.utils.flatten(space: gym.spaces.tuple.Tuple, x) → numpy.ndarray

gym.spaces.utils.flatten(space: gym.spaces.dict.Dict, x) → numpy.ndarray

Flatten a data point from a space.

This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.

Parameters

space – The space that x is flattened by
x – The value to flatten

Returns

The flattened x, which is always a 1D array.

Raises

NotImplementedError – If the space is not defined in gym.spaces.
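
To show what flattening does to a structured point, here is a plain-Python sketch. It assumes a Discrete value (with start 0) becomes a one-hot vector, a Box value is unrolled in row-major order, and a Dict concatenates its parts in the order the subspaces are stored. The (kind, argument) notation and the names ravel and flatten_sketch are hypothetical, and lists stand in for arrays.

```python
from typing import Any, List


def ravel(value: Any) -> List[float]:
    """Flatten arbitrarily nested lists of numbers in row-major order."""
    if isinstance(value, (list, tuple)):
        out: List[float] = []
        for item in value:
            out.extend(ravel(item))
        return out
    return [float(value)]


def flatten_sketch(spec, x) -> List[float]:
    kind, arg = spec
    if kind == "discrete":
        # One-hot encoding of x among arg possible values (start assumed 0).
        onehot = [0.0] * arg
        onehot[x] = 1.0
        return onehot
    if kind == "box":
        return ravel(x)
    if kind == "dict":
        # Concatenate the flattened parts, one subspace after another.
        out: List[float] = []
        for key, sub in arg.items():
            out.extend(flatten_sketch(sub, x[key]))
        return out
    raise NotImplementedError(f"Unknown space kind: {kind}")


space = ("dict", {"position": ("discrete", 2), "velocity": ("box", (2, 2))})
point = {"position": 1, "velocity": [[0.1, 0.2], [0.3, 0.4]]}
flat = flatten_sketch(space, point)
print(flat)  # [0.0, 1.0, 0.1, 0.2, 0.3, 0.4]
```

The output has six entries: two for the one-hot position and four for the 2x2 velocity.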

gym.spaces.utils.unflatten(space: gym.spaces.space.Space[gym.spaces.utils.T], x: numpy.ndarray) → gym.spaces.utils.T#

Unflatten a data point from a space.

This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.

Parameters

space – The space used to unflatten x
x – The array to unflatten

Returns

A point with a structure that matches the space.

Raises

NotImplementedError – if the space is not defined in gym.spaces.
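
The inverse direction can be sketched in plain Python for the simple case of a Dict whose entries are all Discrete spaces. This assumes the flat vector is a concatenation of one-hot blocks in the order of the subspaces; the helper names and the sizes mapping are hypothetical.

```python
from typing import Dict, List, Sequence


def unflatten_discrete(onehot: Sequence[float], start: int = 0) -> int:
    # The position of the first non-zero entry identifies the element.
    return start + next(i for i, v in enumerate(onehot) if v != 0)


def unflatten_dict_of_discrete(
    sizes: Dict[str, int], flat: Sequence[float]
) -> Dict[str, int]:
    # Walk the flat vector, cutting out one block of length n per key.
    out: Dict[str, int] = {}
    offset = 0
    for key, n in sizes.items():
        out[key] = unflatten_discrete(flat[offset:offset + n])
        offset += n
    return out


sizes = {"position": 2, "velocity": 3}
flat: List[float] = [0.0, 1.0, 0.0, 0.0, 1.0]
point = unflatten_dict_of_discrete(sizes, flat)
print(point)  # {'position': 1, 'velocity': 2}
```

The block boundaries depend entirely on the space description, which is why the same space must be used for flattening and unflattening.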