*, index: int = Ellipsis, format: str = 'rgb24', filter_sequence: Optional[List[Tuple[str, Union[str, dict]]]] = None, filter_graph: Optional[Tuple[dict, List]] = None, constant_framerate: Optional[bool] = None, thread_count: int = 0, thread_type: Optional[str] = None) → ndarray

Read frames from the video.

If index is an integer, this function reads the index-th frame from the file. If index is … (Ellipsis), this function reads all frames from the video, stacks them along the first dimension, and returns a batch of frames.
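The indexing semantics can be sketched with a small stand-in, assuming each decoded frame is an HxWx3 numpy array. `read_frames` here is a hypothetical helper for illustration, not the plugin's actual implementation:

```python
import numpy as np

def read_frames(frames, index=...):
    """Hypothetical stand-in for the index semantics described above."""
    if index is Ellipsis:
        # index=... returns a batch: all frames stacked along a new,
        # prepended dimension.
        return np.stack(frames, axis=0)
    # An integer index returns just that one frame.
    return frames[index]

# Three dummy 4x4 RGB "frames"
frames = [np.full((4, 4, 3), i, dtype=np.uint8) for i in range(3)]

batch = read_frames(frames, index=...)   # shape (3, 4, 4, 3)
single = read_frames(frames, index=1)    # shape (4, 4, 3)
```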


index : int

The index of the frame to read, e.g. index=5 reads the 5th frame. If ..., read all frames in the video and stack them along a new, prepended batch dimension.


format : str

Set the returned colorspace. If not None (default: rgb24), convert the data into the given format before returning it. If None, return the data in the encoded format if it can be expressed as a strided array; otherwise raise an Exception.

filter_sequence : List[Tuple[str, Union[str, dict]]]

If not None, apply the given sequence of FFmpeg filters to each ndimage. Check the (module-level) plugin docs for details and examples.
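For illustration, a sequence matching the documented type might look like the following. The filter names are ordinary FFmpeg filters; whether a given build supports them is an assumption:

```python
# Each entry is (filter_name, argument); the argument is either a
# positional-argument string or a dict of named arguments.
filter_sequence = [
    ("scale", "640:480"),        # resize, configured via a string argument
    ("framestep", {"step": 2}),  # keep every 2nd frame, via a dict argument
]
```

Passed via the filter_sequence parameter, each filter would be applied in order to every frame.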

filter_graph : Tuple[dict, List]

If not None, apply the given graph of FFmpeg filters to each ndimage. The graph is given as a tuple: the first element is a dict containing a (named) set of nodes, and the second is a list of edges between nodes of that dict. Check the (module-level) plugin docs for details and examples.
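As a sketch of that shape (the node layout, the "video_in"/"video_out" endpoints, and the pad indices in the edges are illustrative assumptions; consult the module-level plugin docs for the authoritative format):

```python
# Nodes: map a chosen name to (filter_name, argument).
nodes = {
    "scale": ("scale", "640:480"),
    "mirror": ("hflip", ""),
}
# Edges: connect named nodes. The "video_in"/"video_out" endpoints and
# the two trailing pad indices are assumptions for illustration.
edges = [
    ("video_in", "scale", 0, 0),
    ("scale", "mirror", 0, 0),
    ("mirror", "video_out", 0, 0),
]
filter_graph = (nodes, edges)
```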


constant_framerate : bool

If True, assume the video's framerate is constant. This allows faster seeking inside the file. If False, the video is reset before each read and searched from the beginning. If None (default), this value is read from the container format.


thread_count : int

How many threads to use when decoding a frame. The default is 0, which lets ffmpeg choose the number based on the codec, the number of available cores, the threading model, and other considerations.


thread_type : str

The threading model to be used. One of:

  • “SLICE”: threads assemble parts of the current frame

  • “FRAME”: threads may assemble future frames

  • None (default): uses “FRAME” if index=... and ffmpeg’s default otherwise.


Returns

frame : np.ndarray

A numpy array containing the loaded frame data.


Accessing random frames repeatedly is costly (O(k), where k is the average distance between two keyframes), so do it sparingly if possible. In some cases it can be faster to bulk-read the video (if it fits into memory) and then access the returned ndarray randomly.

The current implementation may cause problems for b-frames, i.e., bidirectionally predicted pictures. I lack test videos to write unit tests for this case.

Reading from an index other than ..., i.e. reading a single frame, currently doesn’t support filters that introduce delays.