Dask Executor

flowdapt.compute.executor.dask.DaskExecutor

Bases: Executor

An Executor built on top of Dask. It supports both local and distributed execution, can be configured to use GPUs, and is often the best choice for larger-than-memory workloads.

To use this Executor, set the target under services > compute > executor in your configuration, for example:

```yaml
services:
  compute:
    executor:
      target: flowdapt.compute.executor.dask.DaskExecutor
```
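The remaining parameters documented below can be set alongside `target`. A sketch of a more complete local configuration follows; the specific values are illustrative, not recommendations:

```yaml
services:
  compute:
    executor:
      target: flowdapt.compute.executor.dask.DaskExecutor
      scheduler_host: "127.0.0.1"   # default
      scheduler_port: 6684          # default
      cpus: 4                       # or "auto" to use all CPUs on the machine
      memory: "8GB"                 # or "auto" to use all available memory
      adaptive: false               # enable for adaptive scaling
```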

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `cluster_address` | `str \| None` | The address of an existing Dask cluster to connect to. | `None` |
| `scheduler_host` | `str` | The host to start the scheduler on. | `'127.0.0.1'` |
| `scheduler_port` | `int` | The port to start the scheduler on. | `6684` |
| `scheduler_scheme` | `str` | The protocol to use for the scheduler. | `'tcp'` |
| `dashboard_port` | `int` | The port to start the dashboard on. | `9968` |
| `cpus` | `int \| Literal['auto']` | The number of CPUs to use. If set to `"auto"`, uses the number of CPUs on the machine. | `'auto'` |
| `gpus` | `int \| str` | The number of GPUs to use, or a comma-delimited list of device IDs. If set to `"auto"`, uses the number of GPUs on the machine. | `'disabled'` |
| `threads` | `int \| Literal['auto']` | The number of threads to use. Defaults to all available. | `'auto'` |
| `memory` | `str` | The amount of memory to use. Defaults to all available. | `'auto'` |
| `adaptive` | `bool` | Whether to use adaptive scaling. | `False` |
| `resources` | `dict[str, float]` | A dictionary of custom resource labels for the cluster. These labels are logical, not physical, and are used to determine whether a stage can run on a worker. Base resources such as CPUs and memory are added automatically. | `{}` |
| `pip` | `list[str]` | A list of pip packages to install on the workers. | `[]` |
| `env_vars` | `dict[str, str]` | A dictionary of environment variables to set on the workers. | `{}` |
| `upload_plugins` | `bool` | Whether to upload plugins to the workers. Ignored when running an in-process cluster. | `False` |