
API Reference

Main entry points

meadowrun.run_function(function, host, resources=None, deployment=None, args=None, kwargs=None, sidecar_containers=None, ports=None, wait_for_result=True) async

Runs function on a remote machine, specified by "host".

Parameters:

Name Type Description Default
function Union[Callable[..., _T], str]

A reference to a function (e.g. package.module.function_name), a lambda, or a string like "package.module.function_name" (which is useful if the function cannot be referenced in the current environment but can be referenced in the deployed environment)

required
host Host

Specifies where to run the function. See Host and derived classes.

required
resources Optional[Resources]

Specifies the resources (e.g. CPU, RAM) needed by the function. For some hosts, this is optional, for other hosts it is required. See Resources.

None
deployment Union[Deployment, Awaitable[Deployment], None]

See Deployment. Specifies the environment (code and libraries) that are needed to run this command. This can be an actual Deployment object, or it can be an Awaitable that will produce a Deployment object. The default, None, is equivalent to mirror_local

None
args Optional[Sequence[Any]]

Passed to the function like function(*args)

None
kwargs Optional[Dict[str, Any]]

Passed to the function like function(**kwargs)

None
sidecar_containers Union[Iterable[ContainerInterpreterBase], ContainerInterpreterBase, None]

Additional containers that will be available from the main job as sidecar-container-0, sidecar-container-1, etc.

None
ports Union[Iterable[str], str, Iterable[int], int, None]

A specification of ports to make available on the machine that runs this job. E.g. 8000, "8080-8089" (inclusive). Ports will be opened just for the duration of this job. Be careful as other jobs could be running on the same machine at the same time!

None
wait_for_result bool

If this is set to False, we will run in "fire and forget" mode, which kicks off the function and doesn't wait for it to return.

True

Returns:

Type Description
_T

If wait_for_result is True (which is the default), the return value will be the result of calling function. If wait_for_result is False, the return value will always be None.
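
For example, a minimal sketch (the lambda and resource amounts are illustrative, and assume Meadowrun is installed in your AWS account):

import asyncio
import meadowrun

result = asyncio.run(
    meadowrun.run_function(
        lambda: sum(range(1000)),
        meadowrun.AllocEC2Instance(),
        meadowrun.Resources(logical_cpu=1, memory_gb=2, max_eviction_rate=80),
    )
)
print(result)  # 499500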

meadowrun.run_command(args, host, resources=None, deployment=None, context_variables=None, sidecar_containers=None, ports=None, wait_for_result=True) async

Runs the specified command on a remote machine

Parameters:

Name Type Description Default
args Union[str, Sequence[str]]

Specifies the command to run, can be a string (e.g. "jupyter nbconvert --to html analysis.ipynb") or a list of strings (e.g. ["jupyter", "nbconvert", "--to", "html", "analysis.ipynb"])

required
host Host

Specifies where to run the function. See Host and derived classes.

required
resources Optional[Resources]

Specifies the resources (e.g. CPU, RAM) needed by the command. For some hosts, this is optional, for other hosts it is required. See Resources.

None
deployment Union[Deployment, Awaitable[Deployment], None]

See Deployment. Specifies the environment (code and libraries) that are needed to run this command. This can be an actual Deployment object, or it can be an Awaitable that will produce a Deployment object. The default, None, is equivalent to mirror_local

None
context_variables Optional[Dict[str, Any]]

Experimental feature

None
sidecar_containers Union[Iterable[ContainerInterpreterBase], ContainerInterpreterBase, None]

Additional containers that will be available from the main job as sidecar-container-0, sidecar-container-1, etc.

None
ports Union[Iterable[str], str, Iterable[int], int, None]

A specification of ports to make available on the machine that runs this job. E.g. 8000, "8080-8089" (inclusive). Ports will be opened just for the duration of this job. Be careful as other jobs could be running on the same machine at the same time!

None
wait_for_result bool

If this is set to False, we will run in "fire and forget" mode, which kicks off the command and doesn't wait for it to return.

True

Returns:

Type Description
JobCompletion[None]

A JobCompletion object that contains metadata about the running of the job.
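
For example, a sketch (the command is illustrative; analysis.ipynb must exist in the deployed environment):

import asyncio
import meadowrun

job_completion = asyncio.run(
    meadowrun.run_command(
        "jupyter nbconvert --to html analysis.ipynb",
        meadowrun.AllocEC2Instance(),
        meadowrun.Resources(logical_cpu=1, memory_gb=2, max_eviction_rate=80),
    )
)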

meadowrun.run_map(function, args, host, resources_per_task=None, deployment=None, num_concurrent_tasks=None, sidecar_containers=None, ports=None, wait_for_result=True, max_num_task_attempts=1, retry_with_more_memory=False) async

Equivalent to map(function, args), but runs distributed and in parallel.

Parameters:

Name Type Description Default
function Callable[[_T], _U]

A reference to a function (e.g. package.module.function_name) or a lambda

required
args Sequence[_T]

A list of objects, each item in the list represents a "task", where each "task" is an invocation of function on the item in the list

required
resources_per_task Optional[Resources]

The resources (e.g. CPU and RAM) required to run a single task. For some hosts, this is optional, for other hosts it is required. See Resources.

None
host Host

Specifies where to get compute resources from. See Host and derived classes.

required
num_concurrent_tasks Optional[int]

The number of workers to launch. This can be less than or equal to the number of args/tasks. If set to None, defaults to half the total number of tasks plus one, rounded down.

None
deployment Union[Deployment, Awaitable[Deployment], None]

See Deployment. Specifies the environment (code and libraries) that are needed to run this command. This can be an actual Deployment object, or it can be an Awaitable that will produce a Deployment object. The default, None, is equivalent to mirror_local

None
sidecar_containers Union[Iterable[ContainerInterpreterBase], ContainerInterpreterBase, None]

Additional containers that will be available from the main job as sidecar-container-0, sidecar-container-1, etc.

None
ports Union[Iterable[str], str, Iterable[int], int, None]

A specification of ports to make available on the machines that run tasks for this job. E.g. 8000, "8080-8089" (inclusive). Ports will be opened just for the duration of this job. Be careful as other jobs could be running on the same machine at the same time!

None
wait_for_result bool

If this is set to False, we will run in "fire and forget" mode, which kicks off the tasks and doesn't wait for them to return.

True
max_num_task_attempts int

If this is set to more than 1, tasks that fail will be retried. If this parameter is e.g. 3, a task that fails will be attempted a total of 3 times.

1
retry_with_more_memory bool

This is an experimental feature and the API will likely change. If this is set to True, when a task fails, if the task at some point used more than 95% of the requested memory, the task will be retried with more memory. Each attempt will be allocated (original requested memory) * (attempt number).

False

Returns:

Type Description
Optional[Sequence[_U]]

If wait_for_result is True (which is the default), the return value will be the result of running function on each of args. If wait_for_result is False, the return value will always be None.
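
For example, a minimal sketch (the function, args, and resource amounts are illustrative):

import asyncio
import meadowrun

results = asyncio.run(
    meadowrun.run_map(
        lambda x: x ** 2,
        [1, 2, 3, 4],
        meadowrun.AllocEC2Instance(),
        resources_per_task=meadowrun.Resources(
            logical_cpu=1, memory_gb=2, max_eviction_rate=80
        ),
        num_concurrent_tasks=2,
    )
)
print(results)  # [1, 4, 9, 16]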

meadowrun.run_map_as_completed(function, args, host, resources_per_task=None, deployment=None, num_concurrent_tasks=None, sidecar_containers=None, ports=None, max_num_task_attempts=1, retry_with_more_memory=False) async

Equivalent to run_map, but returns results from tasks as they are completed as an AsyncIterable. This means that to access the results, you need to iterate using async for, and call result_or_raise on the returned TaskResult objects. Usage for approximating run_map behavior is:

sorted_tasks = sorted(
    [task async for task in run_map_as_completed(...)],
    key=lambda t: t.task_id
)
results = [task.result_or_raise() for task in sorted_tasks]

This will not have exactly the same behavior as run_map: run_map waits for all of the tasks to execute and then returns a list of results, ordered corresponding to how the args parameter was ordered, whereas run_map_as_completed returns results as they complete. For simple use cases, prefer run_map.

Parameters:

Name Type Description Default
function Callable[[_T], _U]

A reference to a function (e.g. package.module.function_name) or a lambda

required
args Sequence[_T]

A list of objects, each item in the list represents a "task", where each "task" is an invocation of function on the item in the list

required
resources_per_task Optional[Resources]

The resources (e.g. CPU and RAM) required to run a single task. For some hosts, this is optional, for other hosts it is required. See Resources.

None
host Host

Specifies where to get compute resources from. See Host and derived classes.

required
num_concurrent_tasks Optional[int]

The number of workers to launch. This can be less than or equal to the number of args/tasks. If set to None, defaults to half the total number of tasks plus one, rounded down.

None
deployment Union[Deployment, Awaitable[Deployment], None]

See Deployment. Specifies the environment (code and libraries) that are needed to run this command. This can be an actual Deployment object, or it can be an Awaitable that will produce a Deployment object. The default, None, is equivalent to mirror_local

None
sidecar_containers Union[Iterable[ContainerInterpreterBase], ContainerInterpreterBase, None]

Additional containers that will be available from the main job as sidecar-container-0, sidecar-container-1, etc.

None
ports Union[Iterable[str], str, Iterable[int], int, None]

A specification of ports to make available on the machines that run tasks for this job. E.g. 8000, "8080-8089" (inclusive). Ports will be opened just for the duration of this job. Be careful as other jobs could be running on the same machine at the same time!

None
max_num_task_attempts int

If this is set to more than 1, tasks that fail will be retried. If this parameter is e.g. 3, a task that fails will be attempted a total of 3 times.

1
retry_with_more_memory bool

This is an experimental feature and the API will likely change. If this is set to True, when a task fails, if the task at some point used more than 95% of the requested memory, the task will be retried with more memory. Each attempt will be allocated (original requested memory) * (attempt number).

False

Returns:

Type Description
AsyncIterable[TaskResult[_U]]

An async iterable returning TaskResult objects.
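
For example, a sketch of consuming results as they complete, using the TaskResult attributes described below (the function and host are illustrative):

import asyncio
import meadowrun

async def main():
    async for task in meadowrun.run_map_as_completed(
        lambda x: x ** 2,
        [1, 2, 3, 4],
        meadowrun.AllocEC2Instance(),
        resources_per_task=meadowrun.Resources(
            logical_cpu=1, memory_gb=2, max_eviction_rate=80
        ),
    ):
        if task.is_success:
            print(task.task_id, task.result)
        else:
            print(task.task_id, "failed:", task.exception)

asyncio.run(main())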

meadowrun.TaskResult dataclass

Bases: Generic[_T]

The result of a run_map_as_completed task.

Attributes:

Name Type Description
task_id int

The index of the task as it was originally passed to run_map_as_completed.

is_success bool

True if the task completed successfully, False if the task raised an exception

result Optional[_T]

If is_success, the result of the task. Otherwise, None. See also result_or_raise

exception Optional[Tuple[str, str, str]]

If not is_success, a Tuple describing the exception that the task raised. Otherwise, None. See also result_or_raise.

attempt int

1-based number indicating which attempt of the task this is. 1 means first attempt, 2 means second attempt, etc.

result_or_raise()

Returns a successful task result, or raises a TaskException.

Raises:

Type Description
TaskException

if the task did not finish successfully.

Returns:

Name Type Description
_T _T

the unpickled result if the task finished successfully.

meadowrun.TaskException

Bases: Exception

Represents an exception that occurred in a task.

Specifying resource requirements

meadowrun.Resources dataclass

Specifies the requirements for a job or for each task within a job

Attributes:

Name Type Description
logical_cpu Optional[float]

Specifies logical CPU (aka vCPU) required. E.g. 2 means we require 2 logical CPUs

memory_gb Optional[float]

Specifies RAM required. E.g. 1.5 means we require 1.5 GB of RAM

max_eviction_rate float

Specifies what eviction rate (aka interruption probability) we're okay with as a percent. E.g. 80 means that any instance type with an eviction rate less than 80% can be used. Use 0 to indicate that only on-demand instances are acceptable (i.e. do not use spot instances)

gpus Optional[float]

Number of GPUs required. If gpu_memory is set, but this value is not set, this is implied to be 1

gpu_memory Optional[float]

Total GPU memory (aka VRAM) required across all GPUs

flags_required Union[Iterable[str], str, None]

E.g. "intel", "avx512", etc.

ephemeral_storage Optional[float]

GB of local storage (aka local disk). Currently only supported on Kubernetes
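
For example (the amounts are illustrative):

import meadowrun

# 2 logical CPUs, 4 GB of RAM, and only on-demand instances (no spot instances)
resources = meadowrun.Resources(logical_cpu=2, memory_gb=4, max_eviction_rate=0)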

Specifying hosts

meadowrun.Host

Bases: abc.ABC

Host is an abstract class for specifying where to run a job. See implementations below.

meadowrun.AllocVM

Bases: Host, abc.ABC

An abstract class that provides shared implementation for AllocEC2Instance and AllocAzureVM

meadowrun.AllocEC2Instance dataclass

Bases: AllocVM

Specifies that the job should be run on a dynamically allocated EC2 instance. Any existing Meadowrun-managed EC2 instances will be reused if available. If none are available, Meadowrun will launch the cheapest instance type that meets the resource requirements for a job.

resources_required must be provided with the AllocEC2Instance Host.

Attributes:

Name Type Description
region_name Optional[str]

Specifies the region name for EC2, e.g. "us-east-2". None will use the default region_name.

ami_id Union[str, None]

An AMI ID for EC2, e.g. "ami-006426834f282c3d7". This image must be available in the specified region. The AMI specified must be built off of the Meadowrun AMIs as Meadowrun expects certain python environments and folders to be available. See Use a custom AMI (machine image) on AWS

If this is not specified, Meadowrun will use the default Meadowrun-supplied images when launching new instances. If there are existing instances, the job will run on any instance, regardless of what image was used to launch it.

subnet_id Optional[str]

The subnet id for EC2, e.g. "subnet-02e4e2996d2fb96d9". The subnet must be in the specified region. If this is not specified, the default subnet in the default VPC will be used.

If you specify a subnet, you'll need to think about how to connect to the machine in your subnet. One option is to set your subnet to auto-assign IP addresses. If an instance does not have a public IP address, Meadowrun will try to connect via the private IP address. This will only work if the machine launching the job can access the private IP address. The default subnet in a new AWS account will be set to auto-assign IP addresses.

If your subnet does not have access to the internet, you'll need to make sure that any dependencies (e.g. pip packages, container images) that Meadowrun will try to download are available without internet access. The default subnet in a new AWS account will have the appropriate Route Tables and Internet Gateway configuration to be able to access the internet.

security_group_ids Union[str, Sequence[str], None]

A list of security group ids, e.g. "sg-0690853a8374b9b6d". The security group must be in the same VPC as the specified subnet_id (security groups are specific to each VPC), and it must allow you to SSH over port 22 to the machine from your current IP address. If this is not specified, Meadowrun will use a security group created at install time in the default VPC. Meadowrun's install step sets up a security group and opens port 22 for that security group for the current IP whenever Meadowrun is used. If you specify a subnet that is not in the default VPC, this parameter is required, as the default security group will not be available. Please also see the "ports" argument on the run_* commands.

iam_role_instance_profile Optional[str]

The name of an instance profile for an IAM role name (not to be confused with the IAM role itself!), e.g. "meadowrun_ec2_role_instance_profile". The EC2 instance will be launched under this IAM role. By default, Meadowrun will use an IAM role created at install time called meadowrun_ec2_role that has the permissions needed for a Meadowrun-managed EC2 instance. Any IAM role you specify must have a superset of the permissions granted by meadowrun_ec2_role. The easiest way to implement this is to attach the Meadowrun-generated "meadowrun_ec2_policy" to your IAM role (in addition to any custom policies you wish to add). Please also see Access resources from Meadowrun jobs
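
For example, a sketch (the region is illustrative and the commented-out ids are placeholders):

import meadowrun

host = meadowrun.AllocEC2Instance(
    region_name="us-east-2",
    # subnet_id="subnet-...",       # optional; see the notes above on connectivity
    # security_group_ids="sg-...",  # required if the subnet is not in the default VPC
)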

meadowrun.AllocAzureVM dataclass

Bases: AllocVM

Specifies that the job should be run on a dynamically allocated Azure VM. Any existing Meadowrun-managed VMs will be reused if available. If none are available, Meadowrun will launch the cheapest VM type that meets the resource requirements for a job.

resources_required must be provided with the AllocAzureVM Host.

Attributes:

Name Type Description
location Optional[str]

Specifies the location for the Azure VM, e.g. "eastus". None will use the default location.
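
For example (the location is illustrative):

import meadowrun

host = meadowrun.AllocAzureVM(location="eastus")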

meadowrun.AllocCloudInstance(cloud_provider, region_name=None)

This function is deprecated and will be removed in version 1.0.0. Please use AllocEC2Instance or AllocAzureVM directly.

meadowrun.Kubernetes dataclass

Bases: Host

Specifies a Kubernetes cluster to run a Meadowrun job on. resources_required is optional with the Kubernetes Host.

Attributes:

Name Type Description
storage_spec Optional[StorageBucketSpec]

Specifies the object storage system to use. See derived classes of StorageBucketSpec. This can only be omitted if you are only using run_command and you've specified a specific container image to run on (i.e. rather than an EnvironmentSpec of some sort)

kube_config_context Optional[str]

Specifies the kube config context to use. Default is None which means use the current context (i.e. kubectl config current-context)

kubernetes_namespace str

The Kubernetes namespace that Meadowrun will create Jobs in. This should usually not be left to the default value ("default") for any "real" workloads.

reusable_pods bool

When set to True, starts generic long-lived pods that can be reused for multiple jobs. When set to False, starts new pods for every job

pod_customization Optional[Callable[[kubernetes_client.V1PodTemplateSpec], kubernetes_client.V1PodTemplateSpec]]

A function like pod_customization(pod_template_spec) that will be called on the PodTemplateSpec just before we submit it to Kubernetes. You can make changes like specifying a serviceAccountName, adding ephemeral storage, etc. You can either modify the pod_template_spec argument in place and return it as the result, or construct a new V1PodTemplateSpec and return that.
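
For example, a sketch of a Kubernetes host backed by an S3-compatible bucket (all names are placeholders; the Kubernetes secret must already exist):

import meadowrun

host = meadowrun.Kubernetes(
    storage_spec=meadowrun.GenericStorageBucketSpec(
        bucket="meadowrun-bucket",
        endpoint_url="http://minio.storage.svc.cluster.local:9000",
        username_password_secret="minio-credentials",
    ),
    kubernetes_namespace="meadowrun",
)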

StorageBucketSpecs for Kubernetes

meadowrun.StorageBucketSpec

Bases: abc.ABC

An abstract class that specifies an object storage system that Meadowrun can use to send data back and forth from remote workers

meadowrun.GenericStorageBucketSpec dataclass

Bases: StorageBucketSpec

Specifies a bucket in an object storage system. The object storage must be S3-compatible and use username/password authentication. The arguments provided will be used along the lines of:

import boto3
boto3.Session(
    aws_access_key_id=username, aws_secret_access_key=password
).client(
    "s3", endpoint_url=endpoint_url
).download_file(
    Bucket=bucket, Key="test.file", Filename="test.file"
)

username and password should be the values provided by username_password_secret. (boto3 is built to be used with AWS S3, but it should work with any S3-compatible object store like Minio, Ceph, etc.)

Attributes:

Name Type Description
bucket str

The name of the bucket to use

endpoint_url str

The endpoint_url for the object storage system

endpoint_url_in_cluster Optional[str]

Defaults to None which means use endpoint_url. You can set this to a different URL if you need to use a different URL from inside the Kubernetes cluster to access the storage endpoint

username_password_secret Optional[str]

This should be the name of a Kubernetes secret that has a "username" and "password" key, where the username and password can be used to authenticate with the storage API.

meadowrun.GoogleBucketSpec dataclass

Bases: StorageBucketSpec

Specifies a bucket in Google Cloud Storage. Requires that credentials are available. This usually means that on the client you've logged in via the Google Cloud CLI, and in the Kubernetes cluster you're running with a service account (via the pod_customization parameter on Kubernetes) that has access to the specified bucket.

Attributes:

Name Type Description
bucket str

The name of the bucket to use

Specifying deployments

meadowrun.Deployment dataclass

container_image(repository, tag='latest', username_password_secret=None, environment_variables=None) classmethod

A deployment based on a docker container image

Parameters:

Name Type Description Default
repository str

The name of the docker container image repository, e.g. python or quay.io/minio/minio.

required
tag str

Combined with repository, will be used like {repository}:{tag}. Defaults to latest

'latest'
environment_variables Optional[Dict[str, str]]

e.g. {"PYTHONHASHSEED": "0"}. These environment variables will be set in the remote environment.

None
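
For example (the image is illustrative):

import meadowrun

deployment = meadowrun.Deployment.container_image(
    repository="python",
    tag="3.9-slim-bullseye",
    environment_variables={"PYTHONHASHSEED": "0"},
)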

container_image_at_digest(repository, digest, username_password_secret=None, environment_variables=None) classmethod

A deployment based on a docker container image

Parameters:

Name Type Description Default
repository str

The name of the docker container image repository, e.g. python or quay.io/minio/minio.

required
digest str

Combined with repository, will be used like {repository}@{digest}.

required
environment_variables Optional[Dict[str, str]]

e.g. {"PYTHONHASHSEED": "0"}. These environment variables will be set in the remote environment.

None

git_repo(repo_url, branch=None, commit=None, path_to_source=None, interpreter=None, environment_variables=None, ssh_key_secret=None, editable_install=True) classmethod

A deployment based on a git repo.

Parameters:

Name Type Description Default
repo_url str

e.g. "https://github.com/meadowdata/test_repo"

required
branch Optional[str]

defaults to "main" if neither branch nor commit are specified.

None
commit Optional[str]

can be provided instead of branch to use a specific commit hash, e.g. "d018b54"

None
path_to_source Optional[str]

e.g. "src/python" to use a subdirectory of the repo

None
interpreter Union[InterpreterSpecFile, ContainerInterpreterBase, PreinstalledInterpreter, None]

Specifies the python interpreter and libraries to use. Derived classes of InterpreterSpecFile allow you to specify a file (e.g. conda.yml or requirements.txt) in the git repo that can be used to build an environment. Derived classes of ContainerInterpreterBase allow you to specify a custom container image, and Meadowrun will not build/cache the environment for you. PreinstalledInterpreter specifies an interpreter that is already available on the remote machine.

None
environment_variables Optional[Dict[str, str]]

e.g. {"PYTHONHASHSEED": "0"}. These environment variables will be set in the remote environment

None
ssh_key_secret Optional[Secret]

A secret that contains the contents of a private SSH key that has read access to repo_url, e.g. AwsSecret("my_ssh_key"). See How to use a private git repo for AWS or Azure

None
editable_install bool

Whether to execute editable installs (e.g. pip install -e) in the project. If True (the default), editable installs are executed in the remote environment, which makes entry points, registered plugins, and metadata work. If False, editable installs are filtered out.

True

Returns:

Type Description
Deployment

A Deployment object that can be passed to the run_* functions.
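
For example, a sketch that builds the environment from a requirements.txt in the repo (the URL, branch, and paths are illustrative):

import meadowrun

deployment = meadowrun.Deployment.git_repo(
    repo_url="https://github.com/meadowdata/test_repo",
    branch="main",
    interpreter=meadowrun.PipRequirementsFile(
        path_to_requirements_file="requirements.txt",
        python_version="3.9",
    ),
)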

mirror_local(include_sys_path=True, additional_sys_paths=tuple(), include_sys_path_contents=True, interpreter=None, globs=None, environment_variables=None) classmethod async

A deployment that mirrors the local environment and code.

Parameters:

Name Type Description Default
include_sys_path bool

If True, syncs the sys.path variable to the remote machine. Usually used with include_sys_path_contents=True, which is responsible for syncing the contents of the folders specified by sys.path. Excludes the paths on sys.path which are part of the interpreter, e.g. site-packages, as this should be taken care of by the interpreter argument. It's also possible to use include_sys_path=True, include_sys_path_contents=False, and set globs to include code in the sys.path directories. Setting globs to upload code and setting include_sys_path=False will result in the uploaded code not being on sys.path (and therefore potentially inaccessible) on the remote machine.

True
additional_sys_paths Union[Iterable[str], str]

local code paths that will be treated as if they were on sys.path (see the include_sys_path parameter)

tuple()
include_sys_path_contents Union[bool, Iterable[str], str]

If True, uploads the python code in sys.path/additional_sys_paths, depending on how you set include_sys_path and additional_sys_paths. You can also set this parameter to an iterable of strings like [".py", ".so", ".txt"] to tell Meadowrun to copy the specified extensions rather than the default of just .py and .so files. If include_sys_path=False and additional_sys_paths is empty, this parameter is ignored. If this parameter is False, you most likely want to use globs to explicitly specify the files on sys.path that you want to upload to the remote machine.

True
interpreter Union[LocalInterpreter, InterpreterSpecFile, ContainerInterpreterBase, PreinstalledInterpreter, None]

Specifies the environment/interpreter to use. Defaults to None which will detect the currently activated env. Derived classes of LocalInterpreter allow you to specify a python interpreter that exists on the local machine to rebuild on the remote machine. Derived classes of InterpreterSpecFile allow you to specify a file (e.g. conda.yml or requirements.txt) that exists locally that can be used to build an environment. Derived classes of ContainerInterpreterBase allow you to specify a custom container image, and Meadowrun will not build/cache the environment for you. PreinstalledInterpreter specifies an interpreter that is already available on the remote machine.

None
globs Union[str, Iterable[str], None]

This parameter can be used for two main purposes.

One purpose is to include files from your current working directory that can then be accessed on the remote machine using relative paths. E.g. you can specify "foo/bar.txt", and then your remote code will be able to access that file via the relative path "foo/bar.txt". Other examples are: "*.txt" will specify txt files in your current directory (but not recursively). "**/*.txt" will specify all txt files in your current directory recursively (e.g. will capture both 1.txt and foo/2.txt). "foo/**/*.txt" will capture all txt files in the foo directory. Note that most of the time, your current working directory will be on sys.path so any .py and .so files in your local working directory will be included by default, but any other file extensions will be ignored by default.

The second purpose is to include an explicit list of files from sys.path. So for example, you can set include_sys_path_contents=False and then use this variable to pass an explicit list of files that you want to include from sys.path.

In either case, it is okay to specify absolute paths or relative paths, but if globs specifies files that are outside of the current working directory's parent and outside of any paths on sys.path/additional_sys_paths, this function will raise an exception because those files will be inaccessible in the remote process.

None
environment_variables Optional[Dict[str, str]]

e.g. {"PYTHONHASHSEED": "0"}. These environment variables will be set in the remote environment.

None

Returns:

Type Description
Deployment

A Deployment object that can be passed to the run_* functions.
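
For example, a sketch (the globs and environment variables are illustrative). Note that mirror_local is async, but the run_* functions accept the resulting Awaitable directly:

import asyncio
import meadowrun

result = asyncio.run(
    meadowrun.run_function(
        lambda: "hello",
        meadowrun.AllocEC2Instance(),
        meadowrun.Resources(logical_cpu=1, memory_gb=2, max_eviction_rate=80),
        deployment=meadowrun.Deployment.mirror_local(
            globs="config/**/*.yaml",
            environment_variables={"PYTHONHASHSEED": "0"},
        ),
    )
)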

preinstalled_interpreter(path_to_interpreter, environment_variables=None) classmethod

A deployment for using an interpreter that is already installed on the remote machine. This makes sense if e.g. you're using a custom AMI.

Parameters:

Name Type Description Default
path_to_interpreter str

The path to the python executable you want to use in the Meadowrun AMI. This will usually only make sense if you are specifying a custom AMI. There is also a MEADOWRUN_INTERPRETER constant that you can provide here that will tell Meadowrun to use the same interpreter that is running the Meadowrun agent. You should only use this if you don't care about the version of python or what libraries are installed.

required
environment_variables Optional[Dict[str, str]]

e.g. {"PYTHONHASHSEED": "0"}. These environment variables will be set in the remote environment.

None

Specifying interpreters

meadowrun.LocalInterpreter

Bases: abc.ABC

An abstract base class for specifying a python interpreter on the local machine

meadowrun.LocalCondaInterpreter dataclass

Bases: LocalInterpreter

Specifies a locally installed conda environment

Attributes:

Name Type Description
environment_name_or_path str

Either the name of a conda environment (e.g. my_env_name) or the full path to the folder of a conda environment (e.g. /home/user/miniconda3/envs/my_env_name). Will be passed to conda env export

additional_software Union[Sequence[str], str, None]

apt packages that need to be installed to make the environment work

meadowrun.LocalPipInterpreter dataclass

Bases: LocalInterpreter

Specifies a locally available interpreter. It can be a "regular" install of a python interpreter, a virtualenv, or anything based on pip.

Attributes:

Name Type Description
path_to_interpreter str

The path to the python executable. E.g. /home/user/my_virtual_env/bin/python

python_version str

A python version like "3.9" or "3.9.5". The version must be available on docker: https://hub.docker.com/_/python as python:<version>-slim-bullseye.

additional_software Union[Sequence[str], str, None]

apt packages that need to be installed to make the environment work
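
For example, a sketch that mirrors local code but builds the remote environment from a specific virtualenv (the paths are illustrative):

import meadowrun

deployment = meadowrun.Deployment.mirror_local(
    interpreter=meadowrun.LocalPipInterpreter(
        path_to_interpreter="/home/user/my_virtual_env/bin/python",
        python_version="3.9",
    )
)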


meadowrun.InterpreterSpecFile

Bases: abc.ABC

An abstract base class for specifying a Conda environment.yml file or a pip requirements.txt file

meadowrun.CondaEnvironmentFile dataclass

Bases: InterpreterSpecFile

Specifies a file that can be used to create a conda environment. The file can either be in the format generated by conda env export (often referred to as environment.yml files), or by conda list --explicit.

Attributes:

Name Type Description
path_to_file str

In the context of mirror_local, this is the path to a file on the local disk. In the context of git_repo, this is a path to a file in the git repo

file_format Literal["env_export", "list", None]

"env_export" specifies the conda env export format. "list" specifies the conda list --explicit format. None will auto-detect between these two formats

additional_software Union[Sequence[str], str, None]

apt packages that need to be installed to make the environment work

meadowrun.PipRequirementsFile dataclass

Bases: InterpreterSpecFile

Specifies a requirements.txt file generated by pip freeze.

Attributes:

Name Type Description
path_to_requirements_file str

In the context of mirror_local, this is the path to a file on the local disk. In the context of git_repo, this is a path to a file in the git repo

python_version str

A python version like "3.9" or "3.9.5". The version must be available on docker: https://hub.docker.com/_/python as python:<version>-slim-bullseye.

additional_software Union[Sequence[str], str, None]

apt packages that need to be installed to make the environment work

meadowrun.PoetryProjectPath dataclass

Bases: InterpreterSpecFile

Specifies a poetry project

Attributes:

Name Type Description
path_to_project str

In the context of mirror_local, this is the path to a folder on the local disk that contains pyproject.toml and poetry.lock. In the context of git_repo, this is a path to a folder in the git repo (use "" to indicate that pyproject.toml and poetry.lock are at the root of the repo).

python_version str

A python version like "3.9" or "3.9.5". The version must be available on docker: https://hub.docker.com/_/python as python:<version>-slim-bullseye. This python version must be compatible with the requirements in pyproject.toml

additional_software Union[Sequence[str], str, None]

apt packages that need to be installed to make the environment work


meadowrun.ContainerInterpreterBase

Bases: abc.ABC

An abstract base class for specifying a container as a python interpreter

meadowrun.ContainerInterpreter dataclass

Bases: ContainerInterpreterBase

Specifies a container image. The container image must be configured so that running docker run -it repository_name:tag python runs the right python interpreter.

Attributes:

Name Type Description
repository_name str

E.g. python, or <my_azure_registry>.azurecr.io/foo

tag str

E.g. 3.9-slim-bullseye or latest (the default)

username_password_secret Optional[Secret]

An AWS or Azure secret that has a username and password for connecting to the container registry (as specified or implied by repository_name). Only needed if the image/container registry is private.

always_use_local bool

If this is True, only looks for the image on the EC2 instance and does not try to download the image from a container registry. This will only work if you've preloaded the image into the AMI via Use a custom AMI on AWS

meadowrun.ContainerAtDigestInterpreter dataclass

Bases: ContainerInterpreterBase

Like ContainerInterpreter but specifies a digest instead of a tag. Running docker run -it repository_name@digest python must run the right python interpreter.

Attributes:

Name Type Description
repository_name str

E.g. python, or <my_azure_registry>.azurecr.io/foo

digest str

E.g. sha256:97725c608...

username_password_secret Optional[Secret]

An AWS or Azure secret that has a username and password for connecting to the container registry (as specified or implied by repository_name). Only needed if the image/container registry is private.

always_use_local bool

If this is True, only looks for the image on the EC2 instance and does not try to download the image from a container registry. This will only work if you've preloaded the image into the AMI via Use a custom AMI on AWS
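
For example, a sketch that uses a container image as the interpreter for a git repo deployment (the repo URL and image are illustrative):

import meadowrun

deployment = meadowrun.Deployment.git_repo(
    repo_url="https://github.com/meadowdata/test_repo",
    interpreter=meadowrun.ContainerInterpreter(
        repository_name="python",
        tag="3.9-slim-bullseye",
    ),
)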


meadowrun.PreinstalledInterpreter dataclass

Represents an interpreter that has been pre-installed on the remote machine. This is useful if you're using a custom AMI.

Attributes:

Name Type Description
path_to_interpreter str

The path to the python executable, e.g. /var/myenv/bin/python

Specifying secrets

meadowrun.Secret

Bases: abc.ABC

An abstract class for specifying a secret, e.g. a username/password or an SSH key

meadowrun.AwsSecret dataclass

Bases: Secret

An AWS secret

Attributes:

Name Type Description
secret_name str

The name of the secret (also sometimes called the id)

meadowrun.AzureSecret dataclass

Bases: Secret

An Azure secret

Attributes:

Name Type Description
secret_name str

The name of the secret

vault_name Optional[str]

The name of the Key Vault that the secret is in. Defaults to None, which implies the Meadowrun-managed Key Vault (mr)
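
For example, a sketch of using a secret to access a private git repo over SSH (the repo URL and secret name are placeholders; the secret must contain the contents of the private key):

import meadowrun

deployment = meadowrun.Deployment.git_repo(
    repo_url="git@github.com:my_org/my_private_repo.git",
    ssh_key_secret=meadowrun.AwsSecret("my_ssh_key"),
)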