A job is defined as any computational task that runs for a certain time to completion, such as model training or a data pipeline.
The workspace image can also be used to execute arbitrary Python code without starting any of the pre-installed tools. This provides a seamless way to productize your ML projects since the code that has been developed interactively within the workspace will have the same environment and configuration when run as a job via the same workspace image.
Run Python code as a job via the workspace image
To run Python code as a job, you need to provide a path or URL to a code directory (or script) via the EXECUTE_CODE environment variable. The code can either be already mounted into the workspace container or be downloaded from a version control system (e.g., git or svn), as described in the following sections. The selected code path needs to be executable by Python. If the selected code is a directory (e.g., whenever you download the code from a VCS), you need to put a __main__.py file at the root of this directory. The __main__.py needs to contain the code that starts your job.
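As a rough illustration of the entry point described above, a `__main__.py` at the root of the code directory could look like the following minimal sketch. The function names `run_job` and `main` and the toy workload are hypothetical placeholders, not something the workspace prescribes; the only requirement stated here is that the file starts your job.

```python
# __main__.py (placed at the root of the code directory)
# Hypothetical example: run_job and main are illustrative names only.


def run_job(numbers):
    """Stand-in for the real workload, e.g. model training or a pipeline step."""
    return sum(numbers) / len(numbers)


def main():
    """Start the job and report the result."""
    result = run_job([1.0, 2.0, 3.0])
    print(f"Job finished, mean = {result}")
    return 0


if __name__ == "__main__":
    main()
```

Keeping the workload in a function separate from the entry point makes it possible to import and test the same code interactively in the workspace before running it as a job.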
Run code from version control system
You can execute code directly from Git, Mercurial, Subversion, or Bazaar by using the pip-vcs format as described in the pip documentation on VCS support. For example, to execute code from a subdirectory of a git repository, just run:
docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:0.10.4
For additional information on how to specify branches, commits, or tags, please refer to the pip documentation on VCS support.
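To make the anatomy of such an EXECUTE_CODE value easier to see, the sketch below splits a pip-vcs style string into its VCS type, repository URL, optional revision, and subdirectory. The helper `describe_code_source` is hypothetical and only illustrates the string format; it is not how the workspace itself processes the value.

```python
from urllib.parse import parse_qs, urlsplit


def describe_code_source(execute_code):
    """Split an EXECUTE_CODE value into its parts (illustrative helper only)."""
    parts = urlsplit(execute_code)
    if "+" not in parts.scheme:
        # No "<vcs>+<transport>" prefix: treat the value as a local path.
        return {"kind": "path", "location": execute_code}

    vcs, transport = parts.scheme.split("+", 1)
    # An optional "@<branch|commit|tag>" suffix follows the repository path.
    repo_path, _, revision = parts.path.partition("@")
    # Options such as "subdirectory=..." are carried in the URL fragment.
    options = parse_qs(parts.fragment)
    return {
        "kind": "vcs",
        "vcs": vcs,
        "repository": f"{transport}://{parts.netloc}{repo_path}",
        "revision": revision or None,
        "subdirectory": options.get("subdirectory", [None])[0],
    }


if __name__ == "__main__":
    source = (
        "git+https://github.com/ml-tooling/ml-workspace.git"
        "#subdirectory=resources/tests/ml-job"
    )
    print(describe_code_source(source))
```

The same helper returns a `"path"` entry for values such as `/workspace/ml-job/`, which corresponds to code that is already mounted into the container.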
Run code mounted into the workspace
In the following example, we mount the current working directory (expected to contain our code) into the /workspace/ml-job/ directory of the workspace and execute it:
docker run -v "${PWD}:/workspace/ml-job/" --env EXECUTE_CODE="/workspace/ml-job/" mltooling/ml-workspace:0.10.4
Install Dependencies
If the pre-installed workspace libraries are not compatible with your code, you can install or change dependencies by adding one or more of the following files to your code directory:
requirements.txt: for pip-installable dependencies.
environment.yml: to create a separate Python environment.
setup.sh: a shell script executed via /bin/bash.
The execution order is 1. environment.yml -> 2. setup.sh -> 3. requirements.txt
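The ordering above can be sketched as a small check that lists which dependency files in a code directory would be processed, and in what sequence. This is a hypothetical helper (`planned_setup_steps` is not part of the workspace) that only mirrors the stated order; it does not install anything.

```python
from pathlib import Path

# Order as stated above: environment.yml, then setup.sh, then requirements.txt.
DEPENDENCY_FILES = ("environment.yml", "setup.sh", "requirements.txt")


def planned_setup_steps(code_dir):
    """Return the dependency files present in code_dir, in execution order."""
    root = Path(code_dir)
    return [name for name in DEPENDENCY_FILES if (root / name).is_file()]


if __name__ == "__main__":
    # Inspect the current directory, e.g. the root of your job code.
    print(planned_setup_steps("."))
```

Running this in the code directory before starting a job is a quick way to confirm that the files are named correctly and to see which step runs first.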
Test job in interactive mode
You can test your job code within the workspace (started normally with interactive tools) by executing the following Python script:
Build a custom job image
It is also possible to embed your code directly into a custom job image, as shown below:
FROM mltooling/ml-workspace:0.10.4
# Add job code to image
COPY ml-job /workspace/ml-job
ENV EXECUTE_CODE=/workspace/ml-job
# Install requirements only
RUN python /resources/scripts/execute_code.py --requirements-only
# Execute only the code at container startup
CMD ["python", "/resources/docker-entrypoint.py", "--code-only"]