Containers

Cocoa is distributed as Docker images pre-built for a range of GPU and CPU architectures. This is the easiest way to get started, since nothing has to be compiled from source. The same images also run under Singularity/Apptainer on HPC systems (see Running with Singularity / Apptainer below).

Note

Cocoa computes in double precision; a few bandwidth-sensitive fields are stored in single precision internally (see Numerical Methods). This is built in and requires no configuration, so there is a single build per architecture rather than separate precision variants.

Image Families and Naming

Cocoa images are split into three families by accelerator toolchain, identified by the tag suffix. <tag> is the Cocoa release (for example latest or a version such as 1.0):

Image	Hardware	Base
`cocoaorg/cocoa:<tag>-cpu`	CPU only (x86-64 and ARM64)	Ubuntu
`cocoaorg/cocoa:<tag>-cuda`	NVIDIA GPUs (CUDA)	`nvidia/cuda` (Ubuntu)
`cocoaorg/cocoa:<tag>-rocm`	AMD GPUs (ROCm/HIP)	`rocm/dev-ubuntu`

The families are kept separate because the CUDA and ROCm toolkits target different hardware and each is large: a given host has either NVIDIA or AMD GPUs, so a combined image would only add bloat. Splitting the CPU build out of the GPU images also keeps CPU and CI users from pulling a multi-gigabyte GPU toolkit they will never use.

The -cpu image is a single multi-architecture tag: it carries both linux/amd64 and linux/arm64 variants behind one name, and docker pull automatically selects the one matching your host. The ARM64 variant runs natively on Apple Silicon Macs through Docker Desktop, as well as on AWS Graviton, Ampere Altra, and NVIDIA Grace.
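If you need to name the platform explicitly (for example, to pull the ARM64 variant from an x86-64 machine), the mapping from machine type to Docker platform string can be sketched as below. The helper name docker_platform is hypothetical and not part of Cocoa; the platform strings are the two stated above.

```shell
# Hypothetical helper (not shipped with Cocoa): map a machine name, as
# reported by `uname -m`, to the Docker platform of the matching -cpu variant.
docker_platform() (
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64|amd64)  echo linux/amd64 ;;
        aarch64|arm64) echo linux/arm64 ;;
        *) echo "no -cpu variant for machine type: $machine" >&2
           exit 1 ;;
    esac
)
```

A pull with an explicit platform would then look like docker pull --platform "$(docker_platform arm64)" cocoaorg/cocoa:latest-cpu. Without --platform, Docker makes the same choice automatically.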

The -cuda and -rocm images are GPU-only: they do not include a CPU fallback build. Use the -cpu image to run on a node without a supported GPU.

The Trilinos base images follow the same convention: cocoaorg/trilinos_base:<tag>-cpu, -cuda, and -rocm.
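For scripts that pull or run several families, the naming convention above can be captured in a small helper. This is a minimal sketch; the function name cocoa_image is an assumption made for illustration and is not provided by Cocoa.

```shell
# Hypothetical helper (not shipped with Cocoa): compose an image reference
# from a release tag and a family suffix, following the naming table above.
cocoa_image() (
    tag=$1
    family=$2
    case "$family" in
        cpu|cuda|rocm)
            printf 'cocoaorg/cocoa:%s-%s\n' "$tag" "$family" ;;
        *)
            echo "unknown family: $family (expected cpu, cuda, or rocm)" >&2
            exit 1 ;;
    esac
)
```

For example, docker pull "$(cocoa_image latest cuda)" pulls cocoaorg/cocoa:latest-cuda, and an unrecognized family fails before Docker is invoked.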

Supported Architectures

CPU (-cpu image, x86-64 and ARM64):

Architecture	Backend	ISA tuning
`serial`	Single-threaded	Haswell (x86-64) / Neoverse-N1 (ARM64)
`openmp`	Multi-threaded	Haswell (x86-64) / Neoverse-N1 (ARM64)

NVIDIA GPUs (-cuda image):

Architecture	Hardware	Compute Capability
`volta70`	NVIDIA Volta (V100)	7.0
`turing75`	NVIDIA Turing (T4)	7.5
`ampere80`	NVIDIA Ampere (A100)	8.0
`ampere86`	NVIDIA Ampere (A10)	8.6
`ada89`	NVIDIA Ada (L40S)	8.9
`hopper90`	NVIDIA Hopper (H100)	9.0
`blackwell100`	NVIDIA Blackwell (B100)	10.0

AMD GPUs (-rocm image):

Architecture	Hardware	GFX ISA
`mi300`	AMD Instinct MI300X / MI300A	gfx942
`mi200`	AMD Instinct MI210 / MI250 / MI250X	gfx90a

Within an image, the variant name is simply the architecture name, e.g., ampere80, mi300, or openmp.
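To pick the variant for an NVIDIA card without consulting the table by hand, the compute-capability column can be turned into a lookup. The sketch below assumes a hypothetical helper named cuda_variant; the commented usage additionally assumes a driver whose nvidia-smi supports the compute_cap query field.

```shell
# Hypothetical helper (not shipped with Cocoa): map a CUDA compute
# capability such as "8.0" to the variant name in the NVIDIA table above.
cuda_variant() (
    case "$1" in
        7.0)  echo volta70 ;;
        7.5)  echo turing75 ;;
        8.0)  echo ampere80 ;;
        8.6)  echo ampere86 ;;
        8.9)  echo ada89 ;;
        9.0)  echo hopper90 ;;
        10.0) echo blackwell100 ;;
        *) echo "no variant for compute capability: $1" >&2
           exit 1 ;;
    esac
)

# Example usage on a host with an NVIDIA GPU (first GPU only):
#   cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   cuda_variant "$cc"
```

Cards whose compute capability is not in the table are rejected rather than mapped to a nearby variant, since the source does not say whether variants are forward-compatible.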

Running the Container

CPU (x86-64 or ARM64, auto-selected):

docker run -it -v $(pwd):/workspace cocoaorg/cocoa:latest-cpu

NVIDIA GPU (requires the NVIDIA Container Toolkit):

docker run -it --gpus all -v $(pwd):/workspace cocoaorg/cocoa:latest-cuda

AMD GPU (requires the ROCm kernel driver on the host):

docker run -it --device=/dev/kfd --device=/dev/dri \
    --group-add video --security-opt seccomp=unconfined \
    -v $(pwd):/workspace cocoaorg/cocoa:latest-rocm

The -v $(pwd):/workspace flag mounts your current directory into the container’s working directory so Cocoa can access your mesh and configuration files.

Selecting a Variant

Each image bundles the variants for its own family and selects a sensible default: the -cpu image defaults to serial, while the GPU images default to a representative architecture (ampere80 for -cuda, mi200 for -rocm). To switch at runtime, source the select_cocoa script; it has to be sourced rather than executed so that the change applies to your current shell:

# List the variants available in this image
source select_cocoa --help

# NVIDIA: select the A100 backend (in the -cuda image)
source select_cocoa ampere80

# AMD: select the MI300 backend (in the -rocm image)
source select_cocoa mi300

# CPU: select the multi-threaded build (in the -cpu image)
source select_cocoa openmp

# Verify selection
which cocoa

A variant exists only in the image for its family – for example ampere80 is present only in -cuda and mi300 only in -rocm. The selection persists for the duration of the shell session. To set it at launch, pass the COCOA_ARCH environment variable:

# Run on an A100
docker run -it --gpus all -e COCOA_ARCH=ampere80 \
    -v $(pwd):/workspace cocoaorg/cocoa:latest-cuda
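Because a variant exists only in its own family's image, a launch script can check COCOA_ARCH against the image name before starting the container. The following is a sketch under that assumption; both function names are hypothetical, and the variant lists are copied from the architecture tables above.

```shell
# Hypothetical helper: print the image family that contains a variant.
variant_family() (
    case "$1" in
        serial|openmp) echo cpu ;;
        volta70|turing75|ampere80|ampere86|ada89|hopper90|blackwell100)
            echo cuda ;;
        mi200|mi300) echo rocm ;;
        *) echo "unknown variant: $1" >&2
           exit 1 ;;
    esac
)

# Hypothetical helper: succeed only if the variant belongs to the family
# named by the image's tag suffix (-cpu, -cuda, or -rocm).
variant_matches_image() (
    variant=$1
    image=$2
    family=$(variant_family "$variant") || exit 1
    case "$image" in
        *-"$family") exit 0 ;;
        *) echo "$variant is not available in $image" >&2
           exit 1 ;;
    esac
)
```

Calling variant_matches_image "$COCOA_ARCH" cocoaorg/cocoa:latest-cuda before docker run catches a mismatch such as mi300 with the -cuda image on the host, where the error is easier to see.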

Running a Simulation

Once inside the container with the appropriate architecture selected:

cocoa -i your_config.yaml

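An example simulation is included in the container at /opt/cocoa/examples. The commands below select the serial variant, so they apply to the -cpu image; in a GPU image, select a variant from that image's family instead: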

cp -r /opt/cocoa/examples/* .
source select_cocoa serial
cocoa -i simple.yaml

See Quick Start for details on configuration files and expected output.

Converting ADCIRC Meshes

The container includes the cocoa_mesh_tools.py utility for converting ADCIRC model files to Cocoa’s NetCDF mesh format. Python 3, netCDF4, and numpy are pre-installed.

Basic mesh conversion (fort.14 only):

python3 /opt/cocoa/utils/cocoa_mesh_tools.py from_adcirc \
    --mesh fort.14 \
    --output mesh.nc

With nodal attributes (fort.13):

python3 /opt/cocoa/utils/cocoa_mesh_tools.py from_adcirc \
    --mesh fort.14 \
    --attributes fort.13 \
    --output mesh.nc

With self-attraction and loading (fort.24):

python3 /opt/cocoa/utils/cocoa_mesh_tools.py from_adcirc \
    --mesh fort.14 \
    --attributes fort.13 \
    --sal fort.24 \
    --output mesh.nc

Table 2 Conversion Script Options
Flag	Required	Description
`--mesh`	Yes	Path to ADCIRC fort.14 mesh file
`--output`	Yes	Path for output NetCDF file
`--attributes`	No	Path to ADCIRC fort.13 nodal attributes file
`--sal`	No	Path to self-attraction/loading file (fort.24 ASCII or NetCDF)

See Mesh Preparation for details on the NetCDF mesh format and supported nodal attributes.
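When converting many model directories, the three invocations above can be folded into one wrapper that adds the optional flags only for files that are present. This is a sketch, not part of cocoa_mesh_tools.py: the function name adcirc_convert_cmd is hypothetical, and it prints the command rather than running it so the result can be inspected first.

```shell
# Hypothetical wrapper: build the from_adcirc command for a directory of
# ADCIRC files. fort.14 is required; fort.13 and fort.24 are added only if
# they exist. The command is printed, not executed.
adcirc_convert_cmd() (
    dir=$1
    out=${2:-mesh.nc}
    set -- python3 /opt/cocoa/utils/cocoa_mesh_tools.py from_adcirc \
        --mesh "$dir/fort.14"
    if [ -f "$dir/fort.13" ]; then
        set -- "$@" --attributes "$dir/fort.13"
    fi
    if [ -f "$dir/fort.24" ]; then
        set -- "$@" --sal "$dir/fort.24"
    fi
    set -- "$@" --output "$out"
    echo "$*"
)
```

To execute the conversion instead of printing it, replace the final echo "$*" with "$@", which also preserves paths that contain spaces.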

Mounting Data Volumes

Mount your simulation directory into the container so input files are accessible and output files persist after the container exits:

# Mount a single directory
docker run -it --gpus all \
    -v /path/to/simulation:/workspace \
    cocoaorg/cocoa:latest-cuda

# Mount input and output separately
docker run -it --gpus all \
    -v /path/to/meshes:/data/meshes:ro \
    -v /path/to/output:/workspace \
    cocoaorg/cocoa:latest-cuda

Tip

Use :ro (read-only) for input data mounts to prevent accidental modification of source files.

Non-Interactive Execution

Run a simulation without entering the container interactively:

docker run --gpus all \
    -v $(pwd):/workspace \
    -e COCOA_ARCH=ampere80 \
    cocoaorg/cocoa:latest-cuda \
    cocoa -i config.yaml
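The same non-interactive form extends to a directory holding several configuration files. The sketch below is an illustration only: the helper name cocoa_batch_cmds is hypothetical, it assumes the -cuda image, and it assumes file names without spaces. It prints one docker run command per YAML file so the list can be reviewed before anything is launched.

```shell
# Hypothetical helper: print one non-interactive `docker run` command per
# *.yaml file in DIR. The directory is mounted at /workspace, so each config
# is referenced by its base name inside the container.
cocoa_batch_cmds() (
    dir=$1
    arch=${2:-ampere80}
    for cfg in "$dir"/*.yaml; do
        # Skip the literal pattern when no files match the glob.
        [ -e "$cfg" ] || continue
        printf 'docker run --rm --gpus all -v "%s":/workspace -e COCOA_ARCH=%s cocoaorg/cocoa:latest-cuda cocoa -i %s\n' \
            "$dir" "$arch" "$(basename "$cfg")"
    done
)
```

Running cocoa_batch_cmds "$PWD" hopper90 | sh would execute the printed commands one after another. The --rm flag, which is not used in the example above, removes each container when its run finishes.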

Running with Singularity / Apptainer

Many HPC clusters use Singularity (or its successor Apptainer) instead of Docker, since it runs unprivileged and integrates cleanly with schedulers such as SLURM. Singularity can pull and convert the same images directly from Docker Hub – no separate build or image format is required. The commands below use singularity; substitute apptainer if that is what your site provides (the two are command-line compatible).

1. Pull and convert the image from Docker Hub. Singularity fetches the Docker image and converts it into a single .sif file:

singularity pull cocoa-cuda.sif docker://cocoaorg/cocoa:latest-cuda

Note

The GPU images are several gigabytes, and Singularity unpacks them into a temporary directory before assembling the .sif. If /tmp is small (a common default on login nodes), point the cache and temporary directories at a filesystem with sufficient free space before pulling (under Apptainer, the equivalent variables are APPTAINER_CACHEDIR and APPTAINER_TMPDIR):

export SINGULARITY_CACHEDIR=/path/to/scratch/singularity_cache
export SINGULARITY_TMPDIR=/path/to/scratch/singularity_tmp
mkdir -p "$SINGULARITY_CACHEDIR" "$SINGULARITY_TMPDIR"
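Before pulling, it can be worth confirming that the chosen directory really has room. The check below is a sketch: the function name has_free_gb is hypothetical, and any threshold you pass is your own estimate, since the exact space needed is not stated here.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# MIN_GB gigabytes available. `df -Pk` reports 1 KiB blocks in a fixed
# POSIX layout whose fourth column is the available space.
has_free_gb() (
    dir=$1
    min_gb=$2
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
)
```

For example, has_free_gb "$SINGULARITY_TMPDIR" 20 || echo "not enough space" warns when less than 20 GB is available; the figure 20 is only an illustrative value.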

2. Confirm the GPU is visible. The --nv flag passes the host NVIDIA driver into the container (use --rocm for the -rocm image). The quickest check is to run nvidia-smi inside the container:

singularity exec --nv cocoa-cuda.sif nvidia-smi

If this prints your GPU’s statistics, the card was passed through successfully.

3. Select the architecture and confirm Cocoa runs. As with the Docker images, COCOA_ARCH selects the GPU variant. Choose the name matching your hardware from the architecture table above:

singularity run --nv --env COCOA_ARCH=ampere80 cocoa-cuda.sif cocoa --version

You should see the version banner, for example:

cocoa <version>
Cocoa - Coastal and Ocean Circulation on Accelerators
(c) 2026 Zach Cobell

Important

Setting COCOA_ARCH matters: each image bundles builds for many GPUs, and this variable selects the one matching your card. Use singularity run (not exec) when relying on it – the container’s entrypoint translates COCOA_ARCH into the correct PATH only on run. Under singularity exec or singularity shell the entrypoint does not run, so select the variant explicitly instead with source select_cocoa ampere80.
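For scripts that must use singularity exec, the explicit selection described above can be wrapped once and reused. This is a minimal sketch with a hypothetical function name, cocoa_exec; it assumes bash is available in the image and that select_cocoa can be sourced from a non-interactive shell in the same way as from an interactive one.

```shell
# Hypothetical wrapper: run a command under `singularity exec`, sourcing
# select_cocoa first because the entrypoint that reads COCOA_ARCH is skipped.
# Set SINGULARITY=apptainer if that is what your site provides, and change
# --nv to --rocm for the -rocm image.
cocoa_exec() (
    sif=$1
    variant=$2
    shift 2
    # Inside bash -c: $1 is the variant, the remaining arguments are the
    # command to run once the variant has been selected.
    "${SINGULARITY:-singularity}" exec --nv "$sif" \
        bash -c 'source select_cocoa "$1" && shift && exec "$@"' _ \
        "$variant" "$@"
)
```

With this in place, cocoa_exec cocoa-cuda.sif ampere80 cocoa --version should print the same banner as the singularity run form shown in step 3.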

4. Launch a clean session with your data mounted. HPC login shells often inject their own environment – module systems, an activated Conda base, a custom LD_LIBRARY_PATH – and because Singularity inherits the host environment and bind-mounts your home directory by default, these can leak into the container and shadow its Python and libraries. Start the container isolated from the host environment and mount your working directory to /workspace:

singularity run -c --cleanenv --nv \
    --env COCOA_ARCH=ampere80 \
    --bind "$PWD:/workspace" --pwd /workspace \
    cocoa-cuda.sif

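-c (--contain) and --cleanenv together isolate the container from the host: --contain keeps your home directory and /tmp from being shared with the host, and --cleanenv clears the host environment variables, so the container uses its own Python, libraries, and PATH.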
--bind "$PWD:/workspace" mounts your current directory (any path works) so inputs are visible and outputs persist after the container exits.
--pwd /workspace starts you in that directory.

From inside this session you can convert an ADCIRC mesh and run a model exactly as with the Docker image:

python3 /opt/cocoa/utils/cocoa_mesh_tools.py from_adcirc \
    --mesh fort.14 --attributes fort.13 --output mesh.nc

See Converting ADCIRC Meshes above for the full set of conversion options, and Quick Start for configuring the YAML input file. Small example problems are included in the container under /opt/cocoa/examples.
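For batch runs under SLURM, the isolated launch command from step 4 can be placed in a job script. The generator below is a sketch only: the function name cocoa_job_script is hypothetical, and the #SBATCH values (one node, one GPU requested through --gres, a one-hour limit) are placeholder assumptions that must be adapted to your site's partitions and accounting.

```shell
# Hypothetical generator: print a SLURM job script that runs one Cocoa
# configuration through the isolated `singularity run` command from step 4.
# All #SBATCH values are placeholders, not recommendations.
cocoa_job_script() (
    sif=$1
    arch=$2
    config=$3
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=cocoa
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

singularity run -c --cleanenv --nv \\
    --env COCOA_ARCH=$arch \\
    --bind "\$PWD:/workspace" --pwd /workspace \\
    $sif cocoa -i $config
EOF
)
```

A typical use would be cocoa_job_script cocoa-cuda.sif ampere80 config.yaml > job.sh followed by sbatch job.sh, submitted from the directory that holds the configuration and mesh files.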

Building the Container

Each family is built in two stages: first the Trilinos base image, then the Cocoa image on top of it. Each of the two images has a single build context shared by all three families, and a per-family Dockerfile (Dockerfile.cpu, Dockerfile.cuda, Dockerfile.rocm) selects the family, so the build scripts and entrypoints are not duplicated. Substitute the family suffix throughout.

1. Build the Trilinos base image (example: CUDA):

cd containers/base_trilinos_container
docker build -f Dockerfile.cuda -t cocoaorg/trilinos_base:latest-cuda .

2. Build the Cocoa image:

cd containers/cocoa_container
DOCKER_BUILDKIT=1 docker build --ssh default \
    -f Dockerfile.cuda -t cocoaorg/cocoa:latest-cuda .

The --ssh default flag forwards your SSH agent for private repository access during the build. Ensure your SSH agent is running with the appropriate key loaded (ssh-add).

The -cpu family is multi-architecture. Build and push both platform variants under one tag with buildx:

cd containers/cocoa_container
docker buildx build --platform linux/amd64,linux/arm64 \
    -f Dockerfile.cpu -t cocoaorg/cocoa:latest-cpu --push .

On a cluster the build is driven by the SLURM batch scripts in containers/slurm/. submit_all.sh queues every family, making each Cocoa image depend on its Trilinos base; see those scripts for the exact buildx invocation and push steps.
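The dependency between the two stages can be expressed with standard sbatch options. The sketch below illustrates the general pattern and is not the content of submit_all.sh: the function name submit_family, the FAMILY variable, and the script names build_trilinos.sh and build_cocoa.sh are all hypothetical placeholders.

```shell
# Hypothetical sketch of job chaining: submit the Trilinos base build, then
# submit the Cocoa build so that it starts only if the base build succeeds.
# Script names and the FAMILY variable are placeholders.
submit_family() (
    family=$1
    # --parsable prints just the job ID (optionally followed by ";cluster").
    base_id=$(sbatch --parsable --export=ALL,FAMILY="$family" \
        build_trilinos.sh) || exit 1
    base_id=${base_id%%;*}
    sbatch --parsable --dependency=afterok:"$base_id" \
        --export=ALL,FAMILY="$family" build_cocoa.sh
)
```

With afterok, SLURM leaves the second job pending until the first one exits successfully, and does not run it if the base build fails.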

Note

Building a GPU family compiles Trilinos and Cocoa once per architecture in that family, which is resource-intensive and may take several hours. The AMD (-rocm) images are currently validated only by compilation; runtime validation on AMD hardware is ongoing. The ARM64 half of the -cpu build runs natively on an ARM64 builder or, more slowly, under QEMU emulation.