Installation

This guide covers how to build and install Cocoa on your system.

Prerequisites

Before building Cocoa, ensure you have the following dependencies installed:

Compilers

Cocoa requires a C++20-compatible compiler. The minimum versions are dictated by Kokkos 5.0 (included with Trilinos 17), whose requirements are stricter than C++20 support alone.

Table 1 Minimum Compiler Requirements
Compiler	Minimum Version	Notes
GCC	10.4.0	Recommended for CPU and as CUDA host compiler
Clang (CPU)	14.0.0	For CPU-only builds
Clang (CUDA host)	15.0.0	When used as nvcc host compiler
NVIDIA nvcc	12.2	Requires CUDA Toolkit 12.2+
Intel icpx (CPU)	2022.0.0	Intel oneAPI DPC++/C++ Compiler
Intel icpx (SYCL)	2024.2.1	For SYCL backend builds
ROCm (HIPCC)	6.2.0	For AMD GPU builds
NVIDIA HPC SDK (NVC++)	22.3	Alternative to nvcc for NVIDIA GPUs

Note

These requirements are set by Kokkos 5.0. See the Kokkos Requirements documentation for the most up-to-date information.
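As a quick pre-flight check, an installed compiler can be compared against the minimum in Table 1. The sketch below is only illustrative: `version_ge` is a hypothetical helper name, it assumes a `sort` that supports `-V` (GNU coreutils and recent BSDs), and it only checks `g++`; adapt the command and the minimum for other compilers.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on version-aware sorting (sort -V); hypothetical helper, not part of Cocoa.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum GCC version taken from Table 1.
min_gcc="10.4.0"

if command -v g++ >/dev/null 2>&1; then
    # Passing both flags yields the full version on GCC 7+ and still works on older GCC.
    found="$(g++ -dumpfullversion -dumpversion 2>/dev/null)"
    if version_ge "$found" "$min_gcc"; then
        echo "g++ $found meets the minimum ($min_gcc)"
    else
        echo "g++ '$found' does not meet the minimum ($min_gcc)"
    fi
else
    echo "g++ not found on PATH"
fi
```

The same helper can be reused for the other rows of the table by substituting the compiler's own version query.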

Build System

CMake 3.23 or later
GNU Make or Ninja build system

Required Libraries

The following libraries must be pre-installed on your system:

Trilinos 17.0 or later (with Kokkos, KokkosKernels, Tpetra, Belos, Ifpack2, Zoltan2 enabled)
NetCDF-C (4.9.3+ recommended; for mesh and output I/O)
HDF5 (development headers required; installed automatically as a NetCDF-C dependency)

Warning

NetCDF-C versions prior to 4.9.3 have a bug (#2674) that causes spurious HDF5 error messages on stderr when reading variables. Ubuntu 24.04 ships NetCDF-C 4.9.2; if using that distribution, build NetCDF-C 4.9.3+ from source.

ParMETIS (for mesh partitioning, required via Zoltan2 for MPI builds)

Note

Trilinos 17.0+ is required because it bundles Kokkos 5.0+, whose APIs Cocoa depends on.
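To find out whether an installed NetCDF-C falls under the warning above, the version reported by `nc-config` can be compared against 4.9.3. This is a minimal sketch: `netcdf_is_affected` is a hypothetical helper name, and it assumes `nc-config --version` prints a string of the form `netCDF <version>` and that `sort -V` is available.

```shell
#!/usr/bin/env bash
# netcdf_is_affected "<output of nc-config --version>"
# Succeeds when the reported version is older than 4.9.3.
netcdf_is_affected() {
    local version="${1#netCDF }"   # strip the leading "netCDF " label
    [ "$version" != "4.9.3" ] &&
        [ "$(printf '%s\n%s\n' "$version" "4.9.3" | sort -V | head -n 1)" = "$version" ]
}

if command -v nc-config >/dev/null 2>&1; then
    reported="$(nc-config --version)"
    if netcdf_is_affected "$reported"; then
        echo "$reported: older than 4.9.3, consider building NetCDF-C from source"
    else
        echo "$reported: 4.9.3 or later"
    fi
else
    echo "nc-config not found on PATH"
fi
```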

Automatically Fetched Dependencies

The following dependencies are automatically downloaded and built via CPM during CMake configuration:

yaml-cpp (configuration file parsing)
spdlog (logging)
fmt (string formatting)
Catch2 (unit testing)

Optional Dependencies

CUDA Toolkit (for NVIDIA GPU support, required for Trilinos CUDA build)
ROCm (for AMD GPU support, required for Trilinos HIP build)
MPI (for distributed computing, if Trilinos was built with MPI)

Building from Source

Clone the Repository

git clone https://github.com/cocoaorg/cocoa.git
cd cocoa

Configure with CMake

Basic build:

mkdir build && cd build
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DNETCDF_DIR=/path/to/netcdf-c \
    -DTrilinos_DIR=/path/to/trilinos/lib/cmake/Trilinos

With custom install prefix:

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=/path/to/install \
    -DNETCDF_DIR=/path/to/netcdf-c \
    -DTrilinos_DIR=/path/to/trilinos/lib/cmake/Trilinos

CMake Options

Option	Description	Default
`NETCDF_DIR`	Hint path for CMake’s `FindNetCDF` module. Point to the NetCDF-C installation prefix.	(auto-detected; required if not in system paths)
`Trilinos_DIR`	Path to the Trilinos CMake config directory (e.g., `<prefix>/lib/cmake/Trilinos`).	(auto-detected; required if not in system paths)
`cocoa_BACKEND`	Combined execution space and MPI configuration. Available options depend on the Trilinos build. Examples: `CUDA+MPI`, `CUDA`, `HIP+MPI`, `OPENMP+MPI`, `OPENMP`, `SERIAL+MPI`, `SERIAL`.	`DEFAULT` – auto-selects the best available backend from Trilinos (prefers GPU over CPU, MPI over non-MPI)
`CMAKE_BUILD_TYPE`	Build type. Options: `Release`, `Debug`, `RelWithDebInfo`, `MinSizeRel`.	`RelWithDebInfo` (if not specified)
`CMAKE_INSTALL_PREFIX`	Installation directory for `make install`.	`/usr/local`
`BUILD_TESTING`	Build the unit test suite (requires Catch2, fetched automatically).	`OFF`
`cocoa_MAINTAINER_MODE`	Enable strict compiler warnings, sanitizers, cppcheck, and hardening. Automatically enabled when building as the top-level project.	`OFF` (`ON` when top-level project)
`cocoa_CUDA_MEMORY_SPACE`	CUDA memory space. `CUDA` for device memory, `CUDAUVM` for unified virtual memory. Only applies to CUDA backends.	`CUDA`
`cocoa_USE_THRUST`	Use the Thrust stream-compaction fast path for the wet/dry index lists (advanced). See Thrust Stream-Compaction Fast Path below. Requires a Thrust/CCCL installation for non-CUDA backends.	`ON` for CUDA, `OFF` otherwise
`cocoa_PRINT_LOG_TIME`	Print elapsed wall-clock time in screen log output.	`OFF`

Floating-Point Precision

Cocoa computes in double precision (FP64), and there is no build-time precision option. To reduce GPU memory traffic, a fixed set of bandwidth-sensitive fields is stored as float and promoted to double on read; all arithmetic stays in double. See Numerical Methods for the list of mixed-precision fields and the rationale.

Thrust Stream-Compaction Fast Path

Each step, the wet/dry solver rebuilds compacted lists of the currently wet elements and nodes – a stream compaction (copy_if over an index range). Cocoa has two implementations of this, selected at build time by cocoa_USE_THRUST:

Thrust fast path (cocoa_USE_THRUST=ON): thrust::copy_if. On CUDA this is a single-pass CUB DeviceSelect on the Kokkos execution-space stream; in measurements on a V100 it saved roughly 1–2% of total simulation time relative to the fallback. On a host backend it runs through the matching Thrust device system (for example, OpenMP).
Portable fallback (cocoa_USE_THRUST=OFF): a fused Kokkos parallel_scan. Both paths use the same selection predicate and produce identical results.

The option defaults to ON for CUDA backends, where Thrust ships with the CUDA Toolkit and no extra setup is needed. It is OFF by default for all other backends and is an advanced option.

To enable the fast path on a non-CUDA backend (for example, OpenMP), you must provide a Thrust installation, typically via NVIDIA CCCL, and point CMake at it:

cmake .. \
    -Dcocoa_BACKEND=OPENMP+MPI \
    -Dcocoa_USE_THRUST=ON \
    -DCCCL_DIR=/path/to/cccl/lib/cmake/cccl
    # or, for a standalone Thrust:
    # -DThrust_DIR=/path/to/thrust/lib/cmake/thrust

Cocoa configures the host Thrust device system to match the selected backend (OpenMP or Serial). If cocoa_USE_THRUST=ON is requested but no Thrust/CCCL installation is found, or the backend is not CUDA/OpenMP/Serial, configuration fails with an explanatory error. To turn the fast path off (including on a CUDA build), pass -Dcocoa_USE_THRUST=OFF.

Compile

make -j$(nproc)
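The command above assumes the Makefile generator and a system that provides `nproc`. A generator-agnostic alternative, sketched below, goes through `cmake --build`, which drives either GNU Make or Ninja. `detect_jobs` is a hypothetical helper name; the `sysctl` fallback is an assumption for macOS/BSD hosts.

```shell
#!/usr/bin/env bash
# detect_jobs: print a parallel job count.
# Tries nproc (Linux), then sysctl (macOS/BSD), and falls back to 1.
detect_jobs() {
    nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1
}

jobs_count="$(detect_jobs)"

# Only build when run from a configured build directory.
if [ -f CMakeCache.txt ]; then
    cmake --build . --parallel "$jobs_count"
else
    echo "No CMakeCache.txt here; run this from the configured build directory"
fi
```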

Install

make install

Verifying the Installation

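Run the test suite from the build directory to verify your build. The suite is only built when Cocoa is configured with -DBUILD_TESTING=ON (the default is OFF):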

ctest --output-on-failure
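Since `BUILD_TESTING` defaults to `OFF` in the options table, a build intended for verification has to enable it at configure time. The following is a sketch of the full configure, build, and test sequence under stated assumptions: `build_and_test` is a hypothetical wrapper name, the two dependency paths are placeholders, and the `-S`/`-B` and `--test-dir` forms rely on the CMake 3.23+ requirement listed earlier.

```shell
#!/usr/bin/env bash
# build_and_test [SOURCE_DIR]
# Configures with the unit tests enabled, builds, then runs ctest.
# Hypothetical wrapper; replace the placeholder paths before use.
build_and_test() {
    local src="${1:-.}"
    if [ ! -f "$src/CMakeLists.txt" ]; then
        echo "no CMakeLists.txt in '$src'; pass the Cocoa source root" >&2
        return 1
    fi
    cmake -S "$src" -B "$src/build" \
        -DCMAKE_BUILD_TYPE=Release \
        -DBUILD_TESTING=ON \
        -DNETCDF_DIR=/path/to/netcdf-c \
        -DTrilinos_DIR=/path/to/trilinos/lib/cmake/Trilinos &&
    cmake --build "$src/build" --parallel &&
    ctest --test-dir "$src/build" --output-on-failure
}
```

After sourcing the file, `build_and_test /path/to/cocoa` runs the three steps in order and stops at the first one that fails.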