Installation

This guide covers how to build and install Cocoa on your system.

Prerequisites

Before building Cocoa, ensure you have the following dependencies installed:

Compilers

Cocoa requires a C++20-compatible compiler. The minimum versions are dictated by Kokkos 5.0 (included with Trilinos 17), which sets stricter requirements than C++20 support alone.

Table 1 Minimum Compiler Requirements

| Compiler               | Minimum Version | Notes                                         |
|------------------------|-----------------|-----------------------------------------------|
| GCC                    | 10.4.0          | Recommended for CPU and as CUDA host compiler |
| Clang (CPU)            | 14.0.0          | For CPU-only builds                           |
| Clang (CUDA host)      | 15.0.0          | When used as nvcc host compiler               |
| NVIDIA nvcc            | 12.2            | Requires CUDA Toolkit 12.2+                   |
| Intel icpx (CPU)       | 2022.0.0        | Intel oneAPI DPC++/C++ Compiler               |
| Intel icpx (SYCL)      | 2024.2.1        | For SYCL backend builds                       |
| ROCm (HIPCC)           | 6.2.0           | For AMD GPU builds                            |
| NVIDIA HPC SDK (NVC++) | 22.3            | Alternative to nvcc for NVIDIA GPUs           |

Note

These requirements are set by Kokkos 5.0. See the Kokkos Requirements documentation for the most up-to-date information.

Build System

  • CMake 3.23 or later

  • GNU Make or Ninja build system

Required Libraries

The following libraries must be pre-installed on your system:

  • Trilinos 17.0 or later (with Kokkos, KokkosKernels, Tpetra, Belos, Ifpack2, Zoltan2 enabled)

  • NetCDF-C (4.9.3+ recommended; for mesh and output I/O)

  • HDF5 (development headers required; installed automatically as a NetCDF-C dependency)

Warning

NetCDF-C versions prior to 4.9.3 have a bug (#2674) that causes spurious HDF5 error messages on stderr when reading variables. Ubuntu 24.04 ships NetCDF-C 4.9.2; if using that distribution, build NetCDF-C 4.9.3+ from source.

  • ParMETIS (for mesh partitioning, required via Zoltan2 for MPI builds)

Note

Trilinos 17.0+ is required because it bundles Kokkos 5.0+, which provides APIs that Cocoa depends on.
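As a rough illustration of the package list above, a Trilinos configure line might look like the following sketch. The Trilinos_ENABLE_<Package> and TPL_ENABLE_<Name> switches follow Trilinos' usual CMake naming. The source, build, and install paths are placeholders, and a real build will normally need further options (execution-space selection, TPL include and library locations) that depend on your system and are not shown here.

```shell
# Sketch only: all paths are placeholders; backend and TPL-location options are omitted.
cmake -S /path/to/trilinos-source -B /path/to/trilinos-build \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=/path/to/trilinos \
    -DTPL_ENABLE_MPI=ON \
    -DTPL_ENABLE_ParMETIS=ON \
    -DTrilinos_ENABLE_Kokkos=ON \
    -DTrilinos_ENABLE_KokkosKernels=ON \
    -DTrilinos_ENABLE_Tpetra=ON \
    -DTrilinos_ENABLE_Belos=ON \
    -DTrilinos_ENABLE_Ifpack2=ON \
    -DTrilinos_ENABLE_Zoltan2=ON
```

Consult the Trilinos build documentation for the options that match your compiler and hardware; the list here only mirrors the packages named in this guide.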

Automatically Fetched Dependencies

The following dependencies are automatically downloaded and built via CPM during CMake configuration:

  • yaml-cpp (configuration file parsing)

  • spdlog (logging)

  • fmt (string formatting)

  • Catch2 (unit testing)

Optional Dependencies

  • CUDA Toolkit (for NVIDIA GPU support, required for Trilinos CUDA build)

  • ROCm (for AMD GPU support, required for Trilinos HIP build)

  • MPI (for distributed computing, if Trilinos was built with MPI)
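Before configuring, it can help to check the installed tool versions against the minimums listed above. The script below is a minimal sketch, not part of Cocoa: version_ge and check are hypothetical helper names, the comparison relies on GNU sort -V, and it assumes that g++ is GCC and that NetCDF-C provides its nc-config utility. It covers only GCC, CMake, and NetCDF-C; other compilers would need their own version query.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is >= B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check NAME FOUND REQUIRED: print a one-line verdict for one tool.
check() {
    if [ -z "$2" ]; then
        echo "$1: not found"
    elif version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Query each tool; a missing tool simply yields an empty version string.
gcc_ver=$(g++ -dumpfullversion -dumpversion 2>/dev/null || true)
cmake_ver=$(cmake --version 2>/dev/null | sed -n '1s/^cmake version //p')
netcdf_ver=$(nc-config --version 2>/dev/null | sed -n 's/^netCDF //p')

check "GCC" "$gcc_ver" "10.4.0"
check "CMake" "$cmake_ver" "3.23"
check "NetCDF-C" "$netcdf_ver" "4.9.3"   # 4.9.3+ is recommended, not strictly required
```

A "too old" line for NetCDF-C is a warning rather than a hard failure, since 4.9.3+ is only the recommended version.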

Building from Source

Clone the Repository

git clone https://github.com/cocoaorg/cocoa.git
cd cocoa

Configure with CMake

Basic build:

mkdir build && cd build
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DNETCDF_DIR=/path/to/netcdf-c \
    -DTrilinos_DIR=/path/to/trilinos

With custom install prefix:

cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=/path/to/install \
    -DNETCDF_DIR=/path/to/netcdf-c \
    -DTrilinos_DIR=/path/to/trilinos

CMake Options

| Option | Description | Default |
|--------|-------------|---------|
| NETCDF_DIR | Hint path for CMake’s FindNetCDF module. Point to the NetCDF-C installation prefix. | (auto-detected; required if not in system paths) |
| Trilinos_DIR | Path to the Trilinos CMake config directory (e.g., <prefix>/lib/cmake/Trilinos). | (auto-detected; required if not in system paths) |
| cocoa_BACKEND | Combined execution space and MPI configuration. Available options depend on the Trilinos build. Examples: CUDA+MPI, CUDA, HIP+MPI, OPENMP+MPI, OPENMP, SERIAL+MPI, SERIAL. | DEFAULT: auto-selects the best available backend from Trilinos (prefers GPU over CPU, MPI over non-MPI) |
| CMAKE_BUILD_TYPE | Build type. Options: Release, Debug, RelWithDebInfo, MinSizeRel. | RelWithDebInfo (if not specified) |
| CMAKE_INSTALL_PREFIX | Installation directory for make install. | /usr/local |
| BUILD_TESTING | Build the unit test suite (requires Catch2, fetched automatically). | OFF |
| cocoa_MAINTAINER_MODE | Enable strict compiler warnings, sanitizers, cppcheck, and hardening. Automatically enabled when building as the top-level project. | OFF (ON when top-level project) |
| cocoa_CUDA_MEMORY_SPACE | CUDA memory space. CUDA for device memory, CUDAUVM for unified virtual memory. Only applies to CUDA backends. | CUDA |
| cocoa_USE_THRUST | Use the Thrust stream-compaction fast path for the wet/dry index lists (advanced). See Thrust Stream-Compaction Fast Path below. Requires a Thrust/CCCL installation for non-CUDA backends. | ON for CUDA, OFF otherwise |
| cocoa_PRINT_LOG_TIME | Print elapsed wall-clock time in screen log output. | OFF |
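To show how several of these options combine, here is a sketch of a configure line for a CUDA+MPI build with unified memory, unit tests, and elapsed time in the log. It uses only option names from the table above; the paths are placeholders, and the CUDA+MPI backend is available only if Trilinos itself was built with CUDA and MPI. The -S/-B form is run from the repository root and is equivalent to creating the build directory and running cmake .. inside it.

```shell
# Sketch: paths are placeholders; the backend must match how Trilinos was built.
cmake -S . -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -Dcocoa_BACKEND=CUDA+MPI \
    -Dcocoa_CUDA_MEMORY_SPACE=CUDAUVM \
    -Dcocoa_PRINT_LOG_TIME=ON \
    -DBUILD_TESTING=ON \
    -DNETCDF_DIR=/path/to/netcdf-c \
    -DTrilinos_DIR=/path/to/trilinos/lib/cmake/Trilinos
```

Omitting -Dcocoa_BACKEND leaves the choice to the DEFAULT auto-selection described in the table.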

Floating-Point Precision

Cocoa computes in double precision (FP64), and there is no build-time precision option. To reduce GPU memory traffic, a fixed set of bandwidth-sensitive fields is stored as float and promoted to double on read, while all arithmetic stays in double. See Numerical Methods for the list of mixed-precision fields and the rationale.

Thrust Stream-Compaction Fast Path

Each step, the wet/dry solver rebuilds compacted lists of the currently wet elements and nodes – a stream compaction (copy_if over an index range). Cocoa has two implementations of this, selected at build time by cocoa_USE_THRUST:

  • Thrust fast path (cocoa_USE_THRUST=ON): thrust::copy_if. On CUDA this is a single-pass CUB DeviceSelect on the Kokkos execution-space stream; in measurements on a V100, the difference from the fallback was roughly 1–2% of total simulation time. On a host backend it runs through the Thrust device system (for example, OpenMP).

  • Portable fallback (cocoa_USE_THRUST=OFF): a fused Kokkos parallel_scan. Both paths use the same selection predicate and produce identical results.

The option defaults to ON for CUDA backends, where Thrust ships with the CUDA Toolkit and no extra setup is needed. It is OFF by default for all other backends and is an advanced option.

To enable the fast path on a non-CUDA backend (for example, OpenMP), you must provide a Thrust installation, typically via NVIDIA CCCL, and point CMake at it:

cmake .. \
    -Dcocoa_BACKEND=OPENMP+MPI \
    -Dcocoa_USE_THRUST=ON \
    -DCCCL_DIR=/path/to/cccl/lib/cmake/cccl

# For a standalone Thrust, pass this instead of -DCCCL_DIR:
#     -DThrust_DIR=/path/to/thrust/lib/cmake/thrust

Cocoa configures the host Thrust device system to match the selected backend (OpenMP or Serial). If cocoa_USE_THRUST=ON is requested but no Thrust/CCCL installation is found, or the backend is not CUDA/OpenMP/Serial, configuration fails with an explanatory error. To turn the fast path off (including on a CUDA build), pass -Dcocoa_USE_THRUST=OFF.
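After configuring, one way to confirm which settings a build tree actually recorded is to read them back from CMakeCache.txt, where CMake stores cache entries as NAME:TYPE=VALUE. The helper below is a minimal sketch; cache_value is a hypothetical name, and the entries in the demo file are made up for illustration. Values passed with -D are always written to the cache, but whether Cocoa's auto-selected defaults appear there depends on how its CMake scripts define them, which is an assumption here.

```shell
# cache_value NAME [FILE]: print the value stored for NAME in a CMake cache file.
# Cache entries have the form NAME:TYPE=VALUE; FILE defaults to ./CMakeCache.txt.
cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "${2:-CMakeCache.txt}"
}

# Demo on a throwaway cache file with made-up entries.
demo_cache=$(mktemp)
printf '%s\n' 'cocoa_BACKEND:STRING=OPENMP+MPI' 'cocoa_USE_THRUST:BOOL=ON' > "$demo_cache"
cache_value cocoa_USE_THRUST "$demo_cache"   # prints ON
cache_value cocoa_BACKEND "$demo_cache"      # prints OPENMP+MPI
rm -f "$demo_cache"
```

In a real build directory, cache_value cocoa_USE_THRUST with no file argument reads the local CMakeCache.txt.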

Compile

make -j$(nproc)
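The make command above assumes the default Makefile generator and a system that provides nproc. As a sketch of a more portable alternative, cmake --build drives whichever generator the build tree was configured with (for example Ninja, selected with -G Ninja at configure time). The variable name njobs is arbitrary, and the fallback chain for the core count is an assumption about what your system provides.

```shell
# Logical core count: nproc (GNU coreutils) if available, else POSIX getconf, else 1.
njobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Build only when run from a configured build directory.
if [ -f CMakeCache.txt ]; then
    cmake --build . --parallel "$njobs"
else
    echo "No CMakeCache.txt here; run this from the configured build directory." >&2
fi
```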

Install

make install

Verifying the Installation

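Run the test suite from the build directory to verify your installation. The tests are built only when CMake was configured with -DBUILD_TESTING=ON (the default is OFF):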

ctest --output-on-failure
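If the build tree was configured without tests, the sequence below is one way to enable and run them. It is a sketch that assumes you are in an existing build directory created with the Makefile generator; it uses only the BUILD_TESTING option from the CMake Options table and standard cmake, make, and ctest invocations.

```shell
# Run from the existing build directory.
cmake . -DBUILD_TESTING=ON     # re-configure the build tree in place with tests enabled
make -j"$(nproc)"              # rebuild, now including the unit test executables
ctest --output-on-failure      # run all tests; show output only for failures
```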