============
Installation
============

This guide covers how to build and install Cocoa on your system.

Prerequisites
-------------

Before building Cocoa, ensure you have the following dependencies installed:

Compilers
^^^^^^^^^

Cocoa requires a C++20 compatible compiler. The minimum versions are dictated by Kokkos 5.0 (included with Trilinos 17), which sets stricter requirements than C++20 alone.

.. list-table:: Minimum Compiler Requirements
   :header-rows: 1
   :widths: 30 20 50

   * - Compiler
     - Minimum Version
     - Notes
   * - GCC
     - 10.4.0
     - Recommended for CPU and as CUDA host compiler
   * - Clang (CPU)
     - 14.0.0
     - For CPU-only builds
   * - Clang (CUDA host)
     - 15.0.0
     - When used as nvcc host compiler
   * - NVIDIA nvcc
     - 12.2
     - Requires CUDA Toolkit 12.2+
   * - Intel icpx (CPU)
     - 2022.0.0
     - Intel oneAPI DPC++/C++ Compiler
   * - Intel icpx (SYCL)
     - 2024.2.1
     - For SYCL backend builds
   * - ROCm (HIPCC)
     - 6.2.0
     - For AMD GPU builds
   * - NVIDIA HPC SDK (NVC++)
     - 22.3
     - Alternative to nvcc for NVIDIA GPUs

.. note::

   These requirements are set by Kokkos 5.0. See the `Kokkos Requirements `_ documentation for the most up-to-date information.

Build System
^^^^^^^^^^^^

- CMake 3.23 or later
- GNU Make or Ninja build system

Required Libraries
^^^^^^^^^^^^^^^^^^

The following libraries must be pre-installed on your system:

- **Trilinos 17.0 or later** (with Kokkos, KokkosKernels, Tpetra, Belos, Ifpack2, Zoltan2 enabled)
- **NetCDF-C** (4.9.3+ recommended; for mesh and output I/O)
- **HDF5** (development headers required; installed automatically as a NetCDF-C dependency)

.. warning::

   NetCDF-C versions prior to 4.9.3 have a bug (`#2674 `_) that causes spurious HDF5 error messages on stderr when reading variables. Ubuntu 24.04 ships NetCDF-C 4.9.2; if using that distribution, build NetCDF-C 4.9.3+ from source.

- **ParMETIS** (for mesh partitioning, required via Zoltan2 for MPI builds)

.. note::

   Trilinos 17.0+ is required because it includes Kokkos 5.0+ which uses APIs that Cocoa depends on.

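
The minimums listed above for CMake, GCC, and NetCDF-C can be checked from a shell before configuring. The following is a minimal sketch, not part of Cocoa: ``version_ge`` and ``check_min`` are hypothetical helper names, the comparison relies on GNU ``sort -V``, and the GCC check assumes that ``g++`` on your ``PATH`` is GCC rather than another compiler.

.. code-block:: bash

   #!/usr/bin/env bash
   # Sketch: compare installed tool versions against the documented minimums.
   # version_ge and check_min are hypothetical helpers, not part of Cocoa.

   # version_ge A B: succeeds when dotted version A >= B (relies on GNU sort -V).
   version_ge() {
       [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
   }

   # check_min NAME FOUND REQUIRED: print whether FOUND meets REQUIRED.
   check_min() {
       if version_ge "$2" "$3"; then
           echo "ok: $1 $2 (>= $3)"
       else
           echo "too old: $1 $2 (< $3)"
       fi
   }

   # Each check is skipped when the tool is not on PATH.
   if command -v cmake >/dev/null 2>&1; then
       check_min CMake "$(cmake --version | head -n1 | awk '{print $3}')" 3.23
   fi
   if command -v g++ >/dev/null 2>&1; then
       check_min GCC "$(g++ -dumpfullversion)" 10.4.0
   fi
   if command -v nc-config >/dev/null 2>&1; then
       check_min NetCDF-C "$(nc-config --version | awk '{print $2}')" 4.9.3
   fi

A "too old" line for NetCDF-C on a stock Ubuntu 24.04 system would correspond to the 4.9.2 package mentioned in the warning above.
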

Automatically Fetched Dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following dependencies are automatically downloaded and built via `CPM `_ during CMake configuration:

- yaml-cpp (configuration file parsing)
- spdlog (logging)
- fmt (string formatting)
- Catch2 (unit testing)

Optional Dependencies
^^^^^^^^^^^^^^^^^^^^^

- **CUDA Toolkit** (for NVIDIA GPU support, required for Trilinos CUDA build)
- **ROCm** (for AMD GPU support, required for Trilinos HIP build)
- **MPI** (for distributed computing, if Trilinos was built with MPI)

Building from Source
--------------------

Clone the Repository
^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   git clone https://github.com/zcobell/cocoa.git
   cd cocoa

Configure with CMake
^^^^^^^^^^^^^^^^^^^^

**Basic Build**:

.. code-block:: bash

   mkdir build && cd build
   cmake .. \
       -DCMAKE_BUILD_TYPE=Release \
       -DNETCDF_DIR=/path/to/netcdf-c \
       -DTrilinos_DIR=/path/to/trilinos

**With custom install prefix**:

.. code-block:: bash

   cmake .. \
       -DCMAKE_BUILD_TYPE=Release \
       -DCMAKE_INSTALL_PREFIX=/path/to/install \
       -DNETCDF_DIR=/path/to/netcdf-c \
       -DTrilinos_DIR=/path/to/trilinos

CMake Options
^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1
   :widths: 28 47 25

   * - Option
     - Description
     - Default
   * - ``NETCDF_DIR``
     - Hint path for CMake's ``FindNetCDF`` module. Point to the NetCDF-C installation prefix.
     - (auto-detected; required if not in system paths)
   * - ``Trilinos_DIR``
     - Path to the Trilinos CMake config directory (e.g., ``/lib/cmake/Trilinos``).
     - (auto-detected; required if not in system paths)
   * - ``cocoa_BACKEND``
     - Combined execution space and MPI configuration. Available options depend on the Trilinos build. Examples: ``CUDA+MPI``, ``CUDA``, ``HIP+MPI``, ``OPENMP+MPI``, ``OPENMP``, ``SERIAL+MPI``, ``SERIAL``.
     - ``DEFAULT`` -- auto-selects the best available backend from Trilinos (prefers GPU over CPU, MPI over non-MPI)
   * - ``CMAKE_BUILD_TYPE``
     - Build type. Options: ``Release``, ``Debug``, ``RelWithDebInfo``, ``MinSizeRel``.
     - ``RelWithDebInfo`` (if not specified)
   * - ``CMAKE_INSTALL_PREFIX``
     - Installation directory for ``make install``.
     - ``/usr/local``
   * - ``BUILD_TESTING``
     - Build the unit test suite (requires Catch2, fetched automatically).
     - ``OFF``
   * - ``cocoa_MAINTAINER_MODE``
     - Enable strict compiler warnings, sanitizers, cppcheck, and hardening. Automatically enabled when building as the top-level project.
     - ``OFF`` (``ON`` when top-level project)
   * - ``cocoa_CUDA_MEMORY_SPACE``
     - CUDA memory space. ``CUDA`` for device memory, ``CUDAUVM`` for unified virtual memory. Only applies to CUDA backends.
     - ``CUDA``
   * - ``cocoa_USE_THRUST``
     - Use the Thrust stream-compaction fast path for the wet/dry index lists (advanced). See :ref:`thrust-fast-path` below. Requires a Thrust/CCCL installation for non-CUDA backends.
     - ``ON`` for CUDA, ``OFF`` otherwise
   * - ``cocoa_PRINT_LOG_TIME``
     - Print elapsed wall-clock time in screen log output.
     - ``OFF``

Floating-Point Precision
^^^^^^^^^^^^^^^^^^^^^^^^

Cocoa computes in double precision (FP64), and there is no build-time precision option. To reduce GPU memory traffic, a fixed set of bandwidth-sensitive fields is *stored* as float and promoted to double on read, while all arithmetic stays in double. See :doc:`/theory/numerical_methods` for the list of mixed-precision fields and the rationale.

.. _thrust-fast-path:

Thrust Stream-Compaction Fast Path
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each step, the wet/dry solver rebuilds compacted lists of the currently wet elements and nodes -- a stream compaction (``copy_if`` over an index range). Cocoa has two implementations of this, selected at build time by ``cocoa_USE_THRUST``:

- **Thrust fast path** (``cocoa_USE_THRUST=ON``): ``thrust::copy_if``. On CUDA this is a single-pass CUB ``DeviceSelect`` on the Kokkos execution-space stream; measured on a V100, the difference versus the fallback was roughly 1--2% of total simulation time. On a host backend it runs through the Thrust device system (for example, OpenMP).
- **Portable fallback** (``cocoa_USE_THRUST=OFF``): a fused Kokkos ``parallel_scan``.

Both paths use the same selection predicate and produce identical results.

The option defaults to ``ON`` for CUDA backends, where Thrust ships with the CUDA Toolkit and no extra setup is needed. It is ``OFF`` by default for all other backends and is an *advanced* option. To enable the fast path on a non-CUDA (for example, OpenMP) backend you must provide a `Thrust `_ installation, typically via `NVIDIA CCCL `_, and point CMake at it:

.. code-block:: bash

   cmake .. \
       -Dcocoa_BACKEND=OPENMP+MPI \
       -Dcocoa_USE_THRUST=ON \
       -DCCCL_DIR=/path/to/cccl/lib/cmake/cccl
   # or, for a standalone Thrust:
   #   -DThrust_DIR=/path/to/thrust/lib/cmake/thrust

Cocoa configures the host Thrust device system to match the selected backend (OpenMP or Serial). If ``cocoa_USE_THRUST=ON`` is requested but no Thrust/CCCL installation is found, or the backend is not CUDA/OpenMP/Serial, configuration fails with an explanatory error. To turn the fast path off (including on a CUDA build), pass ``-Dcocoa_USE_THRUST=OFF``.

Compile
^^^^^^^

.. code-block:: bash

   make -j$(nproc)

Install
^^^^^^^

.. code-block:: bash

   make install

Verifying the Installation
--------------------------

Run the test suite from the build directory to verify your installation. The tests are built only when CMake was configured with ``-DBUILD_TESTING=ON`` (the default is ``OFF``):

.. code-block:: bash

   ctest --output-on-failure
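
Because ``BUILD_TESTING`` defaults to ``OFF``, a test-enabled build needs its own configure step. One possible end-to-end sequence, run from the repository root, is sketched below. The build directory name ``build-tests`` is a hypothetical choice, the library paths are the same placeholders used earlier in this guide, and ``ctest --test-dir`` requires CMake 3.20 or later, which the CMake 3.23 minimum satisfies.

.. code-block:: bash

   # Configure a separate build tree with the unit tests enabled.
   cmake -S . -B build-tests \
       -DCMAKE_BUILD_TYPE=Release \
       -DBUILD_TESTING=ON \
       -DNETCDF_DIR=/path/to/netcdf-c \
       -DTrilinos_DIR=/path/to/trilinos

   # Build with the generator's default level of parallelism.
   cmake --build build-tests --parallel

   # Run the tests, printing output only for those that fail.
   ctest --test-dir build-tests --output-on-failure

Using ``cmake --build`` rather than calling ``make`` directly keeps the same commands valid whether the tree was generated for GNU Make or Ninja.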