==================== Trilinos Integration ==================== Cocoa uses Trilinos for distributed linear algebra and iterative solvers. This document describes the specific type aliases and usage patterns. Trilinos Type Aliases --------------------- Cocoa defines Trilinos types in ``core/types/LinearAlgebraTypes.hpp`` to ensure consistency with the Kokkos execution space: LinearAlgebraTypes ^^^^^^^^^^^^^^^^^^ .. code-block:: cpp namespace Cocoa::Types { struct LinearAlgebraTypes { // Node type wraps Kokkos execution space for Tpetra using ExecutionNode = Tpetra::KokkosCompat::KokkosDeviceWrapperNode; // Core Tpetra types parameterized by ordinal types and node using Map = Tpetra::Map; using Vector = Tpetra::Vector; using MultiVector = Tpetra::MultiVector; using Import = Tpetra::Import; using Export = Tpetra::Export; // Finite element matrix types using FECrsGraph = Tpetra::FECrsGraph; using FECrsMatrix = Tpetra::FECrsMatrix; // Operator interface for solvers using Operator = Tpetra::Operator; }; } SolverTypes ^^^^^^^^^^^ .. code-block:: cpp namespace Cocoa::Types { struct SolverTypes { // Belos template parameters using SC = Scalar; using MV = LinearAlgebraTypes::MultiVector; using OP = LinearAlgebraTypes::Operator; // Ifpack2 preconditioner using PreconditionerType = Ifpack2::Preconditioner< LinearAlgebraTypes::FECrsMatrix::scalar_type, LinearAlgebraTypes::FECrsMatrix::local_ordinal_type, LinearAlgebraTypes::FECrsMatrix::global_ordinal_type, LinearAlgebraTypes::FECrsMatrix::node_type>; // Belos solver types using LinearProblem = Belos::LinearProblem; using SolverManager = Belos::SolverManager; }; } GWCE Matrix Assembly -------------------- Cocoa uses ``Tpetra::FECrsMatrix`` for finite element assembly, which provides efficient element-wise accumulation. Graph Construction ^^^^^^^^^^^^^^^^^^ The sparsity pattern is built from mesh connectivity: .. code-block:: cpp // Create map (one-to-one in serial) auto map = Teuchos::rcp(new LinearAlgebraTypes::Map( num_nodes, 0, comm)); // Create FE graph with estimated entries per row auto graph = Teuchos::rcp(new LinearAlgebraTypes::FECrsGraph( map, max_entries_per_row)); // Insert connectivity for each element for (int elem = 0; elem < num_elements; ++elem) { auto nodes = get_element_nodes(elem); for (int i = 0; i < 3; ++i) { graph->insertGlobalIndices(nodes[i], nodes); } } graph->fillComplete(); Matrix Assembly Pattern ^^^^^^^^^^^^^^^^^^^^^^^ From ``GwceMatrixAssemblyConsistentKernels.hpp``: .. code-block:: cpp // Begin assembly phase matrix->beginAssembly(); // Get local matrix for device access auto local_matrix = matrix->getLocalMatrixDevice(); Kokkos::parallel_for("GWCE Matrix Assembly", Types::DefaultRangePolicy(0, num_elements), KOKKOS_LAMBDA(const int elem_idx) { // Compute elemental matrix... const auto vals = compute_stiffness_values(/* ... */); // Insert using sorted column indices local_matrix.sumIntoValues( row_node, sorted_col_nodes.data(), 3, vals.data(), true, // sorted columns true); // atomic }); // End assembly phase matrix->endAssembly(); Boundary Condition Application ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Dirichlet BCs are applied by modifying matrix rows: .. code-block:: cpp // For each BC node Kokkos::parallel_for("Zero Dirichlet Rows", Policy(0, num_bc_nodes), KOKKOS_LAMBDA(int bc_idx) { auto row_view = local_matrix.row(bc_node); for (int j = 0; j < row_view.length; ++j) { if (row_view.colidx(j) == bc_node) { row_view.value(j) = elevation_penalty; // Diagonal } else { row_view.value(j) = 0.0; // Off-diagonal } } }); Linear Solver Configuration --------------------------- Cocoa solves the GWCE system with a Jacobi-preconditioned Conjugate Gradient solver, configured in ``ConjugateGradientSolver.hpp``. The default implementation is the Tpetra-native pipelined CG (``CgSolverType::TpetraCgPipeline``). Two other implementations are selectable via ``CgSolverType``: ``TpetraSingleReduce`` (Tpetra-native single-reduce CG) and ``BelosCG`` (standard Belos CG). Solver Configuration ^^^^^^^^^^^^^^^^^^^^ The ``SolverConfig`` struct controls solver behavior: .. code-block:: cpp struct SolverConfig { CgSolverType solver_type{CgSolverType::TpetraCgPipeline}; // CG implementation double tolerance{1e-5}; // Convergence tolerance int max_iterations{50}; // Maximum CG iterations int verbosity{0}; // 0=silent, 1=summary, 2=detailed ResidualScaling residual_scaling{ResidualScaling::NormOfInitialResidual}; // Convergence scaling }; Residual Scaling Options ^^^^^^^^^^^^^^^^^^^^^^^^ The ``ResidualScaling`` enum controls how residual norms are scaled for convergence testing: .. code-block:: cpp enum class ResidualScaling { NormOfInitialResidual, // ||r|| / ||r_0|| < tol (default) NormOfPrecInitRes, // ||M^{-1}r|| / ||M^{-1}r_0|| < tol None // ||r|| < tol (absolute) }; **NormOfInitialResidual** is the Cocoa default, providing iteration counts comparable to ADCIRC's ITPACK JCG solver at similar tolerance values. Solver Setup ^^^^^^^^^^^^ .. code-block:: cpp // Create linear problem auto problem = Teuchos::rcp(new SolverTypes::LinearProblem( matrix, solution, rhs)); // Configure solver parameters auto params = Teuchos::rcp(new Teuchos::ParameterList()); params->set("Maximum Iterations", config.max_iterations); // Default: 50 params->set("Convergence Tolerance", config.tolerance); // Default: 1e-5 params->set("Implicit Residual Scaling", "Norm of Initial Residual"); // Default scaling // Verbosity: base level always includes Errors and Warnings int verbosity = Belos::Errors | Belos::Warnings; if (config.verbosity >= 1) { verbosity |= Belos::FinalSummary; } if (config.verbosity >= 2) { verbosity |= Belos::IterationDetails; params->set("Output Frequency", 1); } params->set("Verbosity", verbosity); // Create CG solver Belos::SolverFactory factory; auto solver = factory.create("CG", params); Jacobi Preconditioner ^^^^^^^^^^^^^^^^^^^^^ From ``JacobiPreconditioner.hpp``: .. code-block:: cpp // Create Ifpack2 preconditioner via factory instance Ifpack2::Factory factory; auto prec = factory.create("RELAXATION", matrix); // Configure as Jacobi Teuchos::ParameterList prec_params; prec_params.set("relaxation: type", "Jacobi"); prec_params.set("relaxation: sweeps", 1); prec_params.set("relaxation: damping factor", 1.0); prec->setParameters(prec_params); // Initialize sparsity pattern prec->initialize(); // Compute values (must be called when matrix changes) prec->compute(); // Set on linear problem problem->setRightPrec(prec); Solving ^^^^^^^ .. code-block:: cpp // Set up the problem problem->setProblem(); // Solve auto result = solver->solve(); if (result != Belos::Converged) { // Handle non-convergence } Vector Operations ----------------- Tpetra vectors integrate with Kokkos views for device-side access: .. code-block:: cpp // Create vector auto vec = Teuchos::rcp(new LinearAlgebraTypes::Vector(map)); // Get device view for parallel operations auto view = vec->getLocalViewDevice(Tpetra::Access::ReadWrite); // Use in Kokkos kernel Kokkos::parallel_for("Set values", Types::DefaultRangePolicy(0, num_nodes), KOKKOS_LAMBDA(int i) { view(i, 0) = initial_value(i); }); Built-in Vector Operations ^^^^^^^^^^^^^^^^^^^^^^^^^^ Tpetra provides efficient built-in operations that should be preferred over custom Kokkos kernels when applicable. These handle MPI communication automatically and are optimized for the target architecture. **Dot Product:** .. code-block:: cpp // Compute sum of squares (for RMS calculations) const auto sum_sq = vec->dot(*vec); // = sum(vec[i]^2) // Elevation penalty = RMS of diagonal const auto ep = std::sqrt(sum_sq / num_nodes); **Reciprocal:** .. code-block:: cpp // Element-wise reciprocal: result[i] = 1/input[i] result->reciprocal(*input); **Element-wise Multiply:** .. code-block:: cpp // result = scalarThis * result + scalarAB * A * B (element-wise) result->elementWiseMultiply(scalarAB, A, B, scalarThis); // Example: result = A * B result->elementWiseMultiply(1.0, A, B, 0.0); **Lumped Diagonal Solve Pattern:** The lumped GWCE solver uses these built-ins for efficient diagonal solve: .. code-block:: cpp // Solve: solution = rhs / diagonal // Using two Tpetra operations instead of custom kernel: solution->reciprocal(*diagonal); // solution = 1/diagonal solution->elementWiseMultiply(1.0, *rhs, *solution, 0.0); // solution = rhs * solution This pattern: - Eliminates custom kernel boilerplate - Handles edge cases (e.g., division safety) consistently - Enables future MPI parallelism without code changes **Norms:** .. code-block:: cpp // L2 norm const auto l2_norm = vec->norm2(); // L1 norm (sum of absolute values) const auto l1_norm = vec->norm1(); // Infinity norm (max absolute value) const auto inf_norm = vec->normInf(); Elevation Penalty Computation ----------------------------- The elevation penalty (EP) maintains numerical conditioning for BC rows: .. code-block:: cpp Types::Scalar compute_elevation_penalty( const Teuchos::RCP& matrix, int num_nodes) { auto local_matrix = matrix->getLocalMatrixDevice(); Types::Scalar sum_sq = 0.0; Kokkos::parallel_reduce("Compute diagonal RMS", Types::DefaultRangePolicy(0, num_nodes), KOKKOS_LAMBDA(int row, Types::Scalar& local_sum) { auto row_view = local_matrix.row(row); for (int j = 0; j < row_view.length; ++j) { if (row_view.colidx(j) == row) { local_sum += row_view.value(j) * row_view.value(j); break; } } }, sum_sq); return std::sqrt(sum_sq / num_nodes); } Implementation Files -------------------- - ``core/types/LinearAlgebraTypes.hpp``: Type aliases - ``numeric/continuity/GwceSolver.hpp``: Main GWCE solver class - ``numeric/continuity/consistent/ConjugateGradientSolver.hpp``: Belos CG wrapper - ``numeric/continuity/consistent/JacobiPreconditioner.hpp``: Ifpack2 Jacobi wrapper - ``numeric/continuity/consistent/GwceMatrixAssemblyConsistentKernels.hpp``: Matrix assembly kernels