Can preCICE be used with GPUs or other accelerator cards?

My target system features GPUs or other accelerator cards.

Can preCICE use them to accelerate the simulation?

preCICE itself does not use GPUs or other accelerator cards and there is a good reason for this.

The primary task of preCICE is to couple solvers (programs) to form a complete simulation. This is mainly organisational work, involving a lot of communication. The runtime portion of preCICE should be minuscule in comparison to that of the coupled solvers. Thus, the vast majority of the compute resources should be used by the solvers.

That said, you may run into a situation where one of the coupled solvers is not designed to use GPUs. In this case, preCICE could take advantage of GPUs that are available but otherwise idle.

The only compute-heavy tasks that could take advantage of this hardware are the RBF mappings. These are implemented using PETSc, which is currently heavily investing in GPU support. Thus, preCICE may provide GPU-accelerated RBF mapping in the future.
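To illustrate why RBF mappings are the compute-heavy part, here is a minimal pure-Python sketch of a global RBF interpolation in 1D. It is not how preCICE implements the mapping. The Gaussian basis function, the shape parameter, the sample points, and all function names are assumptions chosen for illustration only.

```python
import math

def gaussian(r, shape=10.0):
    """Gaussian radial basis function; `shape` is an illustrative parameter."""
    return math.exp(-(shape * r) ** 2)

def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting, O(n^3)."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def rbf_map(src_pts, src_vals, dst_pts):
    """Map values from source points to target points with a global RBF interpolant."""
    # Dense n x n system: every source point interacts with every other one.
    system = [[gaussian(abs(xi - xj)) for xj in src_pts] for xi in src_pts]
    weights = solve_dense(system, src_vals)
    # Evaluation is a dense (targets x sources) matrix-vector product.
    return [sum(w * gaussian(abs(x - xj)) for w, xj in zip(weights, src_pts))
            for x in dst_pts]

# Hypothetical 1D "meshes": 11 source points, 10 target points in between.
src = [i / 10 for i in range(11)]
vals = [math.sin(2 * math.pi * x) for x in src]
dst = [0.05 + i / 10 for i in range(10)]
mapped = rbf_map(src, vals, dst)
```

The dense solve scales cubically with the number of interface points, and the evaluation is a dense matrix-vector product. Dense linear algebra of this kind is the sort of work that accelerators tend to handle well, which is why the mapping is the natural candidate for GPU support.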

Important detail: as explained above, preCICE itself currently does not use or support GPUs. However, preCICE has already been used successfully to couple solvers that use one or more GPUs. An example is the coupling of the ray tracing engine NVIDIA OptiX with THerMoS. Here, the compute-heavy part of the ray tracing takes place on the GPU. The results are then transferred to the CPU, which is cheap compared to the ray tracing itself. The CPU then takes care of the coupling to THerMoS via preCICE.

Moving the coupling data from the GPU to the CPU is currently necessary. This situation might improve with a new feature that would give the GPU direct access to the data buffers of preCICE (see issue #1146 in precice/precice on GitHub, "Allow direct access to data buffers for GPGPU support").

Another work that features a GPU-enabled solver coupled via preCICE is the dissertation of Marta Camps Santasmasas (University of Manchester), "Hybrid GPU / CPU Navier-Stokes lattice Boltzmann method for urban wind flow".

There have been some recent updates to the answer Frederic originally gave.

How minuscule the workload of preCICE is in comparison to the coupled solvers depends heavily on the size of the coupling interface relative to the global DoFs used by the solver. For a very basic interface coupling, the ratio of interface DoFs to solver DoFs might be small. The larger this ratio becomes (e.g., refined meshes at the coupling interface, or volume coupling), the more significant the computations carried out by preCICE become. Also, GPU resources available on your compute platform might enable the solver to handle much larger problem sizes. Then even small ratios of interface DoFs to solver DoFs might become a bottleneck for the coupled simulation if preCICE is restricted to CPUs only.
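As a rough illustration of this ratio, the following sketch compares surface and volume coupling on a hypothetical structured grid. The cube-shaped grid, the assumption of one DoF per node, and the function name are made up for this example and do not come from preCICE.

```python
def interface_dofs(n, coupling="surface"):
    """Number of interface DoFs on a hypothetical n x n x n node grid (one DoF per node)."""
    if coupling == "surface":
        return n ** 2  # only one face of the cube is coupled
    if coupling == "volume":
        return n ** 3  # every node takes part in the coupling
    raise ValueError(f"unknown coupling type: {coupling}")

def interface_ratio(n, coupling="surface"):
    """Ratio of interface DoFs to solver DoFs on the same grid."""
    return interface_dofs(n, coupling) / n ** 3

for n in (50, 100, 200):
    print(n, interface_dofs(n), interface_ratio(n), interface_ratio(n, "volume"))
```

In this toy setting the surface-coupling ratio shrinks as 1/n, yet the absolute number of interface DoFs still grows as n squared. With volume coupling the ratio stays at 1 regardless of resolution.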

For this reason, we actively developed and integrated support for RBF data mappings on the GPU in the course of Timo’s master thesis, "Efficient application of accelerator cards for the coupling library preCICE" (available on OPUS). The implementation is available with the (upcoming) preCICE version 3.0. See also the corresponding pull request #1581 in precice/precice on GitHub, "Added Ginkgo + GPU support for RBF data mapping" by Timo-Schrader.

The thesis mentioned above also gives several examples with preCICE and/or the solvers running on different hardware in a coupled simulation. With this implementation, the RBF data interpolation can now be performed using OpenMP (multi-threaded CPU), CUDA (for Nvidia GPUs), or HIP (for AMD and Nvidia GPUs).
