Running CalculiX and OpenFOAM on HPC

Hi,
I split the topic after the python problem disappeared.

CalculiX on HPC

The CCX adapter itself is not designed to run in parallel with MPI. That said, it can still be launched on its own node via MPI, e.g. mpirun -n 1 --hostfile=X.

To my understanding, CCX is best run on the fat nodes of your cluster, using OpenMP threads to take advantage of the whole node.
If performance becomes a problem, you can switch to its PaStiX solver, which has optional CUDA support.
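As a sketch of the single-node OpenMP setup (the thread count of 48 and the case name "flap" are placeholders for your system):

```shell
# Hypothetical single-node CCX launch: one process, many OpenMP threads.
# Match the thread count to the core count of the fat node.
export OMP_NUM_THREADS=48

# Then launch the adapter-enabled CCX binary (commented out here):
# ccx_preCICE -i flap -precice-participant Solid
```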

The CalculiX Discourse is probably your best source of information here. @mattfrei, is there any information to be added?

CCX and Slurm session partitioning

and

Create a hostfile with the first node for CalculiX and a hostfile with the remaining nodes for OpenFOAM. In the documentation we use something like this:

head -n 1 hosts.ompi > hosts.ccx
tail -n +2 hosts.ompi > hosts.of

Then start CCX with a single rank using the hosts.ccx hostfile, and OpenFOAM with hosts.of.
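Put together, the split looks like this (node names, solver invocations, and rank counts are placeholders, not your actual setup):

```shell
# Hedged sketch: split an Open MPI hostfile between the two participants.
printf 'fatnode01\nnode01\nnode02\nnode03\n' > hosts.ompi   # example allocation

head -n 1 hosts.ompi > hosts.ccx   # first (fat) node for CalculiX
tail -n +2 hosts.ompi > hosts.of   # remaining nodes for OpenFOAM

# Launch both participants concurrently from the job script, e.g.:
#   mpirun -n 1  --hostfile hosts.ccx ccx_preCICE -i flap -precice-participant Solid &
#   mpirun -n 96 --hostfile hosts.of  pimpleFoam -parallel &
#   wait
```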

This is pretty much what you are doing right now.

This may be different on heterogeneous clusters, and we have no experience with those so far.

You may need to resort to job farming: queue one single-node job on the partition with the fat nodes and another n-node job on the partition with the normal nodes. These jobs need to be launched together in order not to waste resources.
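One way to launch both together is a Slurm heterogeneous job. A minimal sketch, assuming Slurm supports hetjobs on your system; the partition names, node counts, and solver commands are all assumptions:

```shell
# Sketch of a Slurm heterogeneous job ("job farming"): one fat node for CCX
# plus several normal nodes for OpenFOAM, allocated and started together.
cat > coupled.sbatch <<'EOF'
#!/bin/bash
#SBATCH --partition=fat --nodes=1
#SBATCH hetjob
#SBATCH --partition=normal --nodes=4

srun --het-group=0 ccx_preCICE -i flap -precice-participant Solid &
srun --het-group=1 pimpleFoam -parallel &
wait
EOF

bash -n coupled.sbatch   # sanity-check the script; submit with: sbatch coupled.sbatch
```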

The LRZ, home of SuperMUC-NG, has some documentation on this subject.

In any case, it’s probably best to get some hands-on time with your system administrator to figure this out.

Communication cost in coupled simulations

It is always tricky to make claims about communication cost when coupling simulations using preCICE.

The communication cost in terms of pure transfer is generally not an issue.
The observed communication cost includes various waiting times and is heavily influenced by:

  1. the coupling scheme used (especially serial schemes)
  2. the number of ranks per participant
  3. the load balance of your participants, including the runtime of the coupled solvers and of the data mapping schemes in preCICE.

This is why we developed profiling tools that give you a visual overview of all ranks and participants at the same time. This visual representation of the wait times is invaluable for localizing the problem at hand.

I recommend trimming your simulation with <max-time-windows ... /> and enabling <profiling mode="all" /> to get the full picture.
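For illustration, a minimal and deliberately incomplete configuration sketch; the coupling-scheme type and the value of 20 time windows are assumptions, not a recommendation:

```xml
<precice-configuration>
  <!-- record events on all ranks, not only the primary ones -->
  <profiling mode="all" />

  <!-- ... data, meshes, participants ... -->

  <coupling-scheme:serial-implicit>
    <!-- trim the run to a few time windows for profiling -->
    <max-time-windows value="20" />
    <!-- ... remaining coupling-scheme settings ... -->
  </coupling-scheme:serial-implicit>
</precice-configuration>
```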

Hope that helps!
