I’m extremely delighted and grateful for the guidance I’ve received here, in https://precice.discourse.group/t/does-the-executable-binary-file-of-calculix-support-running-on-a-slurm-cluster-via-mpi/2336/6, and on the CCX forum: https://calculix.discourse.group/t/can-calculix-run-across-multiple-nodes/1316/6. Thank you all for your help. Below, I’ve summarized my situation and outlined my next steps.
Goal
I aim to transition from running my simulation on a 6-core CPU on my PC to an HPC system with 128-core nodes, targeting at least a 10x speed-up (using 20+ times more cores). However, so far, I’ve only achieved a 1x to 4x speed-up.
Case Details
I’m running a steady-state Conjugate Heat Transfer (CHT) case with radiation, involving one fluid and one solid participant. The coupling is handled using parallel-implicit mode with the same preCICE configuration as in the heat-exchanger tutorial.
For radiation modeling, I have two options (the coupling-scheme settings that control these iteration counts are sketched after this list):
- fvDOM in OpenFOAM: after 4–5 time steps, the number of coupling iterations per time step drops to 1 (almost like explicit coupling).
- Cavity radiation in CCX: requires ~10 coupling iterations per time step but provides more reliable results.
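For context, in an implicit coupling scheme the number of iterations per time window is driven by the convergence measures and capped by max-iterations. A minimal sketch of the relevant part of precice-config.xml, with placeholder data and mesh names (not necessarily those of my actual configuration):

```xml
<coupling-scheme:parallel-implicit>
  <participants first="Fluid" second="Solid" />
  <!-- Iterate each time window until the convergence measures are met,
       or stop after max-iterations. -->
  <max-iterations value="30" />
  <relative-convergence-measure limit="1e-5" data="Temperature" mesh="Solid-Mesh" />
  <!-- time-window-size, data exchanges, and acceleration omitted for brevity -->
</coupling-scheme:parallel-implicit>
```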
Performance on HPC
- On my PC (6 cores), the case runs successfully.
- On HPC (128-core nodes), I expected a 10x speed-up when using 1–2 nodes (20–40x more cores).
- However, results show:
  - fvDOM in OpenFOAM: ~4x speed-up.
  - Cavity radiation in CCX: <2x speed-up, despite a 20x increase in core count.
From the ExecutionTime entries in the OpenFOAM log, I see that OpenFOAM itself scales well (tested with up to 100 cores). However, the overall simulation time does not decrease significantly, which points to CCX or the coupling as the bottleneck.
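For reference, this is how I read off the timing, assuming the standard OpenFOAM log output (the log file name is just an example):

```bash
# Last reported ExecutionTime (CPU time) and ClockTime (wall-clock time) of a run.
grep "ExecutionTime" log.buoyantSimpleFoam | tail -n 1
```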
My assumptions
- If CCX is correctly configured (with Spooles, Pardiso, or PaStiX and a proper Slurm script), it should scale reasonably well up to ~100 cores on a single node using OpenMP, rather than just 4–8 cores (see the thread-count sketch after this list).
- If this is true, the issue could be:
  - a bad Slurm script, or
  - the need to switch solvers (from Spooles to Pardiso/PaStiX).
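Regarding the thread count: as far as I understand, the multithreaded CCX solvers read it from environment variables, so the Slurm script has to export them explicitly. A minimal sketch, assuming a CCX build with OpenMP support; the input deck and participant names are placeholders:

```bash
# Thread counts for CCX: as far as I know, CCX_NPROC_EQUATION_SOLVER is read by
# the multithreaded equation solvers and OMP_NUM_THREADS by the OpenMP parts.
export CCX_NPROC_EQUATION_SOLVER=16
export OMP_NUM_THREADS=16

# Input deck ("solid.inp") and participant name are placeholders.
ccx_preCICE -i solid -precice-participant Solid
```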
Next Steps
- Enable deeper profiling by adding lines to precice-config.xml, as @fsimonis suggested, to track communication and CCX execution time (see the profiling sketch after this list).
- Fix the Slurm script: run CCX on one node and OpenFOAM on another to avoid synchronization issues (and, hopefully, the problem where the simulation gets stuck with the participants waiting for each other). A possible layout is sketched after this list.
- Install PaStiX (Spack installation available; see the sketch after this list).
- Install Pardiso.
- Test different CPU allocations and solvers (Spooles, Pardiso, PaStiX) on the HPC and compare performance results.
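For the profiling step, my understanding is that preCICE v3 can record detailed events directly from the configuration; a minimal sketch (to be checked against the options @fsimonis pointed to), whose output can, as far as I understand, be post-processed with the precice-profiling tool:

```xml
<precice-configuration>
  <!-- Record detailed profiling events on all ranks. -->
  <profiling mode="all" />
  <!-- ... rest of the existing configuration ... -->
</precice-configuration>
```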
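For the Slurm script, the rough layout I have in mind is a single job that places the two participants on separate nodes and starts them side by side. This is only a sketch under several assumptions: solver name, case directories, input deck, core counts, and walltime are placeholders, and the OpenFOAM task count has to match the domain decomposition:

```bash
#!/bin/bash
#SBATCH --job-name=cht-coupled
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
#SBATCH --time=24:00:00

# Participant 1: OpenFOAM, MPI-parallel on one node
# (task count must match the decomposePar decomposition).
(
  cd fluid-openfoam
  srun --nodes=1 --ntasks=100 buoyantSimpleFoam -parallel > log.buoyantSimpleFoam 2>&1
) &

# Participant 2: CCX, OpenMP-parallel on the other node.
(
  cd solid-calculix
  export CCX_NPROC_EQUATION_SOLVER=64
  export OMP_NUM_THREADS=64
  srun --nodes=1 --ntasks=1 --cpus-per-task=64 \
       ccx_preCICE -i solid -precice-participant Solid > log.ccx 2>&1
) &

# Depending on the Slurm version, concurrent job steps may need extra srun
# options (e.g. --exact) so that the two steps do not block each other.

# Wait for both participants to finish.
wait
```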
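For the PaStiX installation, my current plan is simply the following (assuming the Spack package is named pastix; I still need to check whether CCX requires its own adapted PaStiX build instead):

```bash
# Install and load PaStiX via Spack (package name assumed).
spack install pastix
spack load pastix
```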
I have limited experience with HPC installations, Slurm scripts, and hostfiles, and I also have other responsibilities, so progress might be slow. However, I will share my findings here as I move forward.
Meanwhile, if anyone with experience in CCX on HPC has additional insight to share, I would greatly appreciate it.
Kind regards,
Umut