preCICE on Large HPC Cluster with Multiple Network Names

smjo1021 · February 21, 2026, 12:05am

I am using preCICE on a large HPC cluster (on the order of 10^3 nodes). The nodes have multiple network names, which might need to be fed into the precice-config.xml file to enable communication across more than one node.

I searched some previous posts on this forum, and apparently the suggested solution so far is to get the network name (e.g., via ip link) and feed it to the config file manually. I think this still works fine for a small cluster. A very large cluster like the one I am working with, however, has multiple network names and uses a job scheduler such as Slurm, so it is almost impossible to know a priori which network name will be assigned to my job. This creates a huge bottleneck for my simulation setup with preCICE on the cluster. I wonder if there is any update on this aspect!
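For illustration, the manual step described above (writing a network name into the config file) could be scripted so that it runs at job start. The following is only a minimal sketch, assuming the network name is set through a network attribute on the m2n:sockets tag of precice-config.xml. The helper name set_m2n_network is hypothetical, and the file is patched as plain text rather than parsed as XML.

```python
import re


def set_m2n_network(config_text, network):
    """Return config_text with network="<network>" set on every <m2n:sockets ...> tag.

    Assumption: the network name is configured via a `network` attribute
    on the m2n:sockets tag. Other tags are left untouched.
    """
    def patch(match):
        tag = match.group(0)
        if re.search(r'\snetwork="[^"]*"', tag):
            # Attribute already present: overwrite its value.
            return re.sub(r'\snetwork="[^"]*"',
                          lambda _m: f' network="{network}"', tag)
        # Attribute missing: add it right after the tag name.
        return tag.replace("<m2n:sockets", f'<m2n:sockets network="{network}"', 1)

    return re.sub(r"<m2n:sockets\b[^>]*>", patch, config_text)


if __name__ == "__main__":
    # Hypothetical usage: rewrite the config in place before starting the solvers.
    path = "precice-config.xml"
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(set_m2n_network(text, "ib0"))  # "ib0" is only a placeholder name
```

This only removes the manual editing. Which interface name is the right one for a given job still has to be determined on the allocated nodes.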

Makis · February 22, 2026, 7:49am

Hi @smjo1021,

is this a homogeneous cluster, or are you trying to use nodes with different configurations (such as different CPUs)? Are the different networks for different partitions? The typical use case is running on multiple nodes of the same architecture.

The main idea for the network attribute is that the default network is typically one only accessible within one node/island, while there is typically a larger common network connecting nodes.

In any case, if the setup is different, it would be interesting to know more about it to find out how we could support it.

smjo1021 · February 22, 2026, 4:23pm

Hi @Makis, thank you for your reply.

I wonder what kind of information I can provide to help you better understand my difficulty. For example, are there any terminal commands that I could run and post the output of in this conversation? I roughly understand what your comments are about, but I would like to know what information is needed.

Thank you again.

Makis · February 22, 2026, 6:01pm

Are you using a cluster with public documentation? The networks should be documented there, and a link could help.

Please forgive my question (I have no clue who I am talking to), but have you already talked about this with your cluster admin? The issue might primarily be system-specific, and then maybe something we could address.

smjo1021 · February 22, 2026, 7:32pm

I have not talked about this issue with the cluster admin yet.

I am using this cluster: RCAC - Knowledge Base: Anvil User Guide: Anvil User Guide

There are some notes on ‘Network’, but they are about the speed of the network communication (OOO Gbs speed in communication, etc.). I am not sure exactly what information is needed from the documentation, nor exactly what I should ask the cluster admin.

The preCICE documentation says that users should provide the network name for the m2n socket communication when communicating across multiple nodes, but I have not explained this to the cluster admin or asked about it. This is mainly because I have run into a similar issue whenever I tried to use preCICE on a large-scale HPC cluster (not only this one), meaning that I had to find the proper network name by trial and error. This time, however, even the trial-and-error approach (switching the network name one by one until my Slurm job finally runs) does not seem to work, unfortunately.

If there is a question you would suggest I ask the cluster admin, I can ask it, but at this point I am not sure what question I need to ask.

ajaust · February 23, 2026, 9:22pm

This sounds like an interesting case, but I also expect that this should be solvable. Other parallel applications seem to work on the cluster.

It would be interesting to get some more information to get a better understanding of what is not working as intended.

Do you have a sample preCICE configuration and a SLURM script that do not work that you could share?
On the compute nodes, you could run ip link show or netstat -i to get an overview of the available network interfaces. Could you share the output of these commands?
What is the actual error that you run into? Does the simulation not start, or is it simply very slow?
Could you share the preCICE log of the participants that have issues?
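As a rough complement to ip link show, the interface names can also be collected from inside the job with a few lines of Python, for example launched once per node from the SLURM script. This is only a sketch: pick_interface and its prefix list ("ib", "eth", "en") are assumptions for illustration, and the actual interface names are system-specific.

```python
import socket


def list_interfaces():
    """Return the names of the network interfaces visible on this node."""
    # socket.if_nameindex() returns (index, name) pairs on Unix-like systems.
    return [name for _index, name in socket.if_nameindex()]


def pick_interface(names, preferred_prefixes=("ib", "eth", "en")):
    """Return the first interface whose name starts with a preferred prefix.

    The prefix order is an illustrative assumption, not a rule that
    holds on every cluster. Returns None if nothing matches.
    """
    for prefix in preferred_prefixes:
        for name in sorted(names):
            if name.startswith(prefix):
                return name
    return None


if __name__ == "__main__":
    names = list_interfaces()
    # Print the hostname as well, so the output of several nodes can be compared.
    print(socket.gethostname(), names, pick_interface(names))
```

If the printed names differ between the nodes of one allocation, that would be useful information to include in the reply and in a question to the cluster admin.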
