Is there any experience with how the coupling behavior (the number of coupling iterations needed per time step) depends on the platform (network, CPU, etc.) and the number of threads used? I ran the same simulation on two different machines. The results are very similar, but the number of coupling iterations needed and the residuals in preCICE differ a lot.
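To put a number on the difference, the per-time-window iteration counts from the two runs can be compared directly. The following is only a minimal sketch: it assumes each run produced a whitespace-separated iterations log with a header row containing an `Iterations` column (the exact file name and column names depend on the preCICE version, so adjust `column` as needed). The function names are hypothetical helpers, not part of preCICE.

```python
from statistics import mean


def iterations_per_window(log_text, column="Iterations"):
    """Extract one integer column from a whitespace-separated table.

    Assumption: the first non-empty line is a header naming the columns,
    and `column` is one of those names.
    """
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    header = lines[0].split()
    idx = header.index(column)  # raises ValueError if the column is missing
    return [int(row.split()[idx]) for row in lines[1:]]


def compare_runs(log_a, log_b, column="Iterations"):
    """Compare iteration counts of two runs over their common time windows."""
    a = iterations_per_window(log_a, column)
    b = iterations_per_window(log_b, column)
    n = min(len(a), len(b))  # only compare windows present in both logs
    return {
        "mean_a": mean(a[:n]),
        "mean_b": mean(b[:n]),
        # 1-based indices of time windows where the counts differ
        "differing_windows": [i + 1 for i in range(n) if a[i] != b[i]],
    }
```

Usage would be along the lines of `compare_runs(open(path_a).read(), open(path_b).read())`, where `path_a` and `path_b` point to the iterations logs written on the two machines.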
On both machines, most of the software (including preCICE) was installed with Spack, so most packages should have been compiled with reasonable optimization flags.
I would appreciate a hint about which part of my software stack to blame or debug further. Could this behavior be caused by preCICE running on a different machine (distributed memory vs. shared memory), or is it more likely that my solver itself behaves differently on the two machines?
Thanks in advance!