Case runs on the cluster without errors, but hangs in the first step for a long time

Hi everyone, I have run into a problem. I am simulating an elastic structure entering water on a cluster: OpenFOAM handles the fluid, CalculiX handles the solid, and preCICE handles the coupling. Is it normal for the simulation to be stuck in the first step for a long time, even several days? Note that I requested only one node, so there should be no cross-node communication issues. No precice-run directory has been generated yet. I submit the jobs with Slurm. These are my fluid and solid interface meshes.



This is my preCICE configuration file, which looks fine to me.
precice-config.xml (2.5 KB)

The Slurm job script and the resulting output file are attached below.
slurm.txt (417 Bytes)
Job_Results1950.out (23.6 KB)
Some of the cases do produce results after a few days, but the initialization is very slow. I would like to know whether this is normal and whether there is a problem with how I submit jobs via Slurm, since I know very little about clusters. Can anyone give me some ideas for troubleshooting? My experience with preCICE is still limited. I would appreciate any help.

Hi,

You are using PETSc-based compact RBF mappings with a relatively big case. This can quickly get out of hand in terms of memory and compute time. I recommend trying a nearest-neighbour mapping first to see whether the simulation starts as expected.
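For example, the mapping tags in your precice-config.xml could be swapped for nearest-neighbour ones like this (a minimal sketch assuming preCICE v2 syntax; the mesh names below are placeholders, keep the ones from your own configuration):

```xml
<!-- Placeholder mesh names: use the ones from your precice-config.xml -->
<mapping:nearest-neighbor direction="read"  from="Solid-Mesh" to="Fluid-Mesh"
                          constraint="consistent" />
<mapping:nearest-neighbor direction="write" from="Fluid-Mesh" to="Solid-Mesh"
                          constraint="conservative" />
```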

To better understand the behaviour of the various mappings, please have a look at Section 3.2.3 of the second preCICE reference paper. Figure 11 in particular should be interesting for you.

https://open-research-europe.ec.europa.eu/articles/2-51/v2

If you can isolate the hang to the computation of the RBF mapping, then you could try to reduce the support radius of your basis functions to make the RBF system matrices sparser.
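A sparser setup could look like this (again only a sketch, with placeholder mesh names and a made-up radius; the support radius should still cover a handful of vertices of the other mesh):

```xml
<!-- Compact RBF with a reduced support radius (value is a placeholder) -->
<mapping:rbf-compact-polynomial-c6 direction="read" from="Solid-Mesh" to="Fluid-Mesh"
                                   constraint="consistent" support-radius="0.01" />
```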

If the repartitioning phase is the problem, then you could try enabling two-level initialization.
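In preCICE v2 this is switched on via an attribute on the m2n tag, roughly like this (a sketch with assumed participant names; please check the communication configuration reference of your exact version):

```xml
<!-- Enable two-level initialization on the m2n exchange (participant names assumed) -->
<m2n:sockets from="Fluid" to="Solid" use-two-level-initialization="true" />
```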

In all cases, upgrading preCICE and the adapters to the latest releases will make a significant difference: we have improved the runtime of many of these components. You could even try the new partition-of-unity RBF mapping.
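With a recent preCICE v3 release, such a mapping could be configured roughly like this (a sketch only; tag and basis-function names assumed from the v3 mapping configuration, mesh names and radius are placeholders):

```xml
<!-- Partition-of-unity RBF mapping (preCICE v3; names and values are placeholders) -->
<mapping:rbf-pum-direct direction="read" from="Solid-Mesh" to="Fluid-Mesh"
                        constraint="consistent">
  <basis-function:compact-polynomial-c6 support-radius="0.01" />
</mapping:rbf-pum-direct>
```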

Best
Frédéric

Dear developer,
Thank you for your prompt response. Based on your suggestion, I will try different mapping schemes. Specifically, the OpenFOAM output remains stuck for a long time at

---[precice] Setting up master communication to coupling partner/s

and the CalculiX output remains stuck at

---[precice] I am participant “solid”
Set ID Found

What’s going on at this stage?
The cluster runs CentOS 7. I would like to know whether the latest preCICE version supported there is still 2.3.0.

Best Regards
Sun

It’s difficult to say what exactly is going on, as both solver outputs are interleaved.
My best guess is that the CalculiX adapter hangs while OpenFOAM waits to establish the communication.
Could you please redirect the output of each solver to a separate file in your run script? That would help a lot.

It may also be that the CalculiX version and the adapter version are not compatible. Which versions are you using?

preCICE doesn’t support CentOS as such. It supports the latest two LTS versions of Ubuntu. If the compilers and dependencies are present on a system, then preCICE should work.
But given that CentOS 7 has reached end of maintenance this June, there won’t be any package updates or similar for your platform.
If you want to upgrade, using Spack is probably your best bet.

Best,
Frédéric

Dear developer,

I have run some 2D cases on this cluster before, so there should be no compatibility issues with the installed software. I am using CalculiX 2.20 and the corresponding version of the calculix-adapter. My initial guess is that the problem lies in the coupling mesh and data-mapping settings, which I am still actively tweaking.
Thank you for your help.

Sun
