Parallel run with preCICE coupling OpenFOAM and CalculiX

Is there a tutorial dedicated to running FSI with preCICE using MPI? I couldn’t find one on the web. Currently, my model works with 1 CPU each for OpenFOAM and CalculiX, but the “preCICE: Mapping” step takes longer and longer to finish. I hope to use MPI to improve the efficiency.

I tried the following setting in the XML configuration file, but both codes just wait without completing the handshake.
Thank you!

<m2n:mpi
       exchange-directory="."
       acceptor="Solid-Solver"
       connector="Fluid-Solver"
       enforce-gather-scatter="false"
       use-two-level-initialization="false"/>

Hi,

The m2n:mpi uses the MPI back-end to connect participants; we recommend sticking to sockets until the communication becomes an issue.
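
For reference, a minimal sketch of the sockets variant, assuming the same participant names and attribute layout as the m2n:mpi snippet above. The acceptor/connector roles and the exchange directory are assumptions and need to match your case:

```xml
<!-- Sketch only: same participants as in the m2n:mpi snippet, but using the sockets back-end. -->
<m2n:sockets
       exchange-directory="."
       acceptor="Solid-Solver"
       connector="Fluid-Solver"/>
```

Both participants have to resolve exchange-directory to the same folder. If each solver is started from its own case folder, "." points to two different places and the participants keep waiting for each other.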

Many OpenFOAM tutorials can be run using ./run.sh -parallel.

The CalculiX adapter doesn’t support running using MPI.

Got it. Thank you! So to run the FSI analysis in parallel, I first need to decompose the fluid mesh in OpenFOAM. After this step, I basically don’t need to change the configuration files; I just add -parallel on the command line. Is that correct?

The second issue is that the time spent on mapping “Force” becomes unbearable. The pattern is that every 8e-5 s the analysis gets stuck for several minutes at the step “preCICE: Mapping “Force” for t=0.005050000000000001 from “Fluid-Mesh-Faces” to “Solid-Mesh””. Then it quickly passes another 8e-5 s with a very nice convergence rate. Do you know why this is happening? Since the time step is so tiny, I would think preCICE doesn’t need to do a full nearest-neighbor search this often. Something is accumulating over time, I guess.

precice-config.xml (2.3 KB)

The run.sh scripts of the OpenFOAM cases will also run decomposePar. But you would need to specify how you want to decompose the cases in the decomposeParDict.

CalculiX itself only supports shared memory parallelization (OpenMP), which is independent of the coupling.

This sounds strange; it shouldn’t take that long. Looking at your config:

  • You are computing both mappings on the Fluid participant. Given that this is also the parallel one, this is good.
  • Nearest-neighbor mapping is the fastest.
  • You set <time-window-size value="1.e-5" />, meaning that you are coupling every 1e-5. Strange that the pattern appears every 8e-5.

How many MPI ranks are you using, and how many cores do you have? Are you running on your laptop or on a cluster? How is the memory consumption? I wonder if this is just a matter of computing resources / network buffer.

If you change:

-  <sink filter="%Severity% > debug"
+  <sink filter="%Severity% >= trace"

and you have preCICE built in Debug mode, you should be able to see where exactly the time is being spent.

Edit: Long shot, but another relevant aspect could be the IMVJ acceleration.

I am not running in parallel now. It is using one CPU on my desktop (shared memory). Memory consumption is at 80%. It does indeed exchange data every 1.e-5 s, but after 8 such exchanges it gets stuck for about 10 min (my estimate).

Could you try another acceleration method and report if you see the same behavior?

I am using 12 cores for OpenFOAM, but the bottleneck is still the “Mapping Force” step. If you look at “ExecutionTime=205.37 s”, this time can later grow to 600 s, although the individual components all converge fast. I am curious: what is the code doing while this message is shown: “preCICE: Mapping “Force” for t=0.0008900000000000001 from “Fluid-Mesh-Faces” to “Solid-Mesh””? I simply cannot understand why the mapping takes this long. The meshes are not complex or huge at all.

I will try another acceleration method. By the way, here is the output from running the code. Could you take a look and tell me if anything looks suspicious? Thanks.

log.txt (146.0 KB)

I turned off “Always build jacobian” and now it’s progressing fast. Hopefully, this won’t have detrimental effects on the solution.

I was able to launch the analysis. However, no matter what I try, the analysis seems to diverge (on the CalculiX side) at a certain amount of membrane deflection. I used an extremely small time increment (1.e-6 s) for CalculiX, OpenFOAM and preCICE. This led to 1.4 mm of deflection. Judging from the mesh, it looks like hourglassing? The membrane is very thin (0.5 mm) and very soft, and the membrane material has the same density as the fluid. I know this is going to be very hard due to the added-mass effect, but I am wondering if anyone has advice or suggestions to help solve this problem. Or is it mission impossible? I can provide my settings if needed. Thank you very much!

So, this answers where the time is being spent. It’s not the mapping, but the IMVJ rebuilding the Jacobian.
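
A sketch of where this switch sits in the configuration, assuming the attribute and tag names of the IQN-IMVJ entry in the preCICE XML reference. The data and mesh names are taken from the configuration discussed later in this thread, and the numeric values are placeholders rather than recommendations:

```xml
<!-- Sketch only: attribute and tag names assumed from the preCICE XML reference. -->
<acceleration:IQN-IMVJ always-build-jacobian="false">
  <data name="Displacement" mesh="Solid-Mesh"/>
  <data name="Force" mesh="Solid-Mesh"/>
  <!-- placeholder values, not tuned for this case -->
  <initial-relaxation value="0.1"/>
  <max-used-iterations value="100"/>
  <time-windows-reused value="15"/>
  <!-- remaining tags (restart mode, filter, preconditioner) stay as in your file -->
</acceleration:IQN-IMVJ>
```

With always-build-jacobian disabled, the Jacobian is no longer rebuilt in every time window, which is consistent with the speed-up reported above.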

I am curious: Why are you choosing IQN-IMVJ instead of IQN-ILS? The latter is definitely more widely used.

Also, FYI: You can restrict the output to one rank per participant. See the filter we use in our tutorials:
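
A sketch of such a filter, assuming the sink syntax found in the tutorial configurations; the format string is optional and only shown for illustration:

```xml
<!-- Sketch only: print messages above debug level, and only from rank 0 of each participant. -->
<log>
  <sink
    filter="%Severity% > debug and %Rank% = 0"
    format="---[precice] %ColorizedSeverity% %Message%"
    enabled="true" />
</log>
```

The `%Rank% = 0` condition is what restricts the output to one rank per participant.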

I have since changed to IQN-ILS. Attached is the latest file I am using. As I mentioned in my previous post, Calculix can’t pass 1.4 mm. The deformation is weird. Any suggestions? Thanks.

precice-config.xml (3.6 KB)

So, you are now using RBF mappings:

    <mapping:rbf direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative">
      <basis-function:compact-polynomial-c6 support-radius="2.e-3" />
    </mapping:rbf>
    <mapping:rbf direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent">
      <basis-function:compact-polynomial-c6 support-radius="2.e-3" />
    </mapping:rbf>

a parallel-implicit coupling scheme, with <time-window-size value="1.e-6" />, relative-convergence-measure limit="5e-3" for both displacement and forces, and IQN-ILS acceleration:

    <acceleration:IQN-ILS>
      <initial-relaxation value="0.0001"/>
      <max-used-iterations value="100"/>
      <time-windows-reused value="15"/>
      <data name="Displacement" mesh="Solid-Mesh"/>
      <data name="Force" mesh="Solid-Mesh"/>
      <filter limit="1e-2" type="QR2"/>
      <preconditioner type="residual-sum"/>
     </acceleration:IQN-ILS>

I understand that you use the same time step size for both solvers (1.e-6), i.e., no subcycling.

True that this is not an easy case.

Do the values look smooth and reasonable in the first time windows? You could try exporting the interface meshes and visualizing them in ParaView, to see that the mapping does what you want. Given the sharp edges near the walls, and the very coarse solid mesh near those regions, I could imagine that the RBF mapping needs some further tuning. Where exactly is the membrane fixed?
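
A minimal sketch of enabling the export, assuming the export:vtk tag inside a participant block; the directory name is a placeholder, and the same tag can be added to the Fluid-Solver participant:

```xml
<participant name="Solid-Solver">
  <!-- existing mesh, data and mapping tags stay unchanged -->
  <!-- Sketch only: writes the interface meshes and their coupling data for inspection in ParaView. -->
  <export:vtk directory="precice-exports" />
</participant>
```

Comparing the exported forces on Fluid-Mesh and Solid-Mesh side by side shows whether the conservative RBF mapping produces spikes near the clamped edge.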

In the IQN configuration, why are you using such a small initial-relaxation? What happens if you try with 0.1?

As for filter, you could try using the newer QR3 filter.
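
Put together, the two suggestions above would look roughly like this. It is a sketch based on the block you posted; only the initial relaxation and the filter type are changed, and the QR3 type is only available in preCICE versions that ship it:

```xml
<acceleration:IQN-ILS>
  <!-- larger initial under-relaxation, as suggested above -->
  <initial-relaxation value="0.1"/>
  <max-used-iterations value="100"/>
  <time-windows-reused value="15"/>
  <data name="Displacement" mesh="Solid-Mesh"/>
  <data name="Force" mesh="Solid-Mesh"/>
  <!-- limit carried over from the original block; it may need re-tuning for QR3 -->
  <filter limit="1e-2" type="QR3"/>
  <preconditioner type="residual-sum"/>
</acceleration:IQN-ILS>
```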

Thanks for the response.
The data exchange has been successful for many cycles. I looked at the deformation and stress, and they all looked normal to me. But CalculiX always acts up after a certain number of time steps (roughly corresponding to a maximum deflection of 1.4 mm). Before this, CalculiX always converged after 2 iterations. In this last time step, however, it couldn’t converge: because a fixed time increment was used (1.e-6 s), CalculiX could not make cutbacks and just kept iterating. Ultimately, it blew up.

The membrane was fixed in DOFs 1-3 along the circumference (every node on the circumference). The under-relaxation factor was so small because I thought it might give the codes a soft start. I remember trying 0.1 in a very early attempt, but it failed, so I have kept using 0.001 since then.

I will try a denser grid for the membrane.

When I used subcycling, preCICE always gave a warning saying that subcycling was not stable, or something to that effect. And indeed, I found that the FSI failed much earlier than without subcycling.

You could also try to run the structure simulation by itself with a fixed load of comparable magnitude, and ensure that the structure by itself is working fine. For example, what type of finite elements are you using for your CalculiX model? In our tutorials, we typically use the C3D8I elements, which we found to work much better than the basic C3D8 we were previously using in the perpendicular flap tutorial.

I looked at that tutorial and copied lots of ideas from it. I am using the shell element S4R because the membrane is very thin. If we switch to solid elements, many more elements have to be used. But it could be worth a try if nothing else works.

Hi @yangyangnj :waving_hand:

Your results suggest something is unstable.
Could you please post your iterations and your convergence files?

I could also imagine that the mesh deformation in OpenFOAM could give issues. Try inspecting how the cells look a few time steps before crashing. Is anything getting too extended?

precice-Fluid-Solver-iterations.log (1.5 MB)

precice-Solid-Solver-watchpoint-Membrane-Center.log (8.4 MB)

precice-Solid-Solver-iterations.log (2.7 MB)

Here are the convergence histories. Thanks.

Quasi-Newton convergence looks good at the end of your current solution, though the coupling strength appears to be high (which could also be an indicator that something else is wrong).
I would try simplifying the problem and looking at individual aspects, e.g. a unidirectional coupling or a solid-only setup, as @Makis suggested above.

While debugging, I came to realize that my results are not reproducible, even though I kept using the same set of parameters. Then I set time-windows-reused to 0 and it suddenly worked. So how does preCICE deal with history data? Say one analysis failed and stopped. When I start another analysis, is the “bad” information still being used by the code? I am using sockets to transmit information, so I would imagine that the memory is released after a job fails. Right? What is the best practice for a clean restart? In commercial software, if you kill one job, the next job you launch is not influenced by the previous one.

A related question, since dynamic meshing is used in OpenFOAM: if one job fails dramatically and the mesh has been moved, will this deformed mesh be stored and reused by subsequent jobs? Should we re-read the mesh every time a job fails? Sorry for these beginner questions. Thanks.