How mesh coupling works for parallel participants

Hi, I’m developing a FEniCSx adapter that was discussed here and is currently under review in the pull request mentioned in that thread. The version in the pull request gives an error when running in parallel, which I’m trying to fix. I have managed to eliminate the error, but now the read and write data are always empty when running in parallel. I’m using the perpendicular-flap tutorial to test it.

This is the log of the solid participant; as you can see, the read and write data max/min values are 0.

Rank 0, Block size 2 Num local dofs 1204
Rank 1, Block size 2 Num local dofs 1206
---[precice]  This is preCICE version 3.1.1
---[precice]  Revision info: no-info [git failed to run]
---[precice]  Build type: Debug
---[precice]  Configuring preCICE with configuration "/home/infralab/Desktop/Firvida/openfoam/precice-fenicsx-adapter/tutorial/perpendicular-flap/solid-fenicsx/../precice-config.xml"
---[precice]  I am participant "Solid"
---[precice]  Connecting Primary rank to 1 Secondary ranks
---[precice]  Setting up primary communication to coupling partner/s
---[precice]  Primary ranks are connected
---[precice]  Setting up preliminary secondary communication to coupling partner/s
---[precice]  Prepare partition for mesh Solid-Mesh
---[precice]  Gather mesh Solid-Mesh
---[precice]  Send global mesh Solid-Mesh
---[precice]  Setting up secondary communication to coupling partner/s
---[precice]  Secondary ranks are connected
---[precice]  iteration: 1 of 50 (min 1), time-window: 1, time: 0 of 5, time-window-size: 0.01, max-time-step-size: 0.01, ongoing: yes, time-window-complete: no, write-iteration-checkpoint 
===============================================
Rank 1, Min Read Data 0.0
Rank 1, Max Read Data 0.0
===============================================
Rank 0, Min Read Data 0.0
Rank 0, Max Read Data 0.0
----------------------------------------------
Rank 1, Min Write Data 0.0
Rank 1, Max Write Data 0.0
===============================================
----------------------------------------------
Rank 0, Min Write Data 0.0
Rank 0, Max Write Data 0.0
===============================================
---[precice]  relative convergence measure: relative two-norm diff of data "Displacement" = inf, limit = 5.00e-03, normalization = 0.00e+00, conv = true
---[precice]  relative convergence measure: relative two-norm diff of data "Force" = inf, limit = 5.00e-03, normalization = 0.00e+00, conv = true
---[precice]  All converged
---[precice] WARNING:  The coupling residual equals almost zero. There is maybe something wrong in your adapter. Maybe you always write the same data or you call advance without providing new data first or you do not use available read data. Or you just converge much further than actually necessary.
---[precice] WARNING:  The IQN matrix has no columns.
---[precice]  Time window completed
---[precice]  iteration: 1 of 50 (min 1), time-window: 2, time: 0.01 of 5, time-window-size: 0.01, max-time-step-size: 0.01, ongoing: yes, time-window-complete: yes, write-iteration-checkpoint 
===============================================
Rank 1, Min Read Data 0.0
Rank 1, Max Read Data 0.0
===============================================
Rank 0, Min Read Data 0.0
Rank 0, Max Read Data 0.0
----------------------------------------------
Rank 1, Min Write Data 0.0
Rank 1, Max Write Data 0.0
===============================================
----------------------------------------------
Rank 0, Min Write Data 0.0
Rank 0, Max Write Data 0.0
===============================================
---[precice]  relative convergence measure: relative two-norm diff of data "Displacement" = inf, limit = 5.00e-03, normalization = 0.00e+00, conv = true
---[precice]  relative convergence measure: relative two-norm diff of data "Force" = inf, limit = 5.00e-03, normalization = 0.00e+00, conv = true
---[precice]  All converged
---[precice] WARNING:  The coupling residual equals almost zero. There is maybe something wrong in your adapter. Maybe you always write the same data or you call advance without providing new data first or you do not use available read data. Or you just converge much further than actually necessary.
---[precice] WARNING:  The IQN matrix has no columns.
---[precice] ERROR:  Sending data to another participant (using sockets) failed with a system error: write: Broken pipe [system:32]. This often means that the other participant exited with an error (look there).

Now, looking into the fluid participant (OpenFOAM) log, I see that there is a problem with the data interpolation:

---[preciceAdapter] Loaded the OpenFOAM-preCICE adapter - v1.3.0.
---[preciceAdapter] Reading preciceDict...
---[precice]  This is preCICE version 3.1.1
---[precice]  Revision info: no-info [git failed to run]
---[precice]  Build type: Release (without debug log)
---[precice]  Configuring preCICE with configuration "../precice-config.xml"
---[precice]  I am participant "Fluid"
---[precice]  Connecting Primary rank to 3 Secondary ranks
---[precice]  Setting up primary communication to coupling partner/s
---[precice]  Primary ranks are connected
---[precice]  Setting up preliminary secondary communication to coupling partner/s
---[precice]  Prepare partition for mesh Fluid-Mesh
---[precice]  Receive global mesh Solid-Mesh
---[precice]  Broadcast mesh Solid-Mesh
---[precice]  Filter mesh Solid-Mesh by mappings
---[precice]  Feedback distribution for mesh Solid-Mesh
---[precice]  Setting up secondary communication to coupling partner/s
---[precice]  Secondary ranks are connected
---[precice]  Automatic RBF mapping alias from mesh "Fluid-Mesh" to mesh "Solid-Mesh" in "write" direction resolves to "partition-of-unity RBF" .
---[precice]  Computing "partition-of-unity RBF" mapping from mesh "Fluid-Mesh" to mesh "Solid-Mesh" in "write" direction.
---[precice] ERROR:  The interpolation matrix of the RBF mapping from mesh "Solid-Mesh" to mesh "Fluid-Mesh" is not invertable. This means that the mapping problem is not well-posed. Please check if your coupling meshes are correct (e.g. no vertices are duplicated) or reconfigure your basis-function (e.g. reduce the support-radius).

And now my questions are:

  • How does the interpolation between participants happen in a parallel run?
  • Do I have to combine the data of all MPI ranks before sending it, i.e., send the data from a single rank?

This is the precice-config.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<precice-configuration>
  <log>
    <sink
      filter="%Severity% > debug and %Rank% = 0"
      format="---[precice] %ColorizedSeverity% %Message%"
      enabled="true" />
  </log>

  <data:vector name="Force" />
  <data:vector name="Displacement" />

  <mesh name="Fluid-Mesh" dimensions="2">
    <use-data name="Force" />
    <use-data name="Displacement" />
  </mesh>

  <mesh name="Solid-Mesh" dimensions="2">
    <use-data name="Displacement" />
    <use-data name="Force" />
  </mesh>

  <participant name="Fluid">
    <provide-mesh name="Fluid-Mesh" />
    <receive-mesh name="Solid-Mesh" from="Solid" />
    <write-data name="Force" mesh="Fluid-Mesh" />
    <read-data name="Displacement" mesh="Fluid-Mesh" />
    <mapping:rbf direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative">
      <basis-function:compact-polynomial-c6 support-radius="0.05" />
    </mapping:rbf>
    <mapping:rbf direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent">
      <basis-function:compact-polynomial-c6 support-radius="0.05" />
    </mapping:rbf>
  </participant>

  <participant name="Solid">
    <provide-mesh name="Solid-Mesh" />
    <write-data name="Displacement" mesh="Solid-Mesh" />
    <read-data name="Force" mesh="Solid-Mesh" />
    <watch-point mesh="Solid-Mesh" name="Flap-Tip" coordinate="0.0;1" />
  </participant>

  <m2n:sockets acceptor="Fluid" connector="Solid" exchange-directory=".." />

  <coupling-scheme:parallel-implicit>
    <time-window-size value="0.01" />
    <max-time value="5" />
    <participants first="Fluid" second="Solid" />
    <exchange data="Force" mesh="Solid-Mesh" from="Fluid" to="Solid" />
    <exchange data="Displacement" mesh="Solid-Mesh" from="Solid" to="Fluid" />
    <max-iterations value="50" />
    <relative-convergence-measure limit="5e-3" data="Displacement" mesh="Solid-Mesh" />
    <relative-convergence-measure limit="5e-3" data="Force" mesh="Solid-Mesh" />
    <acceleration:IQN-ILS>
      <data name="Displacement" mesh="Solid-Mesh" />
      <data name="Force" mesh="Solid-Mesh" />
      <preconditioner type="residual-sum" />
      <filter type="QR2" limit="1e-2" />
      <initial-relaxation value="0.5" />
      <max-used-iterations value="100" />
      <time-windows-reused value="15" />
    </acceleration:IQN-ILS>
  </coupling-scheme:parallel-implicit>
</precice-configuration>

Regarding the interpolation (mapping), have you tried using the default RBF mapping? Many aspects of it have improved in v3.

How the interpolation occurs in parallel is a rather broad question, but in short: it happens inside preCICE. And no, you don’t need to combine the data onto a single MPI rank before passing it to preCICE.
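For illustration only, here is a minimal per-rank sketch with the preCICE v3 Python bindings (the dummy interface coordinates and the zero “solver result” are placeholders, not your adapter’s code): every rank registers just its own vertices and reads/writes just its own values, and preCICE does the gathering and mapping internally.

    # Minimal per-rank sketch with the preCICE v3 Python bindings; the dummy
    # interface coordinates and the zero "solver result" are placeholders.
    import numpy as np
    import precice
    from mpi4py import MPI

    comm = MPI.COMM_WORLD

    # Each rank registers only the vertices it owns; preCICE assembles the global mesh.
    n_local = 4
    local_coords = np.column_stack((
        np.zeros(n_local),
        np.linspace(comm.rank, comm.rank + 1, n_local, endpoint=False),
    ))

    participant = precice.Participant("Solid", "../precice-config.xml", comm.rank, comm.size)
    vertex_ids = participant.set_mesh_vertices("Solid-Mesh", local_coords)
    participant.initialize()

    while participant.is_coupling_ongoing():
        if participant.requires_writing_checkpoint():
            pass  # save the solver state (needed for implicit coupling)
        dt = participant.get_max_time_step_size()
        forces = participant.read_data("Solid-Mesh", "Force", vertex_ids, dt)  # shape (n_local, 2)
        displacements = np.zeros_like(forces)  # placeholder for the actual solver step
        participant.write_data("Solid-Mesh", "Displacement", vertex_ids, displacements)
        participant.advance(dt)
        if participant.requires_reading_checkpoint():
            pass  # restore the solver state (the iteration repeats)

    participant.finalize()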

Specifically regarding the OpenFOAM adapter, this topic might end up being related: OpenFOAM interface patch in parallel computation - #3 by Makis

I admit that a long time has passed since you asked the question. Do you maybe have any updated information here?

Did you try to follow the suggestions given by preCICE?

Please check if your coupling meshes are correct (e.g. no vertices are duplicated) or reconfigure your basis-function (e.g. reduce the support-radius)

i.e., maybe something like:

    <mapping:rbf direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative">
      <basis-function:compact-polynomial-c6 support-radius="0.01" />
    </mapping:rbf>

As pointed out by @Makis, you don’t have to take any additional actions to run your simulation in parallel. If you face this error, I would assume that either your simulation also crashes in serial, or your partitioning is not clean, i.e., maybe you duplicate vertices on the solid mesh across processor boundaries when running in parallel, which will lead to singular matrices in the RBF mapping. To validate this, you could, e.g., check your preCICE exports or log the number of interface vertices on all ranks and check that they actually match the serial run.
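For example, a quick sanity check along these lines could look like the following sketch, assuming your adapter keeps the rank-local coupling coordinates in a NumPy array (local_coords is just a placeholder name):

    # Count interface vertices per rank and compare the total with the serial run;
    # optionally look for coordinates duplicated across ranks.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    local_coords = np.empty((0, 2))  # placeholder: this rank's interface coordinates

    n_local = local_coords.shape[0]
    counts = comm.allgather(n_local)
    if comm.rank == 0:
        print(f"Interface vertices per rank: {counts}, total: {sum(counts)}")

    gathered = comm.gather(local_coords, root=0)
    if comm.rank == 0:
        stacked = np.vstack(gathered)
        n_unique = len(np.unique(np.round(stacked, 10), axis=0))
        if n_unique != len(stacked):
            print(f"{len(stacked) - n_unique} duplicated interface vertices across ranks")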


@Makis:

I admit that a long time has passed since you asked the question. Do you maybe have any updated information here?

Yes, I found that the problem is on the DOLFINx side, but I don’t have a solution yet. My version of the DOLFINx adapter uses the mesh DOFs at the interface patch as coupling nodes, which makes it easy to read and write DOLFINx Function data directly at these DOFs and simplifies the adapter. Under the hood, these Functions are PETSc vectors, so having the indices of the DOFs makes this quite simple.
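Roughly, the idea looks like this (a simplified sketch rather than the actual adapter code; the boundary marker is a placeholder and the exact dolfinx calls may differ between versions):

    # Simplified sketch: locate the DOFs of a 2D vector function space on the
    # coupling patch and use them to read/write Function data directly.
    import numpy as np
    from mpi4py import MPI
    from dolfinx import fem, mesh

    domain = mesh.create_unit_square(MPI.COMM_WORLD, 16, 16)
    V = fem.functionspace(domain, ("Lagrange", 1, (2,)))

    fdim = domain.topology.dim - 1
    facets = mesh.locate_entities_boundary(domain, fdim, lambda x: np.isclose(x[1], 1.0))
    dofs = fem.locate_dofs_topological(V, fdim, facets)   # block indices of interface DOFs

    coords = V.tabulate_dof_coordinates()[dofs, :2]       # coupling vertex coordinates
    u = fem.Function(V)
    values = u.x.array.reshape(-1, 2)[dofs]               # read values at the interface DOFs
    u.x.array.reshape(-1, 2)[dofs] = values               # ... or write values back to them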

@DavidSCN:

maybe you duplicate vertices on the solid mesh across processor boundaries when running in parallel, which will lead to singular matrices in the RBF mapping. To validate this, you could, e.g., check your preCICE exports or log the number of interface vertices on all ranks and check that they actually match the serial run.

Yes, I think that’s the problem, but I don’t know how to handle the vectors in this situation: when the problem is parallelized, the ghost nodes are duplicated on different partitions to allow communication between them, and I’m unsure how to manage this. According to this topic in the official documentation, I have three choices, but I don’t know how to implement those solutions yet, so I’m stuck here. It would be very helpful if anyone had some code examples for this.

Ghost nodes are read-only data locations. In other codes (at least the ones I was involved in), we forwarded only the locally owned DOFs to the interface (which are unique). Ghost value updates should then complete the coupling data on your local rank. However, I have no idea how to achieve this with FEniCSx.
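In DOLFINx terms, the suggestion above might look roughly like the following untested sketch (same placeholder interface marker as in the earlier snippet; the commented-out preCICE call is only indicative):

    # Untested sketch: register only the locally owned interface DOFs with preCICE
    # (they are globally unique across ranks) and complete the ghost entries with
    # a forward scatter after reading the coupling data.
    import numpy as np
    from mpi4py import MPI
    from dolfinx import fem, mesh

    domain = mesh.create_unit_square(MPI.COMM_WORLD, 16, 16)
    V = fem.functionspace(domain, ("Lagrange", 1, (2,)))
    u = fem.Function(V)

    fdim = domain.topology.dim - 1
    facets = mesh.locate_entities_boundary(domain, fdim, lambda x: np.isclose(x[1], 1.0))
    dofs = fem.locate_dofs_topological(V, fdim, facets)

    num_owned = V.dofmap.index_map.size_local    # owned DOF blocks come before ghosts
    owned_dofs = dofs[dofs < num_owned]          # keep only DOFs owned by this rank

    coupling_coords = V.tabulate_dof_coordinates()[owned_dofs, :2]
    # vertex_ids = participant.set_mesh_vertices("Solid-Mesh", coupling_coords)

    # After reading coupling data from preCICE (one 2D vector per owned coupling DOF):
    read_data = np.zeros((len(owned_dofs), 2))   # placeholder for the actual read
    bs = V.dofmap.index_map_bs
    u.x.array.reshape(-1, bs)[owned_dofs] = read_data   # fill owned entries only
    u.x.scatter_forward()                               # update ghost copies on other ranks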