[Bug] Socket communication directory path

Hi,
I was having issues with the socket-communication of preCICE.
Somehow, after more than 150 iterations, the Python interface of preCICE gets stuck at the advance command without any error message (in debug mode). It seems there is an issue with locating the socket file. I'm launching the two coupled solvers from different directories (although this is not recommended here). When I change the location of the socket file (one directory layer above both solvers), the coupling works smoothly.
Maybe it would be possible to change the exchange-directory entry into two paths, one relative to each solver (e.g. exchange-directory-solver1 and exchange-directory-solver2).
Another possibility would be to throw an error and kill the communication.
In my opinion, advising the user on how to structure the case files is not an option, since different solvers have different case structures.

Works up to a certain point and then crashes:

<m2n:sockets port="0" exchange-directory="." from="{string}" network="lo" to="{string}" enforce-gather-scatter="0" use-two-level-initialization="0"/>

Works perfectly:
<m2n:sockets port="0" exchange-directory=".." from="{string}" network="lo" to="{string}" enforce-gather-scatter="0" use-two-level-initialization="0"/>

Do you by any chance have a uni-directional coupling? The “after >150 iterations” reminds me of system-level issues where the buffer gets full.

We also start the solvers in our tutorials from different directories now and set the exchange directory to "..". The "start from the same directory" advice is a recommendation we used to give, as it made a few aspects easier (e.g. no need to specify the exchange-directory).

I think we need to look a bit deeper into this. If you identify that this is indeed a bug, then please move it to a GitHub issue. But maybe there is something else. Can you reproduce it consistently?

I think you should be able to use any exchange directory, as long as all solvers can reach it. I have used exchange-directory with absolute paths before, for example exchange-directory="/tmp/precice" or exchange-directory="/home/ajaust/simulations/precice-exchange". I think this should still be a valid option.
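For illustration, a tag along the following lines should then work (just a sketch; the participant names and the absolute path are placeholders, and all solvers need read and write access to that directory):

<m2n:sockets from="Fluid" to="Solid" exchange-directory="/tmp/precice" />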

After trying some other configurations, I'm not sure whether it is path-related.
How could I check whether a buffer is full?

Please see my precice-config.xml below:

<?xml version="1.0" encoding="UTF-8" ?>
<precice-configuration>
  <log>
    <sink
      filter="%Severity% > debug and %Rank% = 0"
      format="---[precice] %ColorizedSeverity% %Message%"
      enabled="true" />
  </log>

  <solver-interface dimensions="2">
    <data:vector name="Force_Data" />
    <data:vector name="Displacement_Data" />

    <mesh name="Fluid-Mesh">
      <use-data name="Force_Data" />
      <use-data name="Displacement_Data" />
    </mesh>
    
    <mesh name="Solid-Mesh">
      <use-data name="Displacement_Data" />
      <use-data name="Force_Data" />
    </mesh>

    <participant name="Fluid">
      <use-mesh name="Fluid-Mesh" provide="yes" />
      <use-mesh name="Solid-Mesh" from="Solid" />
      <write-data name="Force_Data" mesh="Fluid-Mesh" />
      <read-data name="Displacement_Data" mesh="Fluid-Mesh" />
      <!--<mapping:rbf-thin-plate-splines direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative" z-dead="true"/>
      <mapping:rbf-thin-plate-splines direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent" z-dead="true"/>-->
      <mapping:rbf-compact-polynomial-c0 support-radius="5" direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative"/>
      <mapping:rbf-compact-polynomial-c0 support-radius="5" direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent"/>
    </participant>

    <participant name="Solid">
      <use-mesh name="Solid-Mesh" provide="yes" />
      <write-data name="Displacement_Data" mesh="Solid-Mesh" />
      <read-data name="Force_Data" mesh="Solid-Mesh" />
    </participant>

    <m2n:sockets from="Fluid" to="Solid" exchange-directory=".." enforce-gather-scatter="1"/>

    <coupling-scheme:parallel-explicit>
      <time-window-size value="0.0001" />
      <max-time value="10" />
      <participants first="Fluid" second="Solid" />
      <exchange data="Force_Data" mesh="Solid-Mesh" from="Fluid" to="Solid" />
      <exchange data="Displacement_Data" mesh="Solid-Mesh" from="Solid" to="Fluid" />
    </coupling-scheme:parallel-explicit>
  </solver-interface>
</precice-configuration>

I can't see how such a behavior could be related to the path. If you have already run 150 iterations, the communication is already established.
Also, you use a bi-directional coupling, so the buffer should not be a problem either.

Could you please provide more information? Log outputs of both solvers, iteration and convergence files, and information on what kind of system you run on (a cluster?).

Any particular reason why you switched on gather-scatter communication? This option is only intended for debugging purposes.
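For a production run, the plain sockets tag without that flag should be enough, something like this (a sketch based on your config above; the omitted options fall back to their defaults):

<m2n:sockets from="Fluid" to="Solid" exchange-directory=".." />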

My best guess is that you are running into a numerical crash, but that there is no meaningful error message. You say one solver is stuck in advance; what does the other solver report?

I found out that my code was stuck in another socket communication in OpenFOAM. Sorry for bothering you.

Indeed, I had just copy-pasted the gather-scatter option. Thank you for pointing this out; removing it fixed my issues running preCICE in parallel :slight_smile:

What other socket communication do you mean? This sounds a bit unusual. Are you running some special solver?

I'm using an additional self-made socket communication between my Python interface and OpenFOAM, since preCICE only supports displacement motions, but not solidBodyMotions. I was having issues running my socket communication in parallel mode, but I have now solved them by creating a master-slave hierarchy between the processors (only the master handles the socket communication).

