[Bug] Socket communication directory path

Hi,
I was having issues with the socket-communication of preCICE.
Somehow, after more than 150 iterations, the Python interface of preCICE gets stuck at the advance command without any error message (in debug mode). It seems there is an issue with locating the socket file. I'm launching the two coupled solvers from different directories (although this is not recommended here). When I change the location of the socket file (one directory layer above both solvers), the coupling works smoothly.
Maybe it would be possible to change the exchange-directory entry into two paths, one relative to each solver (e.g. exchange-directory-solver1 and exchange-directory-solver2).
Another possibility would be to throw an error and kill the communication.
In my opinion, advising the user on how to structure the case files is not an option, since different solvers have different case structures.

Works up to a certain point and then crashes:

<m2n:sockets port="0" exchange-directory="." from="{string}" network="lo" to="{string}" enforce-gather-scatter="0" use-two-level-initialization="0"/>

Works perfectly:
<m2n:sockets port="0" exchange-directory=".." from="{string}" network="lo" to="{string}" enforce-gather-scatter="0" use-two-level-initialization="0"/>

Do you by any chance have a uni-directional coupling? The “after >150 iterations” reminds me of system-level issues where the buffer gets full.

We also start the solvers in our tutorials from different directories now and set the exchange directory to "..". The "start from the same directory" advice is a recommendation we used to give, as it made a few aspects easier (e.g. no need to specify the exchange-directory).

I think we need to look a bit deeper into this. If you identify that this is indeed a bug, then please move it to a GitHub issue. But maybe there is something else. Can you reproduce it consistently?

I think you should be able to use any exchange directory, as long as all solvers can reach it. I have used exchange-directory with absolute paths before, for example exchange-directory="/tmp/precice" or exchange-directory="/home/ajaust/simulations/precice-exchange". I think this should still be a valid option.
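For illustration, a tag along the following lines should then work (just a sketch; the participant names and the absolute path are placeholders, and all solvers need read and write access to that directory):

<m2n:sockets from="Fluid" to="Solid" exchange-directory="/tmp/precice" />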

After trying some other configurations, I'm not sure whether it is path-related.
How could I check whether a buffer is full?

Please see my precice-config.xml below:

<?xml version="1.0" encoding="UTF-8" ?>
<precice-configuration>
  <log>
    <sink
      filter="%Severity% > debug and %Rank% = 0"
      format="---[precice] %ColorizedSeverity% %Message%"
      enabled="true" />
  </log>

  <solver-interface dimensions="2">
    <data:vector name="Force_Data" />
    <data:vector name="Displacement_Data" />

    <mesh name="Fluid-Mesh">
      <use-data name="Force_Data" />
      <use-data name="Displacement_Data" />
    </mesh>
    
    <mesh name="Solid-Mesh">
      <use-data name="Displacement_Data" />
      <use-data name="Force_Data" />
    </mesh>

    <participant name="Fluid">
      <use-mesh name="Fluid-Mesh" provide="yes" />
      <use-mesh name="Solid-Mesh" from="Solid" />
      <write-data name="Force_Data" mesh="Fluid-Mesh" />
      <read-data name="Displacement_Data" mesh="Fluid-Mesh" />
      <!--<mapping:rbf-thin-plate-splines direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative" z-dead="true"/>
      <mapping:rbf-thin-plate-splines direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent" z-dead="true"/>-->
      <mapping:rbf-compact-polynomial-c0 support-radius="5" direction="write" from="Fluid-Mesh" to="Solid-Mesh" constraint="conservative"/>
      <mapping:rbf-compact-polynomial-c0 support-radius="5" direction="read" from="Solid-Mesh" to="Fluid-Mesh" constraint="consistent"/>
    </participant>

    <participant name="Solid">
      <use-mesh name="Solid-Mesh" provide="yes" />
      <write-data name="Displacement_Data" mesh="Solid-Mesh" />
      <read-data name="Force_Data" mesh="Solid-Mesh" />
    </participant>

    <m2n:sockets from="Fluid" to="Solid" exchange-directory=".." enforce-gather-scatter="1"/>

    <coupling-scheme:parallel-explicit>
      <time-window-size value="0.0001" />
      <max-time value="10" />
      <participants first="Fluid" second="Solid" />
      <exchange data="Force_Data" mesh="Solid-Mesh" from="Fluid" to="Solid" />
      <exchange data="Displacement_Data" mesh="Solid-Mesh" from="Solid" to="Fluid" />
    </coupling-scheme:parallel-explicit>
  </solver-interface>
</precice-configuration>

I can't see how such a behavior could be related to the path. If you have already run 150 iterations, the communication is already established.
Also, you use a bi-directional coupling, so the buffer should not be a problem either.

Could you please provide more information? Log outputs of both solvers, iteration and convergence files, and information on what kind of system you run on (a cluster?).

Any particular reason why you switched on gather-scatter communication? This option is only intended for debugging purposes.
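For a production run, the plain sockets tag without that flag should be enough, something like this (a sketch based on your config above; the omitted options fall back to their defaults):

<m2n:sockets from="Fluid" to="Solid" exchange-directory=".." />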

My best guess is that you are running into a numerical crash, but that there is no meaningful error message. You say one solver is stuck in advance; what does the other solver report?

I found out that my code was stuck in another socket communication in OpenFOAM. Sorry for bothering you.

Indeed, I had just copy-pasted the gather-scatter option. Thank you for pointing this out; removing it fixed my issues running preCICE in parallel :slight_smile:

What other socket communication do you mean? This sounds a bit unusual. Are you running some special solver?

I'm using an additional self-made socket communication between my Python interface and OpenFOAM, since preCICE only supports displacement motions, but not solidBodyMotions. I was having issues running my socket communication in parallel mode, but I have now solved them by creating a master-slave hierarchy between the processors (only the master handles the socket communication).

