The simulation crashes at the same time window every time, which is getting strange. So far I have tried the serial-explicit, parallel-explicit, and parallel-implicit coupling schemes, with both nearest-neighbor and RBF mapping, and with the OpenFOAM participant running either in serial or in parallel with MPI. All runs crashed at the same time window.
The meshes causing trouble are 1D line segments (HOS-Coupling-Mesh and OpenFOAM-Coupling-Mesh). At the time of the crash, the coupling data (etaVOF) is almost zero everywhere on the line mesh, and the error message reads: “etaVOF on mesh OpenFOAM-Coupling-Mesh didn’t contain any data samples while attempting to map to mesh HOS-Coupling-Mesh”.
Update: I managed to complete one run this morning. In this configuration, the output interval dt_out is not a multiple of the preCICE time window size dt. In all other failed runs, dt_out has always been a multiple of dt.
When dt_out is not an integer multiple of dt, the time step size dt1 passed to precice.advance(dt1) is not always equal to the predefined time window size: the solver makes small step-size adjustments near the output timestamps. I am unsure whether this has contributed to the crashes.
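To illustrate what I mean, here is a minimal sketch of the mechanism, written with the preCICE v3 C++ API rather than the Fortran binding and OpenFOAM adapter we actually use; the participant name and the values dt = 0.01 and dt_out = 0.025 are only illustrative assumptions:

```cpp
#include <algorithm>
#include <precice/precice.hpp>

int main() {
  precice::Participant participant("Fluid", "precice-config.xml", 0, 1);
  // ... set up the coupling mesh and vertex IDs here ...
  participant.initialize();

  const double solverDt = 0.01;  // nominal solver step = coupling window size (assumption)
  const double dtOut    = 0.025; // output interval, NOT a multiple of 0.01 (assumption)
  double t = 0.0, nextOut = dtOut;

  while (participant.isCouplingOngoing()) {
    double dt = std::min(solverDt, participant.getMaxTimeStepSize());
    // Many solvers shrink the step so that an output time is hit exactly:
    if (t + dt > nextOut) {
      dt = nextOut - t; // e.g. 0.005 instead of 0.01 -> preCICE sees a subcycle
    }
    // ... read coupling data, solve one step of size dt, write coupling data ...
    participant.advance(dt); // dt is no longer always equal to the window size
    t += dt;
    if (t >= nextOut) {
      // ... write solver output ...
      nextOut += dtOut;
    }
  }
  participant.finalize();
}
```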
At the end of this morning's successful run, the potential-flow code participant (using the Fortran bindings) reported an error stating: “The preCICE library has not been properly initialized. precicef_initialize() must be called before any other preCICE calls.” The process then hung without a proper exit.
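For reference, this is the call order my participant follows. I sketch it here in C++ terms, since the Fortran binding mirrors the native API one-to-one; the participant and mesh names are only illustrative:

```cpp
#include <precice/precice.hpp>

int main() {
  // Constructing the Participant only reads the configuration.
  precice::Participant participant("HOS", "precice-config.xml", 0, 1);

  // Mesh and data setup go here, before initialize():
  // participant.setMeshVertices("HOS-Coupling-Mesh", coords, vertexIDs);

  participant.initialize(); // must precede the coupling loop below

  while (participant.isCouplingOngoing()) {
    double dt = participant.getMaxTimeStepSize();
    // ... readData(), solve one step of size dt, writeData() ...
    participant.advance(dt);
  }

  participant.finalize(); // called exactly once, after the loop
}
```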
I have double-checked my code and found no violations of this initialization rule. Could you please advise on the possible cause? Many thanks!
Update: by setting adjustTimeStep=true to allow variable step sizes for both participants, I can avoid the crash. In contrast, when both solvers use the same fixed step size and the time window size is a multiple of that step size, the coupling crashes at a certain time window.
Final update: after allowing the participants to adjust their own step sizes (adjustTimeStep=true), I have tested the following configurations, and they all completed successfully without errors.
Number of participants: 2, 3, 4
Coupling schemes: serial-explicit, parallel-explicit, parallel-implicit
Mapping methods: RBF, nearest-neighbor
Number of CPUs per participant: 1-48 (parallel using OpenMPI)
This is interesting, and might still be a bug in preCICE itself. Just to be sure, could you please post the version information from the logs of both solvers? In your first logs, it looked like this:
# log.HOS
---[precice] This is preCICE version 3.1.2
---[precice] Revision info: no-info [git failed to run]
# log.hosCoupleFoam
---[precice] This is preCICE version 3.1.2
---[precice] Revision info: no-info [git failed to run]
For the convenience of other readers, I attach the tail of your logs:
# log.HOS
---[precice] Time window completed
---[precice] Mapping "etaVOF" for t=68.34 from "OpenFOAM-Coupling-Mesh" to "HOS-Coupling-Mesh"
---[precice] Mapping "etaVOFShifted" for t=68.34 from "OpenFOAM-Coupling-Mesh" to "HOS-Coupling-Mesh"
---[precice] Mapping "wVOF" for t=68.34 from "OpenFOAM-Coupling-Mesh" to "HOS-Coupling-Mesh"
---[precice] Mapping "weights" for t=68.34 from "OpenFOAM-Coupling-Mesh" to "HOS-Coupling-Mesh"
---[precice] iteration: 1, time-window: 6835, time: 68.34 of 100, time-window-size: 0.01, max-time-step-size: 0.01, ongoing: yes, time-window-complete: yes
HOS-Coupling: Time advanced.
... <solver logs> ...
---[precice] Time window completed
---[precice] ERROR: Data etaVOF on mesh OpenFOAM-Coupling-Mesh didn't contain any data samples while attempting to map to mesh HOS-Coupling-Mesh. Check your exchange tags to ensure your coupling scheme exchanges the data or the pariticipant produces it using an action. The expected exchange tag should look like this: <exchange data="etaVOF" mesh="OpenFOAM-Coupling-Mesh" from=... to=... />.
# log.hosCoupleFoam
---[preciceAdapter] [DEBUG] Advancing preCICE...
---[precice] Time window completed
---[precice] Mapping "UHOS" for t=68.35000000000001 from "HOS-Relaxation-Mesh" to "OpenFOAM-Relaxation-Mesh"
---[precice] Mapping "etaHOS" for t=68.35000000000001 from "HOS-Relaxation-Mesh" to "OpenFOAM-Relaxation-Mesh"
---[precice] iteration: 1, time-window: 6836, time: 68.35 of 100, time-window-size: 0.01, max-time-step-size: 0.01, ongoing: yes, time-window-complete: yes
---[preciceAdapter] [DEBUG] Adjusting the solver's timestep...
---[preciceAdapter] [DEBUG] The solver's timestep is the same as the coupling timestep.
---[preciceAdapter] [DEBUG] Reading coupling data associated to the calculated time-step size...
---[preciceAdapter] [DEBUG] Reading coupling data...
---[preciceAdapter] [DEBUG] precice_.readData() done for: etaHOS on OpenFOAM-Relaxation-Mesh
---[preciceAdapter] [DEBUG] couplingDataReader->read() done for: etaHOS on OpenFOAM-Relaxation-Mesh
---[preciceAdapter] [DEBUG] precice_.readData() done for: UHOS on OpenFOAM-Relaxation-Mesh
---[preciceAdapter] [DEBUG] couplingDataReader->read() done for: UHOS on OpenFOAM-Relaxation-Mesh
Courant Number mean: 0.00310636354339 max: 0.0873743703009
Interface Courant Number mean: 8.33957199729e-05 max: 0.0621531512312
Time = 68.36
... <solver logs> ...
ExecutionTime = 7979.87 s ClockTime = 7976 s
---[preciceAdapter] [DEBUG] Writing coupling data...
... <solver logs> ...
---[preciceAdapter] [DEBUG] couplingDataWriter->write() done for: etaVOF on OpenFOAM-Coupling-Mesh
---[preciceAdapter] [DEBUG] precice_.writeData() done for: etaVOF on OpenFOAM-Coupling-Mesh
<end of logs>
This is expected when compiling preCICE from an archive which doesn’t contain any git-related information.
My best guess is that the Spack environment isn't loaded correctly and the adapter picks up the wrong preCICE library for some reason.
The following tool is invaluable for debugging this kind of problem, as it tells you where dynamic libraries are loaded from and why: