Cannot run OpenFOAM in parallel when using RBF Mapping

Hey everyone,

I’m having trouble running OpenFOAM in parallel. Specifically, I cannot run, in parallel, any case that uses the RBF mapping option: I get a segmentation violation before the solver even communicates with the partner participant (e.g. CalculiX) and starts the calculation. The log below is from the perpendicular flap tutorial; the same happens with the Turek-Hron FSI3 tutorial and any case I set up that involves RBF mapping. Serial runs work fine, and parallel runs also work fine when other mappings are used (nearest-projection, for example).
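For context, by “RBF mapping option” I mean mapping tags along these lines in precice-config.xml (the basis function, mesh names, and constraints here are illustrative, not necessarily my exact settings). As far as I understand, these RBF mappings are the part of preCICE that uses PETSc when running in parallel, which would match the PETSc errors in the log below:

    <!-- illustrative RBF mapping pair; mesh names follow the tutorial convention -->
    <mapping:rbf-thin-plate-splines
        direction="write" from="Fluid-Mesh" to="Solid-Mesh"
        constraint="conservative" />
    <mapping:rbf-thin-plate-splines
        direction="read" from="Solid-Mesh" to="Fluid-Mesh"
        constraint="consistent" />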

*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  7
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
Build  : 7-3bcbaf946ae9
Exec   : pimpleFoam -parallel
Date   : Nov 03 2021
Time   : 11:13:22
Host   : "andres-linux"
PID    : 24224
I/O    : uncollated
Case   : /home/andres/OpenFOAM/andres-7/preCICE-adapter/tutorials/perpendicular-flap/fluid-openfoam
nProcs : 4
Slaves : 
3
(
"andres-linux.24225"
"andres-linux.24226"
"andres-linux.24227"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

Selecting dynamicFvMesh dynamicMotionSolverFvMesh
Selecting motion solver: displacementLaplacian
Selecting motion diffusion: quadratic
Selecting motion diffusion: inverseDistance

PIMPLE: No convergence criteria found


PIMPLE: Operating solver in transient mode with 1 outer corrector
PIMPLE: Operating solver in PISO mode


Reading field p

Reading field U

Reading/calculating face flux field phi

Selecting incompressible transport model Newtonian
Selecting turbulence model type laminar
Selecting laminar stress model Stokes
No MRF models present

No finite volume options present
Constructing face velocity Uf

Courant Number mean: 0 max: 0

Starting time loop

---[preciceAdapter] Loaded the OpenFOAM-preCICE adapter v1.0.0.
---[preciceAdapter] Reading preciceDict...
---[precice]  This is preCICE version 2.3.0
---[precice]  Revision info: v2.3.0
---[precice]  Configuration: Release (Debug and Trace log unavailable)
---[precice]  Configuring preCICE with configuration "../precice-config.xml"
---[precice]  I am participant "Fluid"
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[2]PETSC ERROR: to get more information on the crash.
[2]PETSC ERROR: User provided function() line 0 in  unknown file  
---[precice]  Connecting Master to 3 Slaves
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[3]PETSC ERROR: to get more information on the crash.
[3]PETSC ERROR: User provided function() line 0 in  unknown file  
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
---[precice]  Setting up master communication to coupling partner/s
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[1]PETSC ERROR: to get more information on the crash.
[1]PETSC ERROR: User provided function() line 0 in  unknown file  
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
[2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[2]PETSC ERROR: to get more information on the crash.
[2]PETSC ERROR: User provided function() line 0 in  unknown file  
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: User provided function() line 0 in  unknown file  
[andres-linux:24219] 4 more processes have sent help message help-mpi-api.txt / mpi-abort

My system and installation details are:

  • Ubuntu 20.04 Focal, fresh install (though the same happens on 18.04 Bionic).
  • preCICE 2.3.0 installed from .deb package (same issue with 2.2.1).
  • OpenFOAM v7, installed from .deb package.
  • preCICE adapter for OpenFOAM v7, successfully compiled with ./Allwmake.
  • mpirun --version output:
mpirun (Open MPI) 4.0.3

Report bugs to http://www.open-mpi.org/community/help/
  • sudo update-alternatives --list mpi output:
/usr/bin/mpicc.openmpi
  • PETSc 3.12 (installed by preCICE’s .deb).

How should I proceed?

Best regards,
Andrés

Hi @Andres

Thanks for reporting this issue; this does indeed look like a problem. I can reproduce it locally with OpenFOAM v2012. I opened an issue and will keep you posted: OpenFOAM tutorial fails in parallel · Issue #241 · precice/tutorials · GitHub

Best regards,
David


OK, so the issue has been identified and the fix is on its way. If you urgently need to run this tutorial right now, you can either merge Fix 2D check for parallel cases by DavidSCN · Pull Request #203 · precice/openfoam-adapter · GitHub into your local branch, or change the partitioning of the OpenFOAM case so that every rank owns a portion of the coupling interface (see the sketch below). If you have a bit more time, you can also wait until the update is published in the release section.
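For the partitioning route, here is a minimal sketch of a system/decomposeParDict. It assumes four ranks and that the coupling interface (the flap surface in this tutorial) spans the y-direction, so cutting only along y gives every rank a piece of the interface; the n values are illustrative and need to be adapted to your geometry and rank count:

    FoamFile
    {
        version     2.0;
        format      ascii;
        class       dictionary;
        object      decomposeParDict;
    }

    numberOfSubdomains 4;

    // 'simple' gives explicit control over where the cuts go, unlike
    // 'scotch', which may leave some ranks without any interface faces.
    method          simple;

    simpleCoeffs
    {
        n           (1 4 1);   // cut along y only: every slice touches the flap
        delta       0.001;
    }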

