Issues during "ctest" after compiling and installing preCICE with Intel MPI

Dear preCICE community,

I have compiled and installed preCICE 3.2.0 on the head node of our university cluster (Rocky Linux 8) from source (I don’t want to use Spack). The following dependencies were installed:

  • Boost 1.85
  • Eigen 3.4.0
  • PETSc 3.23.2 with Intel MPI (Version 2021.12 Build 20240410)

GCC 13.2.0 is set as the compiler (the Intel Compiler 2024 combined with the distribution’s default GCC 8.5 does not support C++17).

intel-oneapi-mpi 2021.12.1 (Version 2021.12 Build 20240410) is set for MPI.

The compilation and installation of preCICE 3.2.0 works, and ‘make test_install’ also passes.
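For completeness, the build itself was done with plain CMake, roughly like this (the install prefix and job count are just examples from my setup):

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=g++ -DCMAKE_INSTALL_PREFIX=$HOME/software/precice-3.2.0 ..
make -j 8
make install
make test_install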

However, the ‘ctest’ test suite is generating a lot of the following MPI init errors:

Abort(1090319) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Unknown error class
MPIR_Init_thread(192): 
MPID_Init(1538)......: 
MPIR_pmi_init(169)...: PMI2_Job_GetId returned 14

This kind of error occurs for example with
testprecice --run_test=XML/AttributeConcatenation --log_level=message

The error disappears when I add “mpirun” or “mpiexec” in front of the test. It seems that Intel MPI Version 2021.12 Build 20240410 needs its launcher “mpirun” / “mpiexec” to correctly initialize its MPI environment.

When I look into the CTest log (here is a part of it: LastTest_cut.log (529.8 KB)), I see that all “parallel MPI” tests configured to use “mpiexec” passed. However, some (serial?) tests do not invoke “mpiexec” in front of “testprecice”, and these fail with the “Fatal error in PMPI_Init”. When I add “mpiexec -np 1” or “mpirun -np 1”, they pass.
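For example, the single test from above passes when it is launched explicitly through the Intel MPI launcher (same command as before, just prefixed):

mpiexec -np 1 testprecice --run_test=XML/AttributeConcatenation --log_level=message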

On another cluster, preCICE is compiled with an older Intel MPI (Version 2021.6 Build 20220227) and this issue does not occur: I can start the “serial” tests without the launcher.

Is this a typical behaviour of the “new” Intel MPI? Is there a way to solve this issue in CTest? (It is difficult to see the real problems in the CTest log.)
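As a side note, this is how I am currently narrowing down the failures; these are standard CTest options, nothing preCICE-specific, and <name-regex> is a placeholder for the name of a failing test:

ctest --output-on-failure        # print output only for failing tests
ctest --rerun-failed             # rerun only the tests that failed in the last run
ctest -R <name-regex> -V         # rerun the matching tests with verbose output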

Regards,
Guillaume

Hi @gdenayer,

I am not sure whether this is new behavior of Intel MPI, but it might be that the tests have so far been relying on an assumption that happened to hold.

Just to clarify the severity of the issue: do your parallel preCICE-based simulations run fine, despite the failing tests?

Hi,
we are testing Intel MPI and the Intel compiler in the CI via the oneAPI packages to avoid exactly these kinds of problems.

Compiler

-- The CXX compiler identification is IntelLLVM 2025.0.4

MPI Version

-- MPI Version: Intel(R) MPI Library 2021.14 for Linux* OS

Your problematic version 2021.12 is in between the two working ones: 2021.6 (your other cluster) and 2021.14 (the current CI). So either there was a temporary problem with that release, or there is some other issue.

Do you have additional environment variables set that influence Intel MPI? These could be an issue.
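For example, you could compare which MPI-related variables the modules/environment set on both clusters. Most Intel MPI runtime settings use the I_MPI_ prefix, and the libfabric provider variables (FI_ prefix) can also matter:

env | grep -E '^I_MPI_'   # Intel MPI runtime settings
env | grep -E '^FI_'      # libfabric provider settings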

Best
Frédéric

Hi @Makis,

I have not tested my FSI cases with preCICE on the university cluster yet; I was installing preCICE on that cluster to start with them. But before starting coupled FSI simulations, I wanted to check that the preCICE installation is OK with the ctest suite :wink:

Best
Guillaume

Hi @fsimonis ,

unfortunately, Intel MPI is only available through Spack on our university cluster:
spack info intel-oneapi-mpi:
2021.12.1
2021.12.0
2021.11.0
2021.10.0
2021.9.0
2021.8.0
2021.7.1
2021.7.0
2021.6.0
2021.5.1
2021.5.0
2021.4.0
2021.3.0
2021.2.0
2021.1.1

As suggested by the admin of our university cluster, I will try the different versions of Intel MPI between 2021.6.0 and 2021.12.1 to investigate whether the problem appears in only one version or in several, or whether it is an issue with one of the Spack configuration options.
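The plan is roughly the following, with the version numbers taken from the list above (the exact Spack spec and compiler on our cluster may differ):

spack install intel-oneapi-mpi@2021.11.0
spack load intel-oneapi-mpi@2021.11.0
# then rebuild PETSc and preCICE against this MPI and rerun ctest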

Best
Guillaume

Hi,

it seems that the MPI init error that I get when starting “testprecice” without the MPI launcher comes from the environment and not from the MPI version:

  • on our university cluster, the variable “I_MPI_PMI_LIBRARY” is set to “/usr/lib64/libpmi2.so” by default.
  • on a second cluster, the variable “I_MPI_PMI_LIBRARY” is not set, and I don’t get this MPI init error.
  • when I unset the variable “I_MPI_PMI_LIBRARY” on our university cluster, I can start “testprecice” without the MPI launcher and don’t get the MPI init error (see also the commands after the test output below):
This test suite runs on rank 0 of 1
Running 1 test case...
Setup up logging
Test context of XML/AttributeConcatenation represents "Unnamed" and runs on rank 0 out of 1.
Test case XML/AttributeConcatenation did not check any assertions

*** No errors detected
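So the workaround on my side is to make sure this variable does not point to the cluster’s PMI2 library (most likely the Slurm one) when starting the tests directly, e.g.:

echo $I_MPI_PMI_LIBRARY   # /usr/lib64/libpmi2.so on our cluster
unset I_MPI_PMI_LIBRARY
testprecice --run_test=XML/AttributeConcatenation --log_level=message   # passes without a launcher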

Regards
Guillaume

