How to configure preCICE communication on Kubernetes with TCP port access control?

I am trying to replicate the Perpendicular Flap example on a k8s cloud that uses an Istio service mesh. I am running OpenFOAM with the preCICE adapter in one pod and CalculiX with the preCICE adapter in another pod in the same k8s namespace. In theory this is not too different from running the two components in separate Docker containers on a Docker network. Now here comes the problem: I am using Istio with strict mTLS, so by default all traffic should be sent through the Istio sidecars. I do not want to change this setting, as it is a basic security feature of Istio.

To get the preCICE communication to work, I currently have to do two things:

  1. Use the following m2n configuration in precice-config.xml:
<m2n:sockets from="Fluid" to="Solid" exchange-directory="../../pvc_shared" network="eth0" port="50061" enforce-gather-scatter="1" />

where enforce-gather-scatter="1" is required to restrict the communication to a single known port (50061). I understand that this limits performance, so I would prefer to remove it.
If I don't use this flag, the CalculiX participant fails with the following error:

Setting up preCICE participant Solid, using config file: config.yml
---[precice]  This is preCICE version 2.5.0
---[precice]  Revision info: no-info [git failed to run]
---[precice]  Build type: Release (without debug log)
---[precice]  Configuring preCICE with configuration "../precice-config.xml"
---[precice]  I am participant "Solid"
Using quasi 2D-3D coupling
Set ID Found 
2D-3D mapping results in a 2D mesh of size 247 from the 494 points in 3D space.
Read data 'Force' found with ID # '3'.
Write data 'Displacement' found with ID # '2'.
---[precice]  Setting up primary communication to coupling partner/s
---[precice]  Primary ranks are connected
---[precice]  Setting up preliminary secondary communication to coupling partner/s
---[precice]  Prepare partition for mesh Solid-Mesh
---[precice]  Gather mesh Solid-Mesh
---[precice]  Send global mesh Solid-Mesh
---[precice]  Setting up secondary communication to coupling partner/s
---[precice] ERROR:  Accepting a socket connection at  failed with the system error: bind: Address already in use
Segmentation fault (core dumped)

Question: Could it be that preCICE tries to open additional communication ports which get blocked by Istio?

  2. I also have to open up port 50061 in the k8s container definition and explicitly exclude this port from Istio traffic interception - see the k8s deployment definition below. Normally, I would prefer to define the ports in a k8s Service linked to the deployment and route via the service name (e.g. openFOAM:50061) instead of a network interface/IP address (eth0); a rough sketch of such a Service follows the deployment. That aside, I think that if I could control which ports preCICE opens, I could update the ports in the deployment definition below accordingly.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-deployment
  labels:
    app: python-deployment
  namespace: user
spec:
  replicas: 1
  selector:
    matchLabels:
      app: python-deployment
      version: alpha
  template:
    metadata:
      labels:
        app: python-deployment
      annotations:
        traffic.sidecar.istio.io/excludeInboundPorts: "50061" 
        traffic.sidecar.istio.io/excludeOutboundPorts: "50061"
    spec:
      serviceAccountName: python-deployment
      securityContext:
        runAsUser: 1000
      containers:
      - name: python-deployment
        image: xxxxxxxxxxx
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1000
          allowPrivilegeEscalation: false
        volumeMounts:
        - name: task-storage
          mountPath: /app/precice-xxxxxxxxxxxxx-comp/editables/pvc_shared
        ports:
        - containerPort: 50061
      initContainers:
      - name: data-permission-fix
        image: busybox
        command: ["/bin/chmod","-R","u=rwX,g=rwX,o=rwX", "/data"]
        volumeMounts:
        - name: task-storage
          mountPath: /data
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: "1Gi"
      volumes:
      - name: task-storage
        persistentVolumeClaim:
          claimName: nfs-pvc
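
For reference, the kind of Service I have in mind is sketched below (untested, names are placeholders; as far as I understand, the network attribute of m2n:sockets currently takes a network interface rather than a hostname, so this does not work yet):

apiVersion: v1
kind: Service
metadata:
  name: openfoam               # placeholder; would allow addressing the solver as openfoam:50061
  namespace: user
spec:
  selector:
    app: python-deployment     # matches the pod labels of the deployment above
  ports:
  - name: precice-m2n
    protocol: TCP
    port: 50061                # port the other participant would connect to
    targetPort: 50061          # containerPort opened for preCICE in the deployment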

Any ideas?

PS: Great to meet you all at the conference in Chania last week.
Olivia


Hi,

nice to hear from you again after Coupled 23!

Question 1: connection count in m2n

The m2n connects m parallel instances of one solver to n parallel instances of another solver. There are two strategies for this:

  1. Gather all information on the primary instance of one participant, send it over to the other participant, and finally scatter it there.
    This requires only a single connection between the participants.
  2. Establish the required point-to-point connections between the parallel instances and communicate fully in parallel. This requires one primary connection between the participants as well as multiple sub-connections for each communication channel, thus requiring up to n ports.

I suggest checking out Section 4.2 of B. Uekermann’s dissertation if you want to know more about the design decisions behind this behaviour.

When you provide a fixed port, the second method fails, as multiple separate connections are required. Hence, you need enforce-gather-scatter="1".
So, your observation is correct.
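
To make this concrete, here is a rough sketch of the two variants (pick one; untested, and if I recall correctly, omitting the port attribute lets the operating system choose an ephemeral port per connection, which you would then also have to open/exclude on the Istio side):

<!-- Variant 1: enforced gather-scatter, one fixed port (your current setup) -->
<m2n:sockets from="Fluid" to="Solid" exchange-directory="../../pvc_shared"
             network="eth0" port="50061" enforce-gather-scatter="1" />

<!-- Variant 2: full point-to-point communication; no fixed port, so each
     sub-connection binds its own OS-chosen port -->
<m2n:sockets from="Fluid" to="Solid" exchange-directory="../../pvc_shared"
             network="eth0" />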

Question 2: port range configuration

The selected ports are contained in the debug logs. To see them, you most likely need to build preCICE from source with debug logging enabled, either as a debug build or as a release build with PRECICE_RELEASE_WITH_DEBUG_LOG enabled.
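
Roughly like this when building from source (just a sketch; adjust options and paths to your setup):

# full debug build (all debug log statements available)
cmake -DCMAKE_BUILD_TYPE=Debug ..
# or: release build that keeps the debug log statements
cmake -DCMAKE_BUILD_TYPE=Release -DPRECICE_RELEASE_WITH_DEBUG_LOG=ON ..
make -j 4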

I opened an issue to support specifying port ranges. It also contains a workaround for extracting the ports from the debug logs.

Question 3: host resolution

We also talked about possible host name resolution of the address field.
I already opened an issue regarding this.

Feel invited to comment and add missing perspectives. It will help us to flesh out and prioritize features.

Best regards,
Frédéric


This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.

We now have documentation about this: