We had a good discussion with Satish on the mailing list about the repartitioning process and the connection to parallel data mapping in preCICE.
Satish asked (July 24, 2019):
It seems to me that mappings can occur in parallel in preCICE. Is that correct? However, I am not entirely sure how they are being worked out. I wonder if you can throw some light on that.
Let’s say you have two participants S (Solid) and F (Fluid) and that I execute both of them in parallel. Therefore, their meshes are partitioned and distributed to different ranks/processes. In the process, the mesh at the fluid-solid interface also gets partitioned and spread across different ranks. I would imagine that preCICE constructs a weights matrix at the beginning of an analysis to use it for the purpose of interpolation during every coupling iteration. And, let’s assume we are using the nearest-neighbor approach to map displacements from S to F. If there is an interface mesh node in rank/process #2 of F for which I would like to know the nearest neighbor in the interface mesh corresponding to S, how does the algorithm find it? If the nearest neighbor is also in the same process, i.e. #2, it is straightforward. What if that is not the case? How does the algorithm figure that out?
Florian replied (July 25, 2019):
You’re basically correct.
The mapping relationships are determined at the beginning of the simulation. preCICE performs a so-called repartitioning of the meshes, which is largely driven by the mapping requirements. Let’s assume the mapping is computed at participant S. Roughly, the steps are:
- The entire mesh of F is collected and broadcast to every rank/process of S. This is done via the so-called master process of each participant, which is rank 0.
- Each process of S filters the mesh according to the selected mapping method. It marks/tags each vertex of the mesh from F that it needs to perform the mapping.
- Based on this tagging, the communication links between processes of F and S are determined.
- During the simulation, as long as the mesh does not change, every required vertex is available through the established communication links.
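The steps above can be sketched in a few lines of Python. Everything here is illustrative (2-D point tuples, a brute-force nearest-neighbor search, and the function names are all assumptions), not the actual preCICE implementation:

```python
# Hypothetical sketch of the broadcast/filter repartitioning for a
# nearest-neighbor mapping. Every rank of S has received the full mesh of F
# (the broadcast step) and tags only the F vertices it needs locally.

def nearest_vertex(local_vertex, remote_mesh):
    """Index of the remote vertex closest to local_vertex (2-D points)."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(range(len(remote_mesh)),
               key=lambda i: dist2(local_vertex, remote_mesh[i]))

def repartition(full_remote_mesh, local_partitions):
    """For each rank of S (one local mesh per rank), tag the indices of the
    remote (F) vertices needed for the local mapping. These tags are the
    basis for the communication links that get established afterwards."""
    tags_per_rank = []
    for local_mesh in local_partitions:
        tagged = {nearest_vertex(v, full_remote_mesh) for v in local_mesh}
        tags_per_rank.append(tagged)
    return tags_per_rank
```

For example, with F's mesh at x = 0, 1, 2 and two S ranks owning vertices near x = 0.1 and x = 1.9, the two ranks would tag F vertices 0 and 2 respectively, and only open communication links to the F ranks owning those vertices.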
For more information, this process is also described in the thesis of Uekermann:
- Uekermann, B. (2016). Partitioned Fluid-Structure Interaction on Massively Parallel Systems. Technical University of Munich. Retrieved from https://mediatum.ub.tum.de/1320661
Satish replied (Sep 9, 2019):
I have started going through the thesis of Uekermann that you pointed me to. In Figure 5.4 on page 73, the repartitioning of meshes is represented. As you will see in that figure, step III involves a broadcast of the interface mesh of solver B from the master node of solver A to the individual slave nodes of A. So, if the interface mesh of solver B is composed of a million vertices, does the picture mean that, after the broadcast operation, each slave rank of solver A ends up receiving a copy of each of these million vertices?
Myself (Sep 9, 2019):
Exactly. These one million vertices are filtered afterwards, however, such that only the part remains that the rank actually needs for its local mapping. Also, one million vertices would be a very large case: for surface-coupled problems, it would mean one million vertices at the coupling boundary.
If you want to avoid this, there is a pre-filtering alternative implemented, which filters the mesh partitions already at the master rank and which, therefore, scales better for large meshes and a small number of ranks.
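A rough sketch of this pre-filtering idea, with a simple safety-margin bounding box standing in for the real filter criterion (all names and the criterion itself are illustrative assumptions):

```python
# Hypothetical sketch of the pre-filtering variant: the master rank of the
# receiving participant filters the remote mesh once per destination rank
# before sending, so no rank ever holds the full remote mesh.

def bounding_box(mesh, margin):
    """Axis-aligned bounding box of a 2-D mesh, enlarged by a safety margin."""
    xs = [v[0] for v in mesh]
    ys = [v[1] for v in mesh]
    return (min(xs) - margin, max(xs) + margin,
            min(ys) - margin, max(ys) + margin)

def prefilter_at_master(full_remote_mesh, local_partitions, margin=0.5):
    """Return, per destination rank, only the remote vertices that fall
    inside that rank's enlarged bounding box."""
    filtered = []
    for local_mesh in local_partitions:
        xmin, xmax, ymin, ymax = bounding_box(local_mesh, margin)
        filtered.append([v for v in full_remote_mesh
                         if xmin <= v[0] <= xmax and ymin <= v[1] <= ymax])
    return filtered
```

The trade-off this illustrates: the master rank now does O(ranks × mesh size) filtering work up front, but the slave ranks never receive vertices far outside their own partition.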
Lastly, we are currently working on a two-level initialization, which will improve the complete situation.
A followup from Satish (Sep 9, 2019):
I think I now understand both the broadcast/filter and the pre-filter/post-filter mesh repartitioning approaches. They make perfect sense. I do have a couple of follow-up questions for you in the context of the entire coupling workflow. Your responses will help me understand what exactly happens within any coupling iteration and how the communication links are leveraged at run time, i.e. during the simulation. Please see below.

Let’s assume that there are two participants F (fluid) and S (solid) and that they are being coupled via the conventional serial-implicit approach. That is, one solver solves after another in a serial fashion and each coupling time step has >1 coupling iterations.

Let’s also assume that the mappings are to occur in F. In order to enable this, assume that the necessary changes to the configuration (.xml) file are made.

Also assume that solver F is launched with 3 ranks/processes and S is launched with 4 ranks/processes. Let’s use the broadcast/filter approach to do mesh repartitioning.

In the first coupling iteration of the first coupling time step, the mesh from S is filtered in F since we asked (via the xml file) that the mappings occur in F. As a result of the filtering, each vertex in any rank of F knows which vertex (or vertices) of S it wants data from. Based on this tagging, communication links are established between individual processes of F and S.
If everything I stated above makes sense, could you please answer the questions below?

I understand that the communication links are established just once in the first coupling iteration of the first coupling step. Is that correct?

In subsequent coupling iterations of the first step, and all other steps as well, how exactly are the communication links leveraged? Are only data values exchanged through them? For example, if a given mesh vertex V1 in slave rank #2 of F wants to know what displacements to use before F can solve, does the following happen?
- From the tagging information (is this some sort of a weights matrix?), figure out which vertex/vertices of S have non-zero weights associated with them. That is, figure out which of the mesh vertices of S V1 maps to.
- Then, look for them in all the ranks of S that rank #2 of F is connected to.
- Once found, fetch the displacement data from those ranks and use the weights to arrive at the interpolated value of the displacement vector at V1.
Myself (Sep 10, 2019):

> Let’s assume that there are two participants F (fluid) and S (solid) and that they are being coupled via the conventional serial-implicit approach. That is, one solver solves after another in a serial fashion and each coupling time step has >1 coupling iterations.
>
> Let’s also assume that the mappings are to occur in F. In order to enable this, assume that the necessary changes to the configuration (.xml) file are made.
>
> Also assume that solver F is launched with 3 ranks/processes and S is launched with 4 ranks/processes. Let’s use the broadcast/filter approach to do mesh repartitioning.
>
> In the first coupling iteration of the first coupling time step, the mesh from S is filtered in F since we asked (via the xml file) that the mappings occur in F. As a result of the filtering, each vertex in any rank of F knows which vertex (or vertices) of S it wants data from. Based on this tagging, communication links are established between individual processes of F and S.
This all makes sense, with the only exception that the communication of the solid mesh, the repartitioning of the received mesh, and the establishment of the communication channels all happen in initialize(), so already before the “first coupling iteration of the first coupling time step”.
> I understand that the communication links are established just once in the first coupling iteration of the first coupling step. Is that correct?
Yes (but already in initialize()).

> In subsequent coupling iterations of the first step, and all other steps as well, how exactly are the communication links leveraged? Are only data values exchanged through them? For example, if a given mesh vertex V1 in slave rank #2 of F wants to know what displacements to use before F can solve, does the following happen?
> - From the tagging information (is this some sort of a weights matrix?), figure out which vertex/vertices of S have non-zero weights associated with them. That is, figure out which of the mesh vertices of S V1 maps to.
> - Then, look for them in all the ranks of S that rank #2 of F is connected to.
> - Once found, fetch the displacement data from those ranks and use the weights to arrive at the interpolated value of the displacement vector at V1.
Yes, only data values are currently exchanged. And yes, you could regard a data mapping method as a sort of weighting. For RBF interpolation, this weighting is, however, only implicit, as we solve the interpolation system in every time step. The communication is, however, a bit simpler. Once everything is set up, communication and data mapping are two separate things. In your case, data values are communicated via mesh S. Every rank of S sends the values of its vertices to the connected ranks of F. It could be that one vertex of rank 3 of S has copies on multiple ranks of F. Then, the value is sent to all these ranks. Afterwards, every rank of F can do the mapping. For a nearest-neighbor or a nearest-projection mapping, for example, this last step is a completely local operation. It needs no communication. Let’s say, on every rank, v_F = A * v_S with A being some sort of local weighting matrix.
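This separation between communication (vertex values travel along the established links) and mapping (a purely local v_F = A * v_S) can be sketched as follows. The data structures and function names are illustrative assumptions, not the preCICE internals:

```python
# Hypothetical sketch of one coupling iteration after setup.

def exchange(values_per_S_rank, links):
    """Communication step: links[f_rank] lists (s_rank, s_local_index) pairs
    naming the S vertices this F rank needs. Duplicated S vertices simply get
    sent to every connected F rank. Returns received values per F rank."""
    return {f: [values_per_S_rank[s][i] for (s, i) in pairs]
            for f, pairs in links.items()}

def map_locally(A_rows, v_S):
    """Mapping step: v_F = A * v_S, with A stored sparsely as one list of
    (column, weight) pairs per F vertex. No communication needed."""
    return [sum(w * v_S[j] for (j, w) in row) for row in A_rows]

# Two S ranks hold displacement values; F rank #2 is linked to one vertex
# of each. A nearest-neighbor mapping is just a single weight of 1.0 per row.
received = exchange({0: [1.0, 2.0], 1: [3.0]}, {2: [(0, 1), (1, 0)]})
v_F = map_locally([[(0, 1.0)], [(1, 1.0)]], received[2])
```

Here `received[2]` is `[2.0, 3.0]` and `v_F` is `[2.0, 3.0]`: the F rank fetches exactly the tagged remote values once per iteration and then applies its local weights without talking to anyone.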
Good to know: we are also working on dynamic meshes, meaning that in a future release of preCICE, you will also be able to change the mesh in every iteration (to allow dynamically adaptive meshes or mapping between a background mesh and an immersed mesh).
A final follow-up from Satish, with me answering (Sep 10, 2019):
> By ‘interpolation system’, do you mean v_F = A * v_S, where v_F are the values in a rank of F, A is the local weighting matrix (calculated during the initialization phase), and v_S are the values corresponding to vertices of a filtered mesh of S associated with a rank of F?
No, for RBF, things are more complicated. You have a global linear system you need to solve every time you want to map data. See Section 4.3.2 in my thesis.
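To make the contrast concrete, here is a toy 1-D RBF mapping. The Gaussian basis and the tiny hand-rolled dense solver are purely illustrative; the point is that the coefficients come from a global linear solve each time data is mapped, not from a precomputed local weight matrix:

```python
import math

def gaussian(r, shape=1.0):
    """Gaussian radial basis function."""
    return math.exp(-(shape * r) ** 2)

def solve(M, b):
    """Gaussian elimination with partial pivoting (tiny systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rbf_map(x_S, v_S, x_F):
    """Map values v_S given at 1-D points x_S to points x_F."""
    # Interpolation system: C[i][j] = phi(|x_S[i] - x_S[j]|)
    C = [[gaussian(abs(a, ) if False else abs(a - b)) for b in x_S] for a in x_S]
    coeff = solve(C, v_S)  # the global solve, repeated for every mapping
    return [sum(c * gaussian(abs(x - b)) for c, b in zip(coeff, x_S))
            for x in x_F]
```

By construction the interpolant reproduces the source values at the source points, so `rbf_map([0.0, 1.0], [1.0, 2.0], [0.0, 1.0])` returns approximately `[1.0, 2.0]`. In parallel, this system couples all ranks, which is exactly why RBF mapping is more involved than the local v_F = A * v_S picture.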
> Given that mesh S is already in F, v_F can be calculated within each rank of F and so the interpolated displacements become known to F. What about the forces now? Given that mesh S is the carrier of all data in my case, I am assuming that, in each rank of F, the interpolation of forces/pressures occurs from the mesh in F to the associated filtered mesh of S. And then, the force values on all filtered meshes of S are agglomerated and shipped back through the master node of F to the master node of S? Is this correct?
Yes, now you have force values on multiple filtered meshes of S. Agglomerated means that they are summed up on the receiving ranks of S (the only valid mapping constraint in this setting is “conservative”). The communication, however, does not go through the master, but point-to-point to the ranks of S. During each coupling iteration, preCICE performs (almost) no global operations, only during the initialization.
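A sketch of this conservative force path, under the same illustrative assumptions as before (hypothetical names; the transpose of the local weights stands in for the conservative mapping):

```python
# Hypothetical sketch of the conservative F -> S force path: each rank of F
# maps its forces onto its filtered copy of the S mesh, the contributions go
# point-to-point to S, and copies of the same S vertex are summed on the
# receiving side so the total force is conserved.

def conservative_local(A_rows, f_F, n_S_local):
    """f_S_copy = A^T * f_F on one rank of F. A_rows holds one list of
    (S_index, weight) pairs per F vertex, as in the consistent mapping."""
    f_S = [0.0] * n_S_local
    for i, row in enumerate(A_rows):
        for (j, w) in row:
            f_S[j] += w * f_F[i]
    return f_S

def accumulate_on_S(contributions):
    """contributions: one {global_S_vertex: value} dict per sending F rank;
    duplicated vertices are summed up after the point-to-point receive."""
    total = {}
    for contrib in contributions:
        for gid, val in contrib.items():
            total[gid] = total.get(gid, 0.0) + val
    return total
```

For example, two F vertices that both map to the same S vertex with weight 1.0 contribute the sum of their forces to it, and if two F ranks each hold a copy of an S vertex, their partial forces are added on the receiving side rather than overwritten.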