System Tests - what is public API vs internal detail?

@makish I have been spending time in tools/tests/systemtests (mainly systemtest.py, systemtestarguments.py and testsuite.py), and I’m trying to understand where the boundary lies between public API and internal implementation for the system tests.

From the outside, things like tests.yaml, reference_versions.yaml, and some build_args (for example python_bindings_ref and precice_ref) feel stable, while parts of systemtests itself (like config overrides, log archiving, etc.) look easier to change.

When adding features like max_time overrides or the iterations-log hashing, I’m never sure how much backward compatibility I should assume at this layer.

Do you have rough guidance on which pieces you consider a stable contract vs. things we might freely reshape between releases? I was not able to find an existing thread for this discussion, so I thought I should ask here.
Thanks