Hi,
first of all: this is a very interesting topic, thank you for bringing it up!
I do sympathise with your idea and I don't think it is an unusual use-case at all. I also use MuseScore, but mostly for rehearsing: I let MuseScore play some voices of a piece while I play another voice on top. I usually choose sounds with a very short attack phase in MuseScore, so that the attack phases don't mess with the timing. I have used the OnTime offset shifting of MuseScore 2 from time to time, but as MuseScore 3 has crippled the user-interface for that feature (you can only affect a single note at a time), it is now way too much hassle for me.
So I agree: it would be great to have a good solution to this problem.
Let's assume that SoundFonts were extended to contain information about the length of the attack phase and the position of the first "meaningful" sample point, i.e. the sample point that should be "on beat". Let's ignore the fact that there are MIDI messages that affect the sample offset. And let's also ignore that the choice of which sample is heard can depend on oscillators, and that a single note-on could start many different samples (with different attack phases) simultaneously.
Then the synth would have to start playing the sample *before* the beat, in order to play it on beat. In other words, the sampler would have to act on a note-on event before this note-on event is actually due. And in order to decide which sample to play and how much attack phase time needs to be compensated, it would have to examine all CC and other messages that could influence sample choice and sample offsets leading up to that note-on event. And assuming we don't want to put an artificial limit on how long an attack phase of a sample is allowed to be, that effectively means that the synth would need to know and analyse (or even simulate) the complete MIDI stream right until the last event before starting the playback.
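To put some numbers on that, here is a rough sketch in Python. The attack lengths, the sample names and both function names are made up for illustration; nothing like this exists in SoundFonts or FluidSynth today. It only shows that the synth would need to know about each note-on at least as far in advance as the longest attack phase it might have to compensate.

```python
from __future__ import annotations


def start_time_ms(beat_ms: int, attack_ms: int) -> int:
    """Time at which sample playback must begin so that the
    'on beat' sample point lands exactly on beat_ms."""
    return beat_ms - attack_ms


def required_lookahead_ms(attack_by_sample: dict[str, int]) -> int:
    """Minimum advance notice the synth needs for every note-on:
    the longest attack phase among all samples it might choose."""
    return max(attack_by_sample.values(), default=0)


# Hypothetical attack lengths in milliseconds (assumed values).
attacks = {"piano": 5, "slow_strings": 180}

# A string note due at 1000 ms has to be started at 820 ms.
print(start_time_ms(1000, attacks["slow_strings"]))

# So every note-on must be known 180 ms before it is due.
print(required_lookahead_ms(attacks))
```

And this is the simplified case: it assumes the sample choice is already known, which is exactly what the CC and other messages leading up to the note-on can change.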
That is obviously impossible for live playback, as you say yourself. But it also doesn't fit how synthesizers like FluidSynth are used by MuseScore, Ardour and similar programs, because as far as I know, those programs are MIDI sequencers themselves. In other words: *they* control which MIDI messages to send and - most importantly - when to send them. They don't pass a complete MIDI file to FluidSynth to be played, but rather send MIDI messages in bursts or as a continuous stream. And they have very good reasons to do it that way.
So in my opinion, if we wanted to implement a system like you propose, it would have to be implemented in the MIDI sequencer. In other words: in MuseScore, Ardour and all the other programs that use MIDI events to control synthesizers (which also includes FluidSynth's internal sequencer used to play MIDI files).
So maybe all that is needed is better sequencer support for shifting OnTime offsets for notes, tracks and scores. MuseScore is definitely lacking in that regard: it needs better user-interfaces for selecting multiple notes and affecting their OnTime offset. Maybe even support for some database of popular SoundFonts that lists the OnTime offset for each note of each sample. MuseScore could then read that database and adjust all notes in a track automatically, if the user decides that it would make musical sense.
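Just to illustrate what I mean by adjusting notes on the sequencer side, here is a minimal sketch in Python. It is not how MuseScore or any other program does it: the `NoteOn` type, the `compensate` function and the offset table keyed by (program, key) are all assumptions of mine, and the table values are invented. It also only moves note-ons; a real implementation would have to decide what to do with note-offs and with notes at the very start of the score.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NoteOn:
    time_ms: int   # scheduled time of the note-on
    program: int   # MIDI program (instrument) number
    key: int       # MIDI key number


def compensate(notes, attack_ms_table):
    """Return the notes with each note-on moved earlier by the attack
    length listed for its (program, key). Notes without a table entry
    are left alone; times are clamped so they never go below zero."""
    shifted = []
    for n in notes:
        attack = attack_ms_table.get((n.program, n.key), 0)
        shifted.append(replace(n, time_ms=max(0, n.time_ms - attack)))
    # Shifting can change the event order, so sort again by time.
    return sorted(shifted, key=lambda n: n.time_ms)


# Hypothetical database entry: program 48, key 60 has a 120 ms attack.
table = {(48, 60): 120}
track = [NoteOn(0, 48, 60), NoteOn(500, 48, 60), NoteOn(500, 0, 64)]

for note in compensate(track, table):
    print(note)
```

The point is that the sequencer already has the whole track in front of it, so it can do this shift before sending anything to the synth.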
Cheers
Marcus