From: manish.mishra
Subject: Re: [PATCH] migration: check magic value for deciding the mapping of channels
Date: Thu, 3 Nov 2022 22:04:54 +0530
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.4.1

> On Thu, Nov 03, 2022 at 02:50:25PM +0530, manish.mishra wrote:
>> On 01/11/22 9:15 pm, Daniel P. Berrangé wrote:
>>> On Tue, Nov 01, 2022 at 09:10:14PM +0530, manish.mishra wrote:
>>>> On 01/11/22 8:21 pm, Daniel P. Berrangé wrote:
>>>>> On Tue, Nov 01, 2022 at 02:30:29PM +0000, manish.mishra wrote:
>>>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>>>> index 739bb683f3..f4b6f278a9 100644
>>>>>> --- a/migration/migration.c
>>>>>> +++ b/migration/migration.c
>>>>>> @@ -733,31 +733,40 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>>>>>>  {
>>>>>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>>>>>      Error *local_err = NULL;
>>>>>> -    bool start_migration;
>>>>>>      QEMUFile *f;
>>>>>> +    bool default_channel = true;
>>>>>> +    uint32_t channel_magic = 0;
>>>>>> +    int ret = 0;
>>>>>>
>>>>>> -    if (!mis->from_src_file) {
>>>>>> -        /* The first connection (multifd may have multiple) */
>>>>>> +    if (migrate_use_multifd() && !migration_in_postcopy()) {
>>>>>> +        ret = qio_channel_read_peek_all(ioc, (void *)&channel_magic,
>>>>>> +                                        sizeof(channel_magic), &local_err);
>>>>>> +
>>>>>> +        if (ret != 1) {
>>>>>> +            error_propagate(errp, local_err);
>>>>>> +            return;
>>>>>> +        }
>>>>>
>>>>> ....and thus this will fail for TLS channels AFAICT.
>>>>
>>>> Yes, thanks for quick review Daniel. You pointed this earlier too,
>>>> sorry missed it, will put another check !migrate_use_tls() in V2.
>>>
>>> But we need this problem fixed with TLS too, so just excluding it
>>> isn't right. IMHO we need to modify the migration code so we can
>>> read the magic earlier, instead of peeking.
>>>
>>> With regards,
>>> Daniel
>>
>> Hi Daniel, I was trying tls migrations. What i see is that tls
>> session creation does handshake. So if we read ahead in
>> ioc_process_incoming for default channel. Because client sends magic
>> only after multiFD channels are setup, which too requires tls
>> handshake.
>
> By the time we get to migrate_ioc_process_incoming, the TLS handshake
> has already been performed.
>
>   migration_channel_process_incoming
>     -> migration_ioc_process_incoming
>
> vs
>
>   migration_channel_process_incoming
>     -> migration_tls_channel_process_incoming
>       -> migration_tls_incoming_handshake
>         -> migration_channel_process_incoming
>           -> migration_ioc_process_incoming

Yes, sorry, I thought we block on the source side until the handshake
is done, but that is not true. I then checked why the deadlock is
happening. This is where it happens:

static int ram_save_setup(QEMUFile *f, void *opaque)
{
+
+
    ram_control_before_iterate(f, RAM_CONTROL_SETUP);
    ram_control_after_iterate(f, RAM_CONTROL_SETUP);

    ret = multifd_send_sync_main(f);
    if (ret < 0) {
        return ret;
    }

    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
    qemu_fflush(f);

    return 0;
}

Now suppose we block in migration_ioc_process_incoming() to read the
magic value from the channel. That value is actually sent by the
client (source) only when this qemu_fflush() is done. Before this
qemu_fflush() we wait in multifd_send_sync_main(), which requires that
the TLS handshake is done for the multiFD channels: it blocks on
sem_sync, which is posted by multifd_send_thread, which is called only
after the handshake.

But on the destination side we are blocked in
migration_ioc_process_incoming(), waiting to read something from the
default channel, hence the handshake for the multiFD channels cannot
happen. This to me looks unresolvable, whatever way we try to
manipulate the stream, until we make some changes on the source side.
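
To make the circular wait concrete, here is a toy model in C. It is
only a sketch of the ordering described above, not QEMU code: the
event names and migration_setup_completes() are made up for
illustration, and it assumes the destination cannot service the
multiFD handshakes while it is blocked reading the magic.

```c
/* Toy model of the setup ordering; hypothetical names, not QEMU code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum event {
    DST_MULTIFD_HANDSHAKE,  /* destination services multifd TLS handshakes */
    SRC_MULTIFD_THREADS_UP, /* multifd_send_thread runs and posts sem_sync */
    SRC_SYNC_MAIN_DONE,     /* multifd_send_sync_main() returns */
    SRC_FLUSH_DEFAULT,      /* qemu_fflush() pushes data on default channel */
    DST_READ_MAGIC,         /* destination reads magic from default channel */
    NUM_EVENTS
};

#define NO_DEP NUM_EVENTS   /* marker: event has no prerequisite */

/*
 * Each event waits for at most one other event. Returns true if every
 * event can eventually happen, false if the waits form a cycle.
 */
bool migration_setup_completes(bool dst_blocks_on_magic)
{
    enum event dep[NUM_EVENTS];
    bool done[NUM_EVENTS] = { false };
    bool progress = true;
    size_t completed = 0;

    /* Assumption: a blocking read of the magic stalls multifd accepts. */
    dep[DST_MULTIFD_HANDSHAKE] = dst_blocks_on_magic ? DST_READ_MAGIC : NO_DEP;
    dep[SRC_MULTIFD_THREADS_UP] = DST_MULTIFD_HANDSHAKE;
    dep[SRC_SYNC_MAIN_DONE] = SRC_MULTIFD_THREADS_UP;
    dep[SRC_FLUSH_DEFAULT] = SRC_SYNC_MAIN_DONE;
    dep[DST_READ_MAGIC] = SRC_FLUSH_DEFAULT;

    /* Keep completing events until a full pass makes no progress. */
    while (progress) {
        progress = false;
        for (int e = 0; e < NUM_EVENTS; e++) {
            assert(dep[e] <= NO_DEP);
            if (!done[e] && (dep[e] == NO_DEP || done[dep[e]])) {
                done[e] = true;
                completed++;
                progress = true;
            }
        }
    }
    return completed == NUM_EVENTS;
}
```

With dst_blocks_on_magic set to true, no event can ever complete,
because every wait leads back to DST_READ_MAGIC. With it set to false,
all five events complete in order.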
With regards, Daniel