[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5 10/28] migration: add reporting of errors for
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] [PATCH v5 10/28] migration: add reporting of errors for outgoing migration |
Date: |
Fri, 18 Mar 2016 16:33:25 +0000 |
User-agent: |
Mutt/1.5.24 (2015-08-30) |
* Daniel P. Berrange (address@hidden) wrote:
> Currently if an application initiates an outgoing migration,
> it may or may not, get an error reported back on failure. If
> the error occurs synchronously to the 'migrate' command
> execution, the client app will see the error message. This
> is the case for DNS lookup failures. If the error occurs
> asynchronously to the monitor command though, the error
> will be thrown away and the client left guessing about
> what went wrong. This is the case for failure to connect
> to the TCP server (eg due to wrong port, or firewall
> rules, or other similar errors).
>
> In the future we'll be adding more scope for errors to
> happen asynchronously with the TLS protocol handshake.
> TLS errors are hard to diagnose even when they are well
> reported, so discarding errors entirely will make it
> impossible to debug TLS connection problems.
>
> Management apps which do migration are already using
> 'query-migrate' / 'info migrate' to check up on progress
> of background migration operations and to see their end
> status. This is a fine place to also include the error
> message when things go wrong.
>
> This patch thus adds an 'error-desc' field to the
> MigrationInfo struct, which will be populated when
> the 'status' is set to 'failed':
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
> (qemu) migrate -d tcp:localhost:9001
> (qemu) info migrate
> capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks:
> off compress: off events: off x-postcopy-ram: off
> Migration status: failed (Error connecting to socket: Connection refused)
> total time: 0 milliseconds
>
> In the HMP, when doing non-detached migration, it is
> also possible to display this error message directly
> to the app.
>
> (qemu) migrate tcp:localhost:9001
> Error connecting to socket: Connection refused
>
> Or with QMP
>
> {
> "execute": "query-migrate",
> "arguments": {}
> }
> {
> "return": {
> "status": "failed",
> "error-desc": "address resolution failed for myhost:9000: No address
> associated with hostname"
> }
> }
>
> Signed-off-by: Daniel P. Berrange <address@hidden>
> ---
> hmp.c | 13 ++++++++++++-
> include/migration/migration.h | 5 ++++-
> include/qapi/error.h | 2 +-
> migration/migration.c | 15 ++++++++++++---
> migration/rdma.c | 10 +++-------
> migration/tcp.c | 2 +-
> migration/unix.c | 2 +-
> qapi-schema.json | 7 ++++++-
> trace-events | 2 +-
> util/error.c | 2 +-
> 10 files changed, 42 insertions(+), 18 deletions(-)
>
> diff --git a/hmp.c b/hmp.c
> index 5b6084a..7126f17 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -34,6 +34,7 @@
> #include "ui/console.h"
> #include "block/qapi.h"
> #include "qemu-io.h"
> +#include "qemu/error-report.h"
>
> #ifdef CONFIG_SPICE
> #include <spice/enums.h>
> @@ -167,8 +168,15 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
> }
>
> if (info->has_status) {
> - monitor_printf(mon, "Migration status: %s\n",
> + monitor_printf(mon, "Migration status: %s",
> MigrationStatus_lookup[info->status]);
> + if (info->status == MIGRATION_STATUS_FAILED &&
> + info->has_error_desc) {
> + monitor_printf(mon, " (%s)\n", info->error_desc);
> + } else {
> + monitor_printf(mon, "\n");
> + }
> +
> monitor_printf(mon, "total time: %" PRIu64 " milliseconds\n",
> info->total_time);
> if (info->has_expected_downtime) {
> @@ -1532,6 +1540,9 @@ static void hmp_migrate_status_cb(void *opaque)
> if (status->is_block_migration) {
> monitor_printf(status->mon, "\n");
> }
> + if (info->has_error_desc) {
> + error_report("%s", info->error_desc);
> + }
> monitor_resume(status->mon);
> timer_del(status->timer);
> g_free(status);
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index e335380..46c1bbe 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -171,6 +171,9 @@ struct MigrationState
> QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest)
> src_page_requests;
> /* The RAMBlock used in the last src_page_request */
> RAMBlock *last_req_rb;
> +
> + /* The last error that occurred */
> + Error *error;
> };
>
> void migrate_set_state(int *state, int old_state, int new_state);
> @@ -207,7 +210,7 @@ void rdma_start_outgoing_migration(void *opaque, const
> char *host_port, Error **
>
> void rdma_start_incoming_migration(const char *host_port, Error **errp);
>
> -void migrate_fd_error(MigrationState *s);
> +void migrate_fd_error(MigrationState *s, const Error *error);
>
> void migrate_fd_connect(MigrationState *s);
>
> diff --git a/include/qapi/error.h b/include/qapi/error.h
> index 02e9dd2..c7e2869 100644
> --- a/include/qapi/error.h
> +++ b/include/qapi/error.h
> @@ -139,7 +139,7 @@ typedef enum ErrorClass {
> /*
> * Get @err's human-readable error message.
> */
> -const char *error_get_pretty(Error *err);
> +const char *error_get_pretty(const Error *err);
>
> /*
> * Get @err's error class.
> diff --git a/migration/migration.c b/migration/migration.c
> index c22c0eb..b4a63a9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -691,6 +691,10 @@ MigrationInfo *qmp_query_migrate(Error **errp)
> break;
> case MIGRATION_STATUS_FAILED:
> info->has_status = true;
> + if (s->error) {
> + info->has_error_desc = true;
> + info->error_desc = g_strdup(error_get_pretty(s->error));
> + }
> break;
> case MIGRATION_STATUS_CANCELLED:
> info->has_status = true;
> @@ -863,12 +867,15 @@ static void migrate_fd_cleanup(void *opaque)
> notifier_list_notify(&migration_state_notifiers, s);
> }
>
> -void migrate_fd_error(MigrationState *s)
> +void migrate_fd_error(MigrationState *s, const Error *error)
> {
> - trace_migrate_fd_error();
> + trace_migrate_fd_error(error ? error_get_pretty(error) : "");
> assert(s->to_dst_file == NULL);
> migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> MIGRATION_STATUS_FAILED);
> + if (!s->error) {
> + s->error = error_copy(error);
> + }
> notifier_list_notify(&migration_state_notifiers, s);
> }
>
> @@ -967,6 +974,8 @@ MigrationState *migrate_init(const MigrationParams
> *params)
> s->postcopy_after_devices = false;
> s->migration_thread_running = false;
> s->last_req_rb = NULL;
> + error_free(s->error);
> + s->error = NULL;
>
> migrate_set_state(&s->state, MIGRATION_STATUS_NONE,
> MIGRATION_STATUS_SETUP);
>
> @@ -1067,7 +1076,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool
> blk,
> }
>
> if (local_err) {
> - migrate_fd_error(s);
> + migrate_fd_error(s, local_err);
> error_propagate(errp, local_err);
> return;
> }
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 187fc1c..cd33d90 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -3487,16 +3487,14 @@ void rdma_start_outgoing_migration(void *opaque,
> const char *host_port, Error **errp)
> {
> MigrationState *s = opaque;
> - Error *local_err = NULL, **temp = &local_err;
> - RDMAContext *rdma = qemu_rdma_data_init(host_port, &local_err);
> + RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
> int ret = 0;
>
> if (rdma == NULL) {
> - ERROR(temp, "Failed to initialize RDMA data structures! %d", ret);
> goto err;
> }
>
> - ret = qemu_rdma_source_init(rdma, &local_err,
> + ret = qemu_rdma_source_init(rdma, errp,
> s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL]);
>
> if (ret) {
> @@ -3504,7 +3502,7 @@ void rdma_start_outgoing_migration(void *opaque,
> }
>
> trace_rdma_start_outgoing_migration_after_rdma_source_init();
> - ret = qemu_rdma_connect(rdma, &local_err);
> + ret = qemu_rdma_connect(rdma, errp);
>
> if (ret) {
> goto err;
> @@ -3516,7 +3514,5 @@ void rdma_start_outgoing_migration(void *opaque,
> migrate_fd_connect(s);
> return;
> err:
> - error_propagate(errp, local_err);
> g_free(rdma);
> - migrate_fd_error(s);
> }
> diff --git a/migration/tcp.c b/migration/tcp.c
> index e1fa7f8..d0e0db9 100644
> --- a/migration/tcp.c
> +++ b/migration/tcp.c
> @@ -40,7 +40,7 @@ static void tcp_wait_for_connect(int fd, Error *err, void
> *opaque)
> if (fd < 0) {
> DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
> s->to_dst_file = NULL;
> - migrate_fd_error(s);
> + migrate_fd_error(s, err);
> } else {
> DPRINTF("migrate connect success\n");
> s->to_dst_file = qemu_fopen_socket(fd, "wb");
> diff --git a/migration/unix.c b/migration/unix.c
> index d9aac36..b3537fd 100644
> --- a/migration/unix.c
> +++ b/migration/unix.c
> @@ -40,7 +40,7 @@ static void unix_wait_for_connect(int fd, Error *err, void
> *opaque)
> if (fd < 0) {
> DPRINTF("migrate connect error: %s\n", error_get_pretty(err));
> s->to_dst_file = NULL;
> - migrate_fd_error(s);
> + migrate_fd_error(s, err);
> } else {
> DPRINTF("migrate connect success\n");
> s->to_dst_file = qemu_fopen_socket(fd, "wb");
> diff --git a/qapi-schema.json b/qapi-schema.json
> index f253a37..a3fe948 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -484,6 +484,10 @@
> # throttled during auto-converge. This is only present when
> auto-converge
> # has started throttling guest cpus. (Since 2.5)
> #
> +# @error-desc: #optional the human readable error description string, when
> +# @status is 'failed'. Clients should not attempt to parse the
> +# error strings. (Since 2.6)
> +#
> # Since: 0.14.0
> ##
> { 'struct': 'MigrationInfo',
> @@ -494,7 +498,8 @@
> '*expected-downtime': 'int',
> '*downtime': 'int',
> '*setup-time': 'int',
> - '*x-cpu-throttle-percentage': 'int'} }
> + '*x-cpu-throttle-percentage': 'int',
> + '*error-desc': 'str'} }
>
> ##
> # @query-migrate
> diff --git a/trace-events b/trace-events
> index d494de1..8a29e94 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1483,7 +1483,7 @@ await_return_path_close_on_source_close(void) ""
> await_return_path_close_on_source_joining(void) ""
> migrate_set_state(int new_state) "new state %d"
> migrate_fd_cleanup(void) ""
> -migrate_fd_error(void) ""
> +migrate_fd_error(const char *error_desc) "error=%s"
> migrate_fd_cancel(void) ""
> migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len)
> "in %s at %zx len %zx"
> migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t
> nonpost) "pending size %" PRIu64 " max %" PRIu64 " (post=%" PRIu64 "
> nonpost=%" PRIu64 ")"
> diff --git a/util/error.c b/util/error.c
> index 47f93af..16132ec 100644
> --- a/util/error.c
> +++ b/util/error.c
> @@ -216,7 +216,7 @@ ErrorClass error_get_class(const Error *err)
> return err->err_class;
> }
>
> -const char *error_get_pretty(Error *err)
> +const char *error_get_pretty(const Error *err)
> {
> return err->msg;
> }
> --
> 2.5.0
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
- [Qemu-devel] [PATCH v5 01/28] s390: use FILE instead of QEMUFile for creating text file, (continued)
- [Qemu-devel] [PATCH v5 01/28] s390: use FILE instead of QEMUFile for creating text file, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 03/28] migration: remove use of qemu_bufopen from vmstate tests, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 02/28] io: avoid double-free when closing QIOChannelBuffer, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 05/28] migration: split migration hooks out of QEMUFileOps, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 08/28] migration: introduce a new QEMUFile impl based on QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 06/28] migration: introduce set_blocking function in QEMUFileOps, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 07/28] migration: force QEMUFile to blocking mode for outgoing migration, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 09/28] migration: add helpers for creating QEMUFile from a QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 10/28] migration: add reporting of errors for outgoing migration, Daniel P. Berrange, 2016/03/18
- Re: [Qemu-devel] [PATCH v5 10/28] migration: add reporting of errors for outgoing migration,
Dr. David Alan Gilbert <=
- [Qemu-devel] [PATCH v5 12/28] migration: convert unix socket protocol to use QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 11/28] migration: convert post-copy to use QIOChannelBuffer, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 13/28] migration: rename unix.c to socket.c, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 15/28] migration: convert fd socket protocol to use QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 16/28] migration: convert exec socket protocol to use QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 14/28] migration: convert tcp socket protocol to use QIOChannel, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 18/28] migration: convert savevm to use QIOChannel for writing to files, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 17/28] migration: convert RDMA to use QIOChannel interface, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 19/28] migration: delete QEMUFile buffer implementation, Daniel P. Berrange, 2016/03/18
- [Qemu-devel] [PATCH v5 21/28] migration: delete QEMUFile sockets implementation, Daniel P. Berrange, 2016/03/18