[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] fence: introduce a file-based self-fence mechanism
From: |
Felipe Franciosi |
Subject: |
Re: [PATCH] fence: introduce a file-based self-fence mechanism |
Date: |
Thu, 5 Dec 2019 14:22:29 +0000 |
Heya,
> On Nov 26, 2019, at 12:18 PM, Marc-André Lureau <address@hidden> wrote:
>
> On Mon, Nov 25, 2019 at 8:14 PM Felipe Franciosi <address@hidden> wrote:
>>
>> This introduces a self-fence mechanism to Qemu, causing it to die if a
>> heartbeat condition is not met. Currently, a file-based heartbeat is
>> available and can be configured as follows:
>>
>> -object file-fence,id=ff0,file=/foo,qtimeout=20,ktimeout=25,signal=kill
>>
>> Qemu will watch 'file' for attribute changes. Touching the file works as
>> a heartbeat. This parameter is mandatory.
>>
>> Fencing happens after 'qtimeout' or 'ktimeout' seconds elapse without a
>> heartbeat. At least one of these must be specified. Both may be used.
>>
>> When using 'qtimeout', an internal Qemu timer is used. Fencing with this
>> method gives Qemu a chance to write a log message indicating which file
>> caused the event. If Qemu's main loop is hung for whatever reason, this
>> method won't successfully kill Qemu.
>>
>> When using 'ktimeout', a kernel timer is used. In this case, 'signal'
>> can be 'kill' (for SIGKILL, default) or 'quit' (for SIGQUIT). Using
>> SIGQUIT may be preferred for obtaining core dumps. If Qemu is hung
>> (eg. uninterruptable sleep), this method won't successfully kill Qemu.
>>
>> It is worth noting that even successfully killing Qemu may not be
>> sufficient to completely fence a VM as certain operations like network
>> packets or block commands may be pending in the kernel. If that is a
>> concern, systems should consider using further fencing mechanisms like
>> hardware watchdogs either in addition or in conjunction with this for
>> additional protection.
>>
>> Signed-off-by: Felipe Franciosi <address@hidden>
>> ---
>> Based-on: <address@hidden>
>>
>> Makefile.objs | 1 +
>> fence/Makefile.objs | 1 +
>> fence/file_fence.c | 381 ++++++++++++++++++++++++++++++++++++++++++++
>
> I think it could be under backends/
I thought about it and couldn't make up my mind. My decision was based on:
- Doesn't really feel like a "backend".
- I envision other types of self-fencing heartbeats (eg. network-based),
in which case this would probably be split in a "common" file.
Arguably other objects in backends/ also fall within these categories,
so I'm happy to move if you think it's better. Let me know.
> And a slight preference for - seperated words in filenames over qemu codebase.
Sure, will change.
>
>> qemu-options.hx | 27 +++-
>> 4 files changed, 409 insertions(+), 1 deletion(-)
>> create mode 100644 fence/Makefile.objs
>> create mode 100644 fence/file_fence.c
>>
>> diff --git a/Makefile.objs b/Makefile.objs
>> index 11ba1a36bd..998eed4796 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -75,6 +75,7 @@ common-obj-$(CONFIG_TPM) += tpm.o
>>
>> common-obj-y += backends/
>> common-obj-y += chardev/
>> +common-obj-y += fence/
>>
>> common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
>> qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
>> diff --git a/fence/Makefile.objs b/fence/Makefile.objs
>> new file mode 100644
>> index 0000000000..2ed2092568
>> --- /dev/null
>> +++ b/fence/Makefile.objs
>> @@ -0,0 +1 @@
>> +common-obj-y += file_fence.o
>> diff --git a/fence/file_fence.c b/fence/file_fence.c
>> new file mode 100644
>> index 0000000000..5b743e69d2
>> --- /dev/null
>> +++ b/fence/file_fence.c
>> @@ -0,0 +1,381 @@
>> +/*
>> + * QEMU file-based self-fence mechanism
>> + *
>> + * Copyright (c) 2019 Nutanix Inc. All rights reserved.
>> + *
>> + * Authors:
>> + * Felipe Franciosi <address@hidden>
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2 of the License, or (at your option) any later version.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, see
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.gnu.org_licenses_&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=CCrJKVC5zGot8PrnI-iYe00MdX4pgdQfMRigp14Ptmk&m=edzXQ5P0GCmmzZ4-20uvsmgqIreV3BKYC8JYikgHVH4&s=q3GMprakVoRJw8zkbbjXPqAbExv93DzJ-HgI2z00408&e=
>> >.
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qom/object_interfaces.h"
>> +#include "qemu/filemonitor.h"
>> +#include "qemu/timer.h"
>> +
>> +#include <time.h>
>> +
>> +#define TYPE_FILE_FENCE "file-fence"
>> +
>> +typedef struct FileFence {
>> + Object parent_obj;
>> +
>> + gchar *dir;
>> + gchar *file;
>> + uint32_t qtimeout;
>> + uint32_t ktimeout;
>> + int signal;
>> +
>> + timer_t ktimer;
>> + QEMUTimer *qtimer;
>> +
>> + QFileMonitor *fm;
>> + uint64_t id;
>> +} FileFence;
>> +
>> +#define FILE_FENCE(obj) \
>> + OBJECT_CHECK(FileFence, (obj), TYPE_FILE_FENCE)
>> +
>> +static void
>> +timer_update(FileFence *ff)
>> +{
>> + struct itimerspec its = {
>> + .it_value.tv_sec = ff->ktimeout,
>> + };
>> + int err;
>> +
>> + if (ff->qtimeout) {
>> + timer_mod(ff->qtimer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> + ff->qtimeout * 1000);
>> + }
>> +
>> + if (ff->ktimeout) {
>> + err = timer_settime(ff->ktimer, 0, &its, NULL);
>> + g_assert(err == 0);
>> + }
>> +}
>> +
>> +static void
>> +file_fence_abort_cb(void *opaque)
>> +{
>> + FileFence *ff = opaque;
>> + printf("Fencing after %u seconds on '%s'\n",
>> + ff->qtimeout, g_strconcat(ff->dir, "/", ff->file, NULL));
>
> May be error_printf() instead.
Makes sense.
>
>> + abort();
>> +}
>> +
>> +static void
>> +file_fence_watch_cb(int64_t id, QFileMonitorEvent ev, const char *file,
>> + void *opaque)
>> +{
>> + FileFence *ff = opaque;
>> +
>> + if (ev != QFILE_MONITOR_EVENT_ATTRIBUTES) {
>> + return;
>> + }
>> +
>> + if (g_strcmp0(file, ff->file) != 0) {
>
> Does it actually happen? Apparently the code in
> util/filemonitor-inotify.c already checks for g_str_equal(filename)
You are right, it does. I think I saw it, but for some reason decided
to be over protective. I will remove this check.
>
>> + return;
>> + }
>> +
>> + timer_update(ff);
>> +}
>> +
>> +static void
>> +ktimer_tear(FileFence *ff)
>> +{
>> + int err;
>> +
>> + if (ff->ktimer) {
>> + err = timer_delete(ff->ktimer);
>> + g_assert(err == 0);
>> + ff->ktimer = NULL;
>> + }
>> +}
>> +
>> +static void
>> +ktimer_setup(FileFence *ff, Error **errp)
>> +{
>> + int err;
>> +
>> + struct sigevent sev = {
>> + .sigev_notify = SIGEV_SIGNAL,
>> + .sigev_signo = ff->signal ? ff->signal : SIGKILL,
>> + };
>> +
>> + if (ff->ktimeout == 0) {
>> + return;
>> + }
>> +
>> + err = timer_create(CLOCK_MONOTONIC, &sev, &ff->ktimer);
>> + if (err == -1) {
>> + error_setg(errp, "Error creating kernel timer: %m");
>> + return;
>> + }
>> +}
>> +
>> +static void
>> +qtimer_tear(FileFence *ff)
>> +{
>> + if (ff->qtimer) {
>> + timer_del(ff->qtimer);
>> + timer_free(ff->qtimer);
>> + }
>> + ff->qtimer = NULL;
>> +}
>> +
>> +static void
>> +qtimer_setup(FileFence *ff, Error **errp)
>> +{
>> + QEMUTimer *qtimer;
>> +
>> + if (ff->qtimeout == 0) {
>> + return;
>> + }
>> +
>> + qtimer = timer_new_ms(QEMU_CLOCK_REALTIME, file_fence_abort_cb, ff);
>> + if (qtimer == NULL) {
>> + error_setg(errp, "Error creating Qemu timer");
>> + return;
>> + }
>> +
>> + ff->qtimer = qtimer;
>> +}
>> +
>> +static void
>> +watch_tear(FileFence *ff)
>> +{
>> + if (ff->fm) {
>> + qemu_file_monitor_remove_watch(ff->fm, ff->dir, ff->id);
>> + qemu_file_monitor_free(ff->fm);
>> + ff->fm = NULL;
>> + ff->id = 0;
>> + }
>> +}
>> +
>> +static void
>> +watch_setup(FileFence *ff, Error **errp)
>> +{
>> + QFileMonitor *fm;
>> + int64_t id;
>> +
>> + fm = qemu_file_monitor_new(errp);
>> + if (!fm) {
>> + return;
>> + }
>> +
>> + id = qemu_file_monitor_add_watch(fm, ff->dir, ff->file,
>> + file_fence_watch_cb, ff, errp);
>> + if (id < 0) {
>> + qemu_file_monitor_free(fm);
>> + return;
>> + }
>> +
>> + ff->fm = fm;
>> + ff->id = id;
>> +}
>> +
>> +static void
>> +file_fence_complete(UserCreatable *obj, Error **errp)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> +
>> + if (ff->dir == NULL) {
>> + error_setg(errp, "A 'file' must be set");
>> + return;
>> + }
>> +
>> + if (ff->signal != 0 && ff->ktimeout == 0) {
>> + error_setg(errp, "Using 'signal' requires 'ktimeout' to be set");
>> + return;
>> + }
>> +
>> + if (ff->ktimeout == 0 && ff->qtimeout == 0) {
>> + error_setg(errp, "One or both of 'ktimeout' or 'qtimeout' must be
>> set");
>> + return;
>> + }
>> +
>> + if (ff->qtimeout >= ff->ktimeout) {
>> + error_setg(errp, "Using 'qtimeout' >= 'ktimeout' doesn't make
>> sense");
>> + return;
>> + }
>> +
>> + watch_setup(ff, errp);
>> + if (*errp != NULL) {
>> + return;
>> + }
>> +
>> + qtimer_setup(ff, errp);
>> + if (*errp != NULL) {
>> + goto err_watch;
>> + }
>> +
>> + ktimer_setup(ff, errp);
>> + if (*errp != NULL) {
>> + goto err_qtimer;
>> + }
>
>
> I would rather return success on the setup functions and write:
>
> if (!watch_setup(ff, errp) ||
> !qtimer_setup(...) ||
> !...) {
> return; (no cleanup)
> }
>
> Object creation will fail, and finalize function will be called.
>
>> +
>> + timer_update(ff);
>> +
>> + return;
>> +
>> +err_qtimer:
>> + qtimer_tear(ff);
>> +err_watch:
>> + watch_tear(ff);
>> +}
>> +
>> +static void
>> +file_fence_set_signal(Object *obj, const char *value, Error **errp)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> + gchar *gsig;
>> +
>> + if (ff->signal) {
>> + error_setg(errp, "Signal property already set");
>> + return;
>> + }
>> +
>> + gsig = g_ascii_strup(value, -1);
>> +
>> + if (g_strcmp0(gsig, "QUIT") == 0) {
>> + ff->signal = SIGQUIT;
>> + } else
>> + if (g_strcmp0(gsig, "KILL") == 0) {
>> + ff->signal = SIGKILL;
>> + } else {
>> + error_setg(errp, "Invalid signal. Must be 'quit' or 'kill'");
>> + }
>
> Or bail out early for NULL case and use g_ascii_strcasecmp()
Sounds better. Let me look into it.
>
>> +
>> + g_free(gsig);
>> +}
>> +
>> +static char *
>> +file_fence_get_signal(Object *obj, Error **errp)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> +
>> + switch (ff->signal) {
>> + case SIGKILL:
>> + return g_strdup("kill");
>> + case SIGQUIT:
>> + return g_strdup("quit");
>> + }
>> +
>> + /* Unreachable */
>> + abort();
>> +}
>> +
>> +static void
>> +file_fence_set_file(Object *obj, const char *value, Error **errp)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> + gchar *dir, *file;
>
> g_autofree
Sweet.
>
>> +
>> + if (ff->dir) {
>> + error_setg(errp, "File property already set");
>> + return;
>> + }
>> +
>> + dir = g_path_get_dirname(value);
>> + if (g_str_equal(dir, ".")) {
>> + error_setg(errp, "Path for file-fence must be absolute");
>> + return;
>> + }
>> +
>> + file = g_path_get_basename(value);
>> + if (g_str_equal(file, ".")) {
>> + error_setg(errp, "Path for file-fence must be a file");
>> + g_free(dir);
>> + return;
>> + }
>> +
>> + ff->dir = dir;
>> + ff->file = file;
>
> g_steal_pointer()
Cool! Clearly I could spend more time in the glib manual. :)
>
>> +}
>> +
>> +static char *
>> +file_fence_get_file(Object *obj, Error **errp)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> +
>> + if (ff->file) {
>> + return g_strconcat(ff->dir, "/", ff->file, NULL);
>
> Or g_build_filename()
Makes sense.
>
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static void
>> +file_fence_instance_finalize(Object *obj)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> +
>> + ktimer_tear(ff);
>> + qtimer_tear(ff);
>> + watch_tear(ff);
>> +
>> + g_free(ff->file);
>> + g_free(ff->dir);
>> +}
>> +
>> +static void
>> +file_fence_instance_init(Object *obj)
>> +{
>> + FileFence *ff = FILE_FENCE(obj);
>> +
>> + object_property_add_str(obj, "file",
>> + file_fence_get_file,
>> + file_fence_set_file,
>> + &error_abort);
>> + object_property_add_str(obj, "signal",
>> + file_fence_get_signal,
>> + file_fence_set_signal,
>> + &error_abort);
>> + object_property_add_uint32_ptr(obj, "qtimeout", &ff->qtimeout,
>> + false, &error_abort);
>> + object_property_add_uint32_ptr(obj, "ktimeout", &ff->ktimeout,
>> + false, &error_abort);
>
> Make them class properties (this will help with -object
> file-fence,help and such). (fwiw, I have pending patches to support
> class property description & default values..)
I tried to fit some of these in the class, as well as justify a split
of the file-based fencing with a more generic self-fencer right off
the bat. But it didn't make sense in the end. I envisioned scenarios
where you may have different heartbeats for one Qemu with different
timeouts. In that case, it wouldn't work as a class property, right?
>
>> +}
>> +
>> +static void
>> +file_fence_class_init(ObjectClass *klass, void *class_data)
>> +{
>> + UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
>> + ucc->complete = file_fence_complete;
>> +}
>> +
>> +static const TypeInfo file_fence_info = {
>> + .name = TYPE_FILE_FENCE,
>> + .parent = TYPE_OBJECT,
>> + .class_init = file_fence_class_init,
>> + .instance_size = sizeof(FileFence),
>> + .instance_init = file_fence_instance_init,
>> + .instance_finalize = file_fence_instance_finalize,
>> + .interfaces = (InterfaceInfo[]) {
>> + { TYPE_USER_CREATABLE },
>> + { }
>> + }
>> +};
>> +
>> +static void
>> +register_types(void)
>> +{
>> + type_register_static(&file_fence_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 65c9473b73..995d3d6abf 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -4929,8 +4929,33 @@ CN=laptop.example.com,O=Example
>> Home,L=London,ST=London,C=GB
>>
>> @end table
>>
>> -ETEXI
>> +@item -object
>> file-fence,id=@var{id},file=@var{file},qtimeout=@var{qtimeout},ktimeout=@var{ktimeout},signal=@{signal}
>> +
>> +Self-fence Qemu if @var{file} is not modified within a given timeout.
>> +
>> +Qemu will watch @var{file} for attribute changes. Touching the file works
>> as a
>> +heartbeat. This parameter is mandatory.
>> +
>> +Fencing happens after @var{qtimeout} or @var{ktimeout} seconds elapse
>> +without a heartbeat. At least one of these must be specified. Both may be
>> used.
>>
>> +When using @var{qtimeout}, an internal Qemu timer is used. Fencing with
>> +this method gives Qemu a chance to write a log message indicating which file
>> +caused the event. If Qemu's main loop is hung for whatever reason, this
>> method
>> +won't successfully kill Qemu.
>> +
>> +When using @var{ktimeout}, a kernel timer is used. In this case,
>> @var{signal}
>> +can be 'kill' (for SIGKILL, default) or 'quit' (for SIGQUIT). Using SIGQUIT
>> may
>> +be preferred for obtaining core dumps. If Qemu is hung (eg. uninterruptable
>> +sleep), this method won't successfully kill Qemu.
>> +
>> +It is worth noting that even successfully killing Qemu may not be
>> sufficient to
>> +completely fence a VM as certain operations like network packets or block
>> +commands may be pending in the kernel. If that is a concern, systems should
>> +consider using further fencing mechanisms like hardware watchdogs either in
>> +addition or in conjunction with this feature for additional protection.
>> +
>> +ETEXI
>>
>> HXCOMM This is the last statement. Insert new options before this line!
>> STEXI
>> --
>> 2.20.1
>>
>
> Code looks fine to me otherwise
Let me work on a v2 next week.
F.
>
> --
> Marc-André Lureau
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [PATCH] fence: introduce a file-based self-fence mechanism,
Felipe Franciosi <=