qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] fence: introduce a file-based self-fence mechanism


From: Felipe Franciosi
Subject: Re: [PATCH] fence: introduce a file-based self-fence mechanism
Date: Thu, 5 Dec 2019 14:22:29 +0000

Heya,

> On Nov 26, 2019, at 12:18 PM, Marc-André Lureau <address@hidden> wrote:
> 
> On Mon, Nov 25, 2019 at 8:14 PM Felipe Franciosi <address@hidden> wrote:
>> 
>> This introduces a self-fence mechanism to Qemu, causing it to die if a
>> heartbeat condition is not met. Currently, a file-based heartbeat is
>> available and can be configured as follows:
>> 
>> -object file-fence,id=ff0,file=/foo,qtimeout=20,ktimeout=25,signal=kill
>> 
>> Qemu will watch 'file' for attribute changes. Touching the file works as
>> a heartbeat. This parameter is mandatory.
>> 
>> Fencing happens after 'qtimeout' or 'ktimeout' seconds elapse without a
>> heartbeat. At least one of these must be specified. Both may be used.
>> 
>> When using 'qtimeout', an internal Qemu timer is used. Fencing with this
>> method gives Qemu a chance to write a log message indicating which file
>> caused the event. If Qemu's main loop is hung for whatever reason, this
>> method won't successfully kill Qemu.
>> 
>> When using 'ktimeout', a kernel timer is used. In this case, 'signal'
>> can be 'kill' (for SIGKILL, default) or 'quit' (for SIGQUIT). Using
>> SIGQUIT may be preferred for obtaining core dumps. If Qemu is hung
>> (eg. uninterruptable sleep), this method won't successfully kill Qemu.
>> 
>> It is worth noting that even successfully killing Qemu may not be
>> sufficient to completely fence a VM as certain operations like network
>> packets or block commands may be pending in the kernel. If that is a
>> concern, systems should consider using further fencing mechanisms like
>> hardware watchdogs either in addition or in conjunction with this for
>> additional protection.
>> 
>> Signed-off-by: Felipe Franciosi <address@hidden>
>> ---
>> Based-on: <address@hidden>
>> 
>> Makefile.objs       |   1 +
>> fence/Makefile.objs |   1 +
>> fence/file_fence.c  | 381 ++++++++++++++++++++++++++++++++++++++++++++
> 
> I think it could be under backends/

I thought about it and couldn't make up my mind. My decision was based on:
- Doesn't really feel like a "backend".
- I envision other types of self-fencing heartbeats (eg. network-based),
  in which case this would probably be split in a "common" file.

Arguably other objects in backends/ also fall within these categories,
so I'm happy to move if you think it's better. Let me know.


> And a slight preference for - seperated words in filenames over qemu codebase.

Sure, will change.

> 
>> qemu-options.hx     |  27 +++-
>> 4 files changed, 409 insertions(+), 1 deletion(-)
>> create mode 100644 fence/Makefile.objs
>> create mode 100644 fence/file_fence.c
>> 
>> diff --git a/Makefile.objs b/Makefile.objs
>> index 11ba1a36bd..998eed4796 100644
>> --- a/Makefile.objs
>> +++ b/Makefile.objs
>> @@ -75,6 +75,7 @@ common-obj-$(CONFIG_TPM) += tpm.o
>> 
>> common-obj-y += backends/
>> common-obj-y += chardev/
>> +common-obj-y += fence/
>> 
>> common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
>> qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
>> diff --git a/fence/Makefile.objs b/fence/Makefile.objs
>> new file mode 100644
>> index 0000000000..2ed2092568
>> --- /dev/null
>> +++ b/fence/Makefile.objs
>> @@ -0,0 +1 @@
>> +common-obj-y += file_fence.o
>> diff --git a/fence/file_fence.c b/fence/file_fence.c
>> new file mode 100644
>> index 0000000000..5b743e69d2
>> --- /dev/null
>> +++ b/fence/file_fence.c
>> @@ -0,0 +1,381 @@
>> +/*
>> + * QEMU file-based self-fence mechanism
>> + *
>> + * Copyright (c) 2019 Nutanix Inc. All rights reserved.
>> + *
>> + * Authors:
>> + *    Felipe Franciosi <address@hidden>
>> + *
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2 of the License, or (at your option) any later version.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, see 
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.gnu.org_licenses_&d=DwIFaQ&c=s883GpUCOChKOHiocYtGcg&r=CCrJKVC5zGot8PrnI-iYe00MdX4pgdQfMRigp14Ptmk&m=edzXQ5P0GCmmzZ4-20uvsmgqIreV3BKYC8JYikgHVH4&s=q3GMprakVoRJw8zkbbjXPqAbExv93DzJ-HgI2z00408&e=
>>  >.
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qom/object_interfaces.h"
>> +#include "qemu/filemonitor.h"
>> +#include "qemu/timer.h"
>> +
>> +#include <time.h>
>> +
>> +#define TYPE_FILE_FENCE "file-fence"
>> +
>> +typedef struct FileFence {
>> +    Object parent_obj;
>> +
>> +    gchar *dir;
>> +    gchar *file;
>> +    uint32_t qtimeout;
>> +    uint32_t ktimeout;
>> +    int signal;
>> +
>> +    timer_t ktimer;
>> +    QEMUTimer *qtimer;
>> +
>> +    QFileMonitor *fm;
>> +    uint64_t id;
>> +} FileFence;
>> +
>> +#define FILE_FENCE(obj) \
>> +    OBJECT_CHECK(FileFence, (obj), TYPE_FILE_FENCE)
>> +
>> +static void
>> +timer_update(FileFence *ff)
>> +{
>> +    struct itimerspec its = {
>> +        .it_value.tv_sec = ff->ktimeout,
>> +    };
>> +    int err;
>> +
>> +    if (ff->qtimeout) {
>> +        timer_mod(ff->qtimer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) +
>> +                              ff->qtimeout * 1000);
>> +    }
>> +
>> +    if (ff->ktimeout) {
>> +        err = timer_settime(ff->ktimer, 0, &its, NULL);
>> +        g_assert(err == 0);
>> +    }
>> +}
>> +
>> +static void
>> +file_fence_abort_cb(void *opaque)
>> +{
>> +    FileFence *ff = opaque;
>> +    printf("Fencing after %u seconds on '%s'\n",
>> +           ff->qtimeout, g_strconcat(ff->dir, "/", ff->file, NULL));
> 
> May be error_printf() instead.

Makes sense.

> 
>> +    abort();
>> +}
>> +
>> +static void
>> +file_fence_watch_cb(int64_t id, QFileMonitorEvent ev, const char *file,
>> +                    void *opaque)
>> +{
>> +    FileFence *ff = opaque;
>> +
>> +    if (ev != QFILE_MONITOR_EVENT_ATTRIBUTES) {
>> +        return;
>> +    }
>> +
>> +    if (g_strcmp0(file, ff->file) != 0) {
> 
> Does it actually happen? Apparently the code in
> util/filemonitor-inotify.c already checks for g_str_equal(filename)

You are right, it does. I think I saw it, but for some reason decided
to be over protective. I will remove this check.

> 
>> +        return;
>> +    }
>> +
>> +    timer_update(ff);
>> +}
>> +
>> +static void
>> +ktimer_tear(FileFence *ff)
>> +{
>> +    int err;
>> +
>> +    if (ff->ktimer) {
>> +        err = timer_delete(ff->ktimer);
>> +        g_assert(err == 0);
>> +        ff->ktimer = NULL;
>> +    }
>> +}
>> +
>> +static void
>> +ktimer_setup(FileFence *ff, Error **errp)
>> +{
>> +    int err;
>> +
>> +    struct sigevent sev = {
>> +        .sigev_notify = SIGEV_SIGNAL,
>> +        .sigev_signo = ff->signal ? ff->signal : SIGKILL,
>> +    };
>> +
>> +    if (ff->ktimeout == 0) {
>> +        return;
>> +    }
>> +
>> +    err = timer_create(CLOCK_MONOTONIC, &sev, &ff->ktimer);
>> +    if (err == -1) {
>> +        error_setg(errp, "Error creating kernel timer: %m");
>> +        return;
>> +    }
>> +}
>> +
>> +static void
>> +qtimer_tear(FileFence *ff)
>> +{
>> +    if (ff->qtimer) {
>> +        timer_del(ff->qtimer);
>> +        timer_free(ff->qtimer);
>> +    }
>> +    ff->qtimer = NULL;
>> +}
>> +
>> +static void
>> +qtimer_setup(FileFence *ff, Error **errp)
>> +{
>> +    QEMUTimer *qtimer;
>> +
>> +    if (ff->qtimeout == 0) {
>> +        return;
>> +    }
>> +
>> +    qtimer = timer_new_ms(QEMU_CLOCK_REALTIME, file_fence_abort_cb, ff);
>> +    if (qtimer == NULL) {
>> +        error_setg(errp, "Error creating Qemu timer");
>> +        return;
>> +    }
>> +
>> +    ff->qtimer = qtimer;
>> +}
>> +
>> +static void
>> +watch_tear(FileFence *ff)
>> +{
>> +    if (ff->fm) {
>> +        qemu_file_monitor_remove_watch(ff->fm, ff->dir, ff->id);
>> +        qemu_file_monitor_free(ff->fm);
>> +        ff->fm = NULL;
>> +        ff->id = 0;
>> +    }
>> +}
>> +
>> +static void
>> +watch_setup(FileFence *ff, Error **errp)
>> +{
>> +    QFileMonitor *fm;
>> +    int64_t id;
>> +
>> +    fm = qemu_file_monitor_new(errp);
>> +    if (!fm) {
>> +        return;
>> +    }
>> +
>> +    id = qemu_file_monitor_add_watch(fm, ff->dir, ff->file,
>> +                                     file_fence_watch_cb, ff, errp);
>> +    if (id < 0) {
>> +        qemu_file_monitor_free(fm);
>> +        return;
>> +    }
>> +
>> +    ff->fm = fm;
>> +    ff->id = id;
>> +}
>> +
>> +static void
>> +file_fence_complete(UserCreatable *obj, Error **errp)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +
>> +    if (ff->dir == NULL) {
>> +        error_setg(errp, "A 'file' must be set");
>> +        return;
>> +    }
>> +
>> +    if (ff->signal != 0 && ff->ktimeout == 0) {
>> +        error_setg(errp, "Using 'signal' requires 'ktimeout' to be set");
>> +        return;
>> +    }
>> +
>> +    if (ff->ktimeout == 0 && ff->qtimeout == 0) {
>> +        error_setg(errp, "One or both of 'ktimeout' or 'qtimeout' must be 
>> set");
>> +        return;
>> +    }
>> +
>> +    if (ff->qtimeout >= ff->ktimeout) {
>> +        error_setg(errp, "Using 'qtimeout' >= 'ktimeout' doesn't make 
>> sense");
>> +        return;
>> +    }
>> +
>> +    watch_setup(ff, errp);
>> +    if (*errp != NULL) {
>> +        return;
>> +    }
>> +
>> +    qtimer_setup(ff, errp);
>> +    if (*errp != NULL) {
>> +        goto err_watch;
>> +    }
>> +
>> +    ktimer_setup(ff, errp);
>> +    if (*errp != NULL) {
>> +        goto err_qtimer;
>> +    }
> 
> 
> I would rather return success on the setup functions and write:
> 
>  if (!watch_setup(ff, errp) ||
>      !qtimer_setup(...) ||
>      !...) {
>      return; (no cleanup)
>   }
> 
> Object creation will fail, and finalize function will be called.
> 
>> +
>> +    timer_update(ff);
>> +
>> +    return;
>> +
>> +err_qtimer:
>> +    qtimer_tear(ff);
>> +err_watch:
>> +    watch_tear(ff);
>> +}
>> +
>> +static void
>> +file_fence_set_signal(Object *obj, const char *value, Error **errp)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +    gchar *gsig;
>> +
>> +    if (ff->signal) {
>> +        error_setg(errp, "Signal property already set");
>> +        return;
>> +    }
>> +
>> +    gsig = g_ascii_strup(value, -1);
>> +
>> +    if (g_strcmp0(gsig, "QUIT") == 0) {
>> +        ff->signal = SIGQUIT;
>> +    } else
>> +    if (g_strcmp0(gsig, "KILL") == 0) {
>> +        ff->signal = SIGKILL;
>> +    } else {
>> +        error_setg(errp, "Invalid signal. Must be 'quit' or 'kill'");
>> +    }
> 
> Or bail out early for NULL case and use g_ascii_strcasecmp()

Sounds better. Let me look into it.

> 
>> +
>> +    g_free(gsig);
>> +}
>> +
>> +static char *
>> +file_fence_get_signal(Object *obj, Error **errp)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +
>> +    switch (ff->signal) {
>> +    case SIGKILL:
>> +        return g_strdup("kill");
>> +    case SIGQUIT:
>> +        return g_strdup("quit");
>> +    }
>> +
>> +    /* Unreachable */
>> +    abort();
>> +}
>> +
>> +static void
>> +file_fence_set_file(Object *obj, const char *value, Error **errp)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +    gchar *dir, *file;
> 
> g_autofree

Sweet.

> 
>> +
>> +    if (ff->dir) {
>> +        error_setg(errp, "File property already set");
>> +        return;
>> +    }
>> +
>> +    dir = g_path_get_dirname(value);
>> +    if (g_str_equal(dir, ".")) {
>> +        error_setg(errp, "Path for file-fence must be absolute");
>> +        return;
>> +    }
>> +
>> +    file = g_path_get_basename(value);
>> +    if (g_str_equal(file, ".")) {
>> +        error_setg(errp, "Path for file-fence must be a file");
>> +        g_free(dir);
>> +        return;
>> +    }
>> +
>> +    ff->dir = dir;
>> +    ff->file = file;
> 
> g_steal_pointer()

Cool! Clearly I could spend more time in the glib manual. :)

> 
>> +}
>> +
>> +static char *
>> +file_fence_get_file(Object *obj, Error **errp)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +
>> +    if (ff->file) {
>> +        return g_strconcat(ff->dir, "/", ff->file, NULL);
> 
> Or g_build_filename()

Makes sense.

> 
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static void
>> +file_fence_instance_finalize(Object *obj)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +
>> +    ktimer_tear(ff);
>> +    qtimer_tear(ff);
>> +    watch_tear(ff);
>> +
>> +    g_free(ff->file);
>> +    g_free(ff->dir);
>> +}
>> +
>> +static void
>> +file_fence_instance_init(Object *obj)
>> +{
>> +    FileFence *ff = FILE_FENCE(obj);
>> +
>> +    object_property_add_str(obj, "file",
>> +                            file_fence_get_file,
>> +                            file_fence_set_file,
>> +                            &error_abort);
>> +    object_property_add_str(obj, "signal",
>> +                            file_fence_get_signal,
>> +                            file_fence_set_signal,
>> +                            &error_abort);
>> +    object_property_add_uint32_ptr(obj, "qtimeout", &ff->qtimeout,
>> +                                   false, &error_abort);
>> +    object_property_add_uint32_ptr(obj, "ktimeout", &ff->ktimeout,
>> +                                   false, &error_abort);
> 
> Make them class properties (this will help with -object
> file-fence,help and such). (fwiw, I have pending patches to support
> class property description & default values..)

I tried to fit some of these in the class, as well as justify a split
of the file-based fencing with a more generic self-fencer right off
the bat. But it didn't make sense in the end. I envisioned scenarios
where you may have different heartbeats for one Qemu with different
timeouts. In that case, it wouldn't work as a class property, right?

> 
>> +}
>> +
>> +static void
>> +file_fence_class_init(ObjectClass *klass, void *class_data)
>> +{
>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
>> +    ucc->complete = file_fence_complete;
>> +}
>> +
>> +static const TypeInfo file_fence_info = {
>> +    .name = TYPE_FILE_FENCE,
>> +    .parent = TYPE_OBJECT,
>> +    .class_init = file_fence_class_init,
>> +    .instance_size = sizeof(FileFence),
>> +    .instance_init = file_fence_instance_init,
>> +    .instance_finalize = file_fence_instance_finalize,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_USER_CREATABLE },
>> +        { }
>> +    }
>> +};
>> +
>> +static void
>> +register_types(void)
>> +{
>> +    type_register_static(&file_fence_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 65c9473b73..995d3d6abf 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -4929,8 +4929,33 @@ CN=laptop.example.com,O=Example 
>> Home,L=London,ST=London,C=GB
>> 
>> @end table
>> 
>> -ETEXI
>> +@item -object 
>> file-fence,id=@var{id},file=@var{file},qtimeout=@var{qtimeout},ktimeout=@var{ktimeout},signal=@{signal}
>> +
>> +Self-fence Qemu if @var{file} is not modified within a given timeout.
>> +
>> +Qemu will watch @var{file} for attribute changes. Touching the file works 
>> as a
>> +heartbeat. This parameter is mandatory.
>> +
>> +Fencing happens after @var{qtimeout} or @var{ktimeout} seconds elapse
>> +without a heartbeat. At least one of these must be specified. Both may be 
>> used.
>> 
>> +When using @var{qtimeout}, an internal Qemu timer is used. Fencing with
>> +this method gives Qemu a chance to write a log message indicating which file
>> +caused the event. If Qemu's main loop is hung for whatever reason, this 
>> method
>> +won't successfully kill Qemu.
>> +
>> +When using @var{ktimeout}, a kernel timer is used. In this case, 
>> @var{signal}
>> +can be 'kill' (for SIGKILL, default) or 'quit' (for SIGQUIT). Using SIGQUIT 
>> may
>> +be preferred for obtaining core dumps. If Qemu is hung (eg. uninterruptable
>> +sleep), this method won't successfully kill Qemu.
>> +
>> +It is worth noting that even successfully killing Qemu may not be 
>> sufficient to
>> +completely fence a VM as certain operations like network packets or block
>> +commands may be pending in the kernel. If that is a concern, systems should
>> +consider using further fencing mechanisms like hardware watchdogs either in
>> +addition or in conjunction with this feature for additional protection.
>> +
>> +ETEXI
>> 
>> HXCOMM This is the last statement. Insert new options before this line!
>> STEXI
>> --
>> 2.20.1
>> 
> 
> Code looks fine to me otherwise

Let me work on a v2 next week.

F.

> 
> -- 
> Marc-André Lureau


reply via email to

[Prev in Thread] Current Thread [Next in Thread]