qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmenta


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
Date: Thu, 10 Jan 2019 17:06:16 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Fei Li <address@hidden> writes:

> 在 2019/1/10 下午5:20, Markus Armbruster 写道:
>> fei <address@hidden> writes:
>>
>>>> 在 2019年1月9日,23:24,Markus Armbruster <address@hidden> 写道:
>>>>
>>>> Fei Li <address@hidden> writes:
>>>>
>>>>>> 在 2019/1/9 上午1:29, Markus Armbruster 写道:
>>>>>> fei <address@hidden> writes:
>>>>>>
>>>>>>>> 在 2019年1月8日,01:55,Markus Armbruster <address@hidden> 写道:
>>>>>>>>
>>>>>>>> Fei Li <address@hidden> writes:
>>>>>>>>
>>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>>
>>>>>>>>> Cc: Stefan Weil <address@hidden>
>>>>>>>>> Signed-off-by: Fei Li <address@hidden>
>>>>>>>>> Reviewed-by: Fam Zheng <address@hidden>
>>>>>>>>> ---
>>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>>      int err;
>>>>>>>>>      void *ret;
>>>>>>>>>
>>>>>>>>> +    if (!thread->thread) {
>>>>>>>>> +        return NULL;
>>>>>>>>> +    }
>>>>>>>> How can this happen?
>>>>>>> I think I have answered this earlier, please check the following link 
>>>>>>> to see whether it helps:
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>>>> explanation.  You also wrote there "I will remove this patch in next
>>>>>> version"; looks like you've since changed your mind.
>>>>> Emm, issues left over from history.. The background is I was hurry to
>>>>> make those five
>>>>> Reviewed-by patches be merged, including this v9 16/16 patch but not
>>>>> the real
>>>>> qemu_thread_create() modification. But actually this patch is to fix
>>>>> the segmentation
>>>>> fault after we modified qemu_thread_create() related functions
>>>>> although it has got a
>>>>> Reviewed-by earlier. :) Thus to not make troube, I wrote the
>>>>> "remove..." sentence
>>>>> to separate it from those 5 Reviewed-by patches, and were plan to send
>>>>> only four patches.
>>>>> But later I got a message that these five patches are not that urgent
>>>>> to catch qemu v3.1,
>>>>> thus I joined the earlier 5 R-b patches into the later v8 & v9 to have
>>>>> a better review.
>>>>>
>>>>> Sorry for the trouble, I need to explain it without involving too much
>>>>> background..
>>>>>
>>>>> Back at the farm: in our current qemu code, some cleanups use a loop
>>>>> to join()
>>>>> the total number of threads if caller fails. This is not a problem
>>>>> until applying the
>>>>> qemu_thread_create() modification. E.g. when compress_threads_save_setup()
>>>>> fails while trying to create the last do_data_compress thread,
>>>>> segmentation fault
>>>>> will occur when join() is called (sadly there's not enough condition
>>>>> to filter this
>>>>> unsuccessful created thread) as this thread is actually not be created.
>>>>>
>>>>> Hope the above makes it clear. :)
>>>> Alright, let's have a look at compress_threads_save_setup():
>>>>
>>>>     static int compress_threads_save_setup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression()) {
>>>>             return 0;
>>>>         }
>>>>         thread_count = migrate_compress_threads();
>>>>         compress_threads = g_new0(QemuThread, thread_count);
>>>>         comp_param = g_new0(CompressParam, thread_count);
>>>>         qemu_cond_init(&comp_done_cond);
>>>>         qemu_mutex_init(&comp_done_lock);
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>>             if (!comp_param[i].originbuf) {
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             if (deflateInit(&comp_param[i].stream,
>>>>                             migrate_compress_level()) != Z_OK) {
>>>>                 g_free(comp_param[i].originbuf);
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             /* comp_param[i].file is just used as a dummy buffer to save 
>>>> data,
>>>>              * set its ops to empty.
>>>>              */
>>>>             comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>>             comp_param[i].done = true;
>>>>             comp_param[i].quit = false;
>>>>             qemu_mutex_init(&comp_param[i].mutex);
>>>>             qemu_cond_init(&comp_param[i].cond);
>>>>             qemu_thread_create(compress_threads + i, "compress",
>>>>                                do_data_compress, comp_param + i,
>>>>                                QEMU_THREAD_JOINABLE);
>>>>         }
>>>>         return 0;
>>>>
>>>>     exit:
>>>>         compress_threads_save_cleanup();
>>>>         return -1;
>>>>     }
>>>>
>>>> At label exit, we have @i threads, all fully initialized.  That's an
>>>> invariant.
>>>>
>>>> compress_threads_save_cleanup() finds the threads to clean up by
>>>> checking comp_param[i].file:
>>>>
>>>>     static void compress_threads_save_cleanup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression() || !comp_param) {
>>>>             return;
>>>>         }
>>>>
>>>>         thread_count = migrate_compress_threads();
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             /*
>>>>              * we use it as a indicator which shows if the thread is
>>>>              * properly init'd or not
>>>>              */
>>>> --->        if (!comp_param[i].file) {
>>>> --->            break;
>>>> --->        }
>>>>
>>>>             qemu_mutex_lock(&comp_param[i].mutex);
>>>>             comp_param[i].quit = true;
>>>>             qemu_cond_signal(&comp_param[i].cond);
>>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>>
>>>>             qemu_thread_join(compress_threads + i);
>>>>             qemu_mutex_destroy(&comp_param[i].mutex);
>>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>>             deflateEnd(&comp_param[i].stream);
>>>>             g_free(comp_param[i].originbuf);
>>>>             qemu_fclose(comp_param[i].file);
>>>>             comp_param[i].file = NULL;
>>>>         }
>>>>         qemu_mutex_destroy(&comp_done_lock);
>>>>         qemu_cond_destroy(&comp_done_cond);
>>>>         g_free(compress_threads);
>>>>         g_free(comp_param);
>>>>         compress_threads = NULL;
>>>>         comp_param = NULL;
>>>>     }
>>>>
>>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>>> *any* cleanup.
>>>>
>>>> To maintain the invariant, compress_threads_save_setup() carefully
>>>> cleans up any partial initializations itself before a goto exit.  Since
>>>> the code is arranged smartly, the only such cleanup is the
>>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>>
>>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>>> initializations.  Breaks the invariant.
>>>>
>>>> I see two sane solutions:
>>>>
>>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>>    initializations itself.  compress_threads_save_cleanup() copes only
>>>>    with fully initialized comp_param[i].  This is how things work before
>>>>    your series.
>>>>
>>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>>    comp_param[i], i.e. does the right thing for each goto exit in
>>>>    compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>>>    clean up partial initializations.
>>>>
>>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>>
>>>> 3. A confusing mix of the two.
>>>>
>>>> Don't.
>>> Thanks for the detail analysis! :)
>>> Emm.. Actually I have thought to do the cleanup in the setup() function for 
>>> the third ‘goto exit’ [1],  which is a partial initialization.
>>> But due to the below [1] is too long and seems not neat (I notice that most 
>>> cleanups for each thread are in the xxx_cleanup() function), I turned to 
>>> modify the join() function..
>>> Is the long [1] acceptable when the third ‘goto exit’ is called, or is 
>>> there any other better way to do the cleanup?
>>>
>>> [1]
>>> qemu_mutex_lock(&comp_param[i].mutex);
>>>             comp_param[i].quit = true;
>>>             qemu_cond_signal(&comp_param[i].cond);
>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>
>>> qemu_mutex_destroy(&comp_param[i].mutex);
>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>             deflateEnd(&comp_param[i].stream);
>>>             g_free(comp_param[i].originbuf);
>>>             qemu_fclose(comp_param[i].file);
>>>             comp_param[i].file = NULL;
>> Have you considered creating the thread earlier, e.g. right after
>> initializing compression with deflateInit()?
> I am afraid we can not do this, as the members of comp_param[i], like
> file/done/quit/mutex/cond
> will be used later in the new created thread: do_data_[de]compress via
> qemu_thread_create().

You're right.

> Thus it seems we have to accept the above long [1] if we do want to
> clean up partial initialization
> in xxx_setup(). :(
>
> BTW, there is no other argument can be used except the
> "(compress_threads+i)->thread" to
> differentiate whether should we join() the thread, just in case we
> want to change the
> xxx_cleanup() function.

We can try to make compress_threads_save_cleanup() cope with partially
initialized comp_param[i].  Let's have a look at its members:

    bool done;                          // no cleanup
    bool quit;                          // see [2]
    bool zero_page;                     // no cleanup
    QEMUFile *file;                     // qemu_fclose() if non-null
    QemuMutex mutex;                    // see [1]
    QemuCond cond;                      // see [1]
    RAMBlock *block;                    // no cleanup (must be null)
    ram_addr_t offset;                  // no cleanup

    /* internally used fields */
    z_stream stream;                    // see [3]
    uint8_t *originbuf;                 // unconditional g_free()

[1]: we could do something like

    if (comp_param[i].mutex.initialized) {
        qemu_mutex_destroy(&comp_param[i].mutex);
    }
    if (comp_param[i].cond.initialized) {
        qemu_cond_destroy(&comp_param[i].cond);
    }

but that would be unclean.  Instead, I'd initialize these guys first, so
we can clean them up unconditionally.

[2] This is used to make the thread terminate.  Must be done before we
call qemu_thread_join().  I think it can safely be done always, as long
as long as .mutex and .cond are initialized.  Trivial if we initialize
them first.

[3]: I can't see a squeaky clean way to detect whether .stream has been
initialized with deflateInit().  Here's a slightly unclean way:
deflateInit() sets .stream.msg to null on success, and to non-null on
failure.  We can make it non-null until we're ready to call
deflateInit(), then have compress_threads_save_cleanup() clean up
.stream when it's null.  If that's too unclean for your or your
reviewers' taste, add a boolean @stream_initialized flag.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]