From: Markus Armbruster <armbru@redhat.com>
To: Fei Li <lifei1214@126.com>
Cc: Stefan Weil <sw@weilnetz.de>,
	qemu-devel@nongnu.org, shirley17fei@gmail.com
Subject: Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
Date: Thu, 10 Jan 2019 17:06:16 +0100
Message-ID: <87tvigfqfb.fsf@dusky.pond.sub.org>
In-Reply-To: <016ae7bc-6271-59a7-36a1-c2b8a210020f@126.com> (Fei Li's message of "Thu, 10 Jan 2019 21:24:54 +0800")

Fei Li <lifei1214@126.com> writes:

> On 2019/1/10 5:20 PM, Markus Armbruster wrote:
>> fei <lifei1214@126.com> writes:
>>
>>>> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> Fei Li <lifei1214@126.com> writes:
>>>>
>>>>>> On 2019/1/9 1:29 AM, Markus Armbruster wrote:
>>>>>> fei <lifei1214@126.com> writes:
>>>>>>
>>>>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Fei Li <fli@suse.com> writes:
>>>>>>>>
>>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>>
>>>>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>>>>> ---
>>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>>      int err;
>>>>>>>>>      void *ret;
>>>>>>>>>
>>>>>>>>> +    if (!thread->thread) {
>>>>>>>>> +        return NULL;
>>>>>>>>> +    }
>>>>>>>> How can this happen?
>>>>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>>>> explanation.  You also wrote there "I will remove this patch in next
>>>>>> version"; looks like you've since changed your mind.
>>>>> Emm, this is a leftover from the series' history. The background is
>>>>> that I was in a hurry to get those five Reviewed-by patches merged,
>>>>> including this v9 16/16 patch but not the real qemu_thread_create()
>>>>> modification. But this patch actually fixes the segmentation fault
>>>>> that appears after we modify the qemu_thread_create() related
>>>>> functions, even though it got a Reviewed-by earlier. :) Thus, to
>>>>> avoid trouble, I wrote the "remove..." sentence to separate it from
>>>>> those five Reviewed-by patches, and was planning to send only four
>>>>> patches. But later I got a message that these five patches are not
>>>>> urgent enough to need to catch qemu v3.1, so I merged the earlier
>>>>> five R-b patches into the later v8 & v9 to get a better review.
>>>>>
>>>>> Sorry for the trouble; I should explain it without involving too
>>>>> much background.
>>>>>
>>>>> Back at the farm: in our current qemu code, some cleanups use a loop
>>>>> to join() the total number of threads if the caller fails. This is
>>>>> not a problem until the qemu_thread_create() modification is applied.
>>>>> E.g. when compress_threads_save_setup() fails while trying to create
>>>>> the last do_data_compress thread, a segmentation fault will occur
>>>>> when join() is called on it, as this thread was never actually
>>>>> created (sadly, there is no existing condition that can filter out
>>>>> this unsuccessfully created thread).
>>>>>
>>>>> Hope the above makes it clear. :)
>>>> Alright, let's have a look at compress_threads_save_setup():
>>>>
>>>>     static int compress_threads_save_setup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression()) {
>>>>             return 0;
>>>>         }
>>>>         thread_count = migrate_compress_threads();
>>>>         compress_threads = g_new0(QemuThread, thread_count);
>>>>         comp_param = g_new0(CompressParam, thread_count);
>>>>         qemu_cond_init(&comp_done_cond);
>>>>         qemu_mutex_init(&comp_done_lock);
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>>             if (!comp_param[i].originbuf) {
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             if (deflateInit(&comp_param[i].stream,
>>>>                             migrate_compress_level()) != Z_OK) {
>>>>                 g_free(comp_param[i].originbuf);
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             /* comp_param[i].file is just used as a dummy buffer to save data,
>>>>              * set its ops to empty.
>>>>              */
>>>>             comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>>             comp_param[i].done = true;
>>>>             comp_param[i].quit = false;
>>>>             qemu_mutex_init(&comp_param[i].mutex);
>>>>             qemu_cond_init(&comp_param[i].cond);
>>>>             qemu_thread_create(compress_threads + i, "compress",
>>>>                                do_data_compress, comp_param + i,
>>>>                                QEMU_THREAD_JOINABLE);
>>>>         }
>>>>         return 0;
>>>>
>>>>     exit:
>>>>         compress_threads_save_cleanup();
>>>>         return -1;
>>>>     }
>>>>
>>>> At label exit, we have @i threads, all fully initialized.  That's an
>>>> invariant.
>>>>
>>>> compress_threads_save_cleanup() finds the threads to clean up by
>>>> checking comp_param[i].file:
>>>>
>>>>     static void compress_threads_save_cleanup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression() || !comp_param) {
>>>>             return;
>>>>         }
>>>>
>>>>         thread_count = migrate_compress_threads();
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             /*
>>>>              * we use it as a indicator which shows if the thread is
>>>>              * properly init'd or not
>>>>              */
>>>> --->        if (!comp_param[i].file) {
>>>> --->            break;
>>>> --->        }
>>>>
>>>>             qemu_mutex_lock(&comp_param[i].mutex);
>>>>             comp_param[i].quit = true;
>>>>             qemu_cond_signal(&comp_param[i].cond);
>>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>>
>>>>             qemu_thread_join(compress_threads + i);
>>>>             qemu_mutex_destroy(&comp_param[i].mutex);
>>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>>             deflateEnd(&comp_param[i].stream);
>>>>             g_free(comp_param[i].originbuf);
>>>>             qemu_fclose(comp_param[i].file);
>>>>             comp_param[i].file = NULL;
>>>>         }
>>>>         qemu_mutex_destroy(&comp_done_lock);
>>>>         qemu_cond_destroy(&comp_done_cond);
>>>>         g_free(compress_threads);
>>>>         g_free(comp_param);
>>>>         compress_threads = NULL;
>>>>         comp_param = NULL;
>>>>     }
>>>>
>>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>>> *any* cleanup.
>>>>
>>>> To maintain the invariant, compress_threads_save_setup() carefully
>>>> cleans up any partial initializations itself before a goto exit.  Since
>>>> the code is arranged smartly, the only such cleanup is the
>>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>>
>>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>>> initializations.  Breaks the invariant.
>>>>
>>>> I see two sane solutions:
>>>>
>>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>>    initializations itself.  compress_threads_save_cleanup() copes only
>>>>    with fully initialized comp_param[i].  This is how things work before
>>>>    your series.
>>>>
>>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>>    comp_param[i], i.e. does the right thing for each goto exit in
>>>>    compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>>>    clean up partial initializations.
>>>>
>>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>>
>>>> 3. A confusing mix of the two.
>>>>
>>>> Don't.
>>> Thanks for the detailed analysis! :)
>>> Emm.. Actually, I had thought of doing the cleanup in the setup()
>>> function for the third ‘goto exit’ [1], which is a partial
>>> initialization. But because [1] below is long and does not look neat
>>> (I notice that most cleanups for each thread are in the xxx_cleanup()
>>> function), I turned to modifying the join() function instead.
>>> Is the long [1] acceptable when the third ‘goto exit’ is taken, or is
>>> there a better way to do the cleanup?
>>>
>>> [1]
>>>             qemu_mutex_lock(&comp_param[i].mutex);
>>>             comp_param[i].quit = true;
>>>             qemu_cond_signal(&comp_param[i].cond);
>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>
>>>             qemu_mutex_destroy(&comp_param[i].mutex);
>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>             deflateEnd(&comp_param[i].stream);
>>>             g_free(comp_param[i].originbuf);
>>>             qemu_fclose(comp_param[i].file);
>>>             comp_param[i].file = NULL;
>> Have you considered creating the thread earlier, e.g. right after
>> initializing compression with deflateInit()?
> I am afraid we cannot do this, as the members of comp_param[i], like
> file/done/quit/mutex/cond, will be used later in the newly created
> thread: do_data_[de]compress via qemu_thread_create().

You're right.

> Thus it seems we have to accept the long [1] above if we do want to
> clean up partial initialization in xxx_setup(). :(
>
> BTW, there is no other argument except "(compress_threads+i)->thread"
> that can be used to tell whether we should join() the thread, just in
> case we want to change the xxx_cleanup() function.

We can try to make compress_threads_save_cleanup() cope with partially
initialized comp_param[i].  Let's have a look at its members:

    bool done;                          // no cleanup
    bool quit;                          // see [2]
    bool zero_page;                     // no cleanup
    QEMUFile *file;                     // qemu_fclose() if non-null
    QemuMutex mutex;                    // see [1]
    QemuCond cond;                      // see [1]
    RAMBlock *block;                    // no cleanup (must be null)
    ram_addr_t offset;                  // no cleanup

    /* internally used fields */
    z_stream stream;                    // see [3]
    uint8_t *originbuf;                 // unconditional g_free()

[1]: we could do something like

    if (comp_param[i].mutex.initialized) {
        qemu_mutex_destroy(&comp_param[i].mutex);
    }
    if (comp_param[i].cond.initialized) {
        qemu_cond_destroy(&comp_param[i].cond);
    }

but that would be unclean.  Instead, I'd initialize these guys first, so
we can clean them up unconditionally.

[2]: This is used to make the thread terminate.  Must be done before we
call qemu_thread_join().  I think it can safely be done always, as long
as .mutex and .cond are initialized.  Trivial if we initialize them
first.
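
A minimal sketch of [1] and [2], not QEMU code: plain pthread primitives
stand in for QemuMutex and QemuCond, and Param, params_setup(),
params_cleanup() and the fail_at test hook are hypothetical names made up
for illustration.  It only shows the ordering: initialize .mutex and .cond
of every slot before anything can fail, so that cleanup can set .quit and
destroy both unconditionally.  Joining the worker thread is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for CompressParam: only the members discussed. */
typedef struct {
    bool quit;
    pthread_mutex_t mutex;  /* stands in for QemuMutex */
    pthread_cond_t cond;    /* stands in for QemuCond */
    void *originbuf;        /* stays NULL if setup fails before this slot */
} Param;

/*
 * Expects zero-filled params (as g_new0() would provide).
 * fail_at is a test hook: simulate a failure at that slot, -1 for none.
 */
static int params_setup(Param *p, int n, int fail_at)
{
    int i;

    /* First pass: mutex and cond of *every* slot, before anything fails. */
    for (i = 0; i < n; i++) {
        if (pthread_mutex_init(&p[i].mutex, NULL) != 0 ||
            pthread_cond_init(&p[i].cond, NULL) != 0) {
            abort();        /* keep the sketch simple: treat as fatal */
        }
    }

    /* Second pass: the steps that may fail part-way through. */
    for (i = 0; i < n; i++) {
        if (i == fail_at) {
            return -1;      /* nothing to undo here for mutex/cond/quit */
        }
        p[i].originbuf = malloc(64);
        if (!p[i].originbuf) {
            return -1;
        }
    }
    return 0;
}

/* Copes with partially set up slots without any "initialized?" checks. */
static void params_cleanup(Param *p, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        /* [2]: always safe, because mutex and cond always exist */
        pthread_mutex_lock(&p[i].mutex);
        p[i].quit = true;
        pthread_cond_signal(&p[i].cond);
        pthread_mutex_unlock(&p[i].mutex);

        /* a real version would join the worker here, if one was created */

        /* [1]: unconditional destroy */
        pthread_mutex_destroy(&p[i].mutex);
        pthread_cond_destroy(&p[i].cond);
        free(p[i].originbuf);   /* free(NULL) is a no-op */
        p[i].originbuf = NULL;
    }
}
```

The price is a second loop in setup; in exchange the cleanup loop no longer
has to stop at the first slot that was not fully initialized.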

[3]: I can't see a squeaky clean way to detect whether .stream has been
initialized with deflateInit().  Here's a slightly unclean way:
deflateInit() sets .stream.msg to null on success, and to non-null on
failure.  We can make it non-null until we're ready to call
deflateInit(), then have compress_threads_save_cleanup() clean up
.stream when it's null.  If that's too unclean for your or your
reviewers' taste, add a boolean @stream_initialized flag.
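
A minimal sketch of the boolean flag variant of [3].  It does not call
zlib: Stream, stream_init() and stream_end() are hypothetical stand-ins for
z_stream, deflateInit() and deflateEnd(), and Param, setup(), cleanup(),
live_streams and the fail_at test hook are likewise made up for
illustration.  The point is only that setup records success in
@stream_initialized and does no cleanup of its own, while cleanup consults
the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for z_stream; NOT the zlib type. */
typedef struct {
    int dummy;
} Stream;

static int live_streams;    /* counts streams that still need ending */

/* Stand-in for deflateInit(); fails on request. */
static int stream_init(Stream *s, bool fail)
{
    if (fail) {
        return -1;
    }
    s->dummy = 1;
    live_streams++;
    return 0;
}

/* Stand-in for deflateEnd(). */
static void stream_end(Stream *s)
{
    s->dummy = 0;
    live_streams--;
}

/* Hypothetical stand-in for CompressParam. */
typedef struct {
    Stream stream;
    bool stream_initialized;    /* the flag suggested in [3] */
    unsigned char *originbuf;
} Param;

/*
 * Expects zero-filled params.  On failure, returns without undoing
 * anything; cleanup() copes with the partially initialized slot.
 * fail_at is a test hook: stream_init() fails for that slot.
 */
static int setup(Param *p, int n, int fail_at)
{
    int i;

    for (i = 0; i < n; i++) {
        p[i].originbuf = malloc(4096);
        if (!p[i].originbuf) {
            return -1;
        }
        if (stream_init(&p[i].stream, i == fail_at) != 0) {
            return -1;          /* originbuf is left for cleanup() */
        }
        p[i].stream_initialized = true;
    }
    return 0;
}

static void cleanup(Param *p, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (p[i].stream_initialized) {
            stream_end(&p[i].stream);
            p[i].stream_initialized = false;
        }
        free(p[i].originbuf);   /* unconditional; free(NULL) is a no-op */
        p[i].originbuf = NULL;
    }
}
```

With the flag in place, the g_free(comp_param[i].originbuf) before the
second goto exit would no longer be needed, since cleanup frees it
unconditionally.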

