From: Fei Li <fli@suse.com>
To: Peter Xu <peterx@redhat.com>
Cc: QEMU Developers <qemu-devel@nongnu.org>,
	Markus Armbruster <armbru@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH for-4.0 v8 6/7] qemu_thread_create: propagate the error to callers to handle
Date: Tue, 25 Dec 2018 20:18:26 +0800
Message-ID: <b5a341cb-257a-798f-4672-12e52a98fc8e@suse.com>
In-Reply-To: <038cd81d-d12a-9f86-fdb5-3bf10e0093ee@suse.com>

Hi all,

As I am leaving my current company and most reviewers are on holiday,
I'd like to send a new version now:
v9: "qemu_thread: Make qemu_thread_create() handle errors properly",
although some details are still open, such as whether it is appropriate
to report the error so that the management layer can see it. I will use
my new personal email address (shirley17fei@gmail.com) to follow up on
the new version. :)

Merry Christmas, and have a nice day, thanks all!
Fei

On 12/24/2018 02:53 PM, Fei Li wrote:
>
>
> On 12/24/2018 11:34 AM, Peter Xu wrote:
>> On Fri, Dec 21, 2018 at 05:36:57PM +0800, Fei Li wrote:
>>> On 12/19/2018 08:14 PM, Fei Li wrote:
>>>> On 12/19/2018 06:10 PM, Markus Armbruster wrote:
>>>>> Fei Li <fli@suse.com> writes:
>>>>>
>>>>>> On 12/13/2018 03:26 PM, Markus Armbruster wrote:
>>>>>>> There's a question for David Gibson inline.  Please search for /ppc/.
>>>>>>>
>>>>>>> Fei Li <fli@suse.com> writes:
>>>>>>>
>>>>>>>> Make qemu_thread_create() return a Boolean to indicate if it succeeds
>>>>>>>> rather than failing with an error. And add an Error parameter to hold
>>>>>>>> the error message and let the callers handle it.
>>>>>>> The "rather than failing with an error" is misleading. Before the
>>>>>>> patch, we report to stderr and abort().  What about:
>>>>>>>
>>>>>>>        qemu-thread: Make qemu_thread_create() handle errors properly
>>>>>>>
>>>>>>>        qemu_thread_create() abort()s on error.  Not nice. Give it a
>>>>>>>        return value and an Error ** argument, so it can return success /
>>>>>>>        failure.
>>>>>> A nice commit-amend! Thanks!
>>>>>>> Still missing from the commit message then: how you update
>>>>>>> the callers.
>>>>>> Yes, agree. I think the how should also be noted here, like
>>>>>> - propagating the err to callers whose call trace already has the
>>>>>> Error parameter;
>>>>>> - just adding an &error_abort for qemu_thread_create() and marking it a
>>>>>> "TODO: xxx";
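The two caller-update strategies listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the QEMU code: the Error struct, set_error(), demo_thread_create(), demo_setup(), demo_setup_todo() and the demo_error_abort sentinel are all hypothetical stand-ins, and the failure is simulated with a flag rather than produced by a real thread library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's real Error type. */
typedef struct Error {
    char msg[128];
} Error;

/* Sentinel playing the role of &error_abort: storing an error here is fatal. */
static Error *demo_error_abort;

/* Record a failure in *errp, or abort if the caller passed the sentinel. */
static void set_error(Error **errp, const char *name)
{
    if (errp == &demo_error_abort) {
        fprintf(stderr, "unexpected error creating thread '%s'\n", name);
        abort();
    }
    if (errp) {
        Error *err = malloc(sizeof(*err));
        if (!err) {
            abort();
        }
        snprintf(err->msg, sizeof(err->msg),
                 "failed to create thread '%s'", name);
        *errp = err;
    }
}

/* Simulated thread creation: returns true on success, false plus *errp on failure. */
static bool demo_thread_create(const char *name, bool simulate_failure,
                               Error **errp)
{
    if (simulate_failure) {
        set_error(errp, name);
        return false;
    }
    return true;
}

/* Strategy 1: the caller already has an Error ** parameter, so forward it. */
static int demo_setup(bool simulate_failure, Error **errp)
{
    if (!demo_thread_create("compress", simulate_failure, errp)) {
        return -1;
    }
    return 0;
}

/* Strategy 2: no Error ** available yet, so keep the old abort-on-failure behavior. */
static void demo_setup_todo(void)
{
    /* TODO: propagate the error once this caller gains an Error ** parameter. */
    demo_thread_create("worker", false, &demo_error_abort);
}
```

With the first strategy the caller decides what to do with the message (report it, or pass it further up); the second keeps the pre-patch behavior until the call chain is converted.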
>>>>>>> Let's see below.
>>>>>>>
>>>>>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>>>>>> Cc: Daniel P. Berrangé <berrange@redhat.com>
>>>>>>>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>>> ---
>>>>>>>>     cpus.c                      | 45 ++++++++++++++++++++++++-------------
>>>>>>>>     dump.c                      |  6 +++--
>>>>>>>>     hw/misc/edu.c               |  6 +++--
>>>>>>>>     hw/ppc/spapr_hcall.c        | 10 +++++++--
>>>>>>>>     hw/rdma/rdma_backend.c      |  4 +++-
>>>>>>>>     hw/usb/ccid-card-emulated.c | 16 ++++++++++----
>>>>>>>>     include/qemu/thread.h       |  4 ++--
>>>>>>>>     io/task.c                   |  3 ++-
>>>>>>>>     iothread.c                  | 16 +++++++++-----
>>>>>>>>     migration/migration.c       | 54 +++++++++++++++++++++++++++++----------------
>>>>>>>>     migration/postcopy-ram.c    | 14 ++++++++++--
>>>>>>>>     migration/ram.c             | 40 ++++++++++++++++++++++++---------
>>>>>>>>     migration/savevm.c          | 11 ++++++---
>>>>>>>>     tests/atomic_add-bench.c    |  3 ++-
>>>>>>>>     tests/iothread.c            |  2 +-
>>>>>>>>     tests/qht-bench.c           |  3 ++-
>>>>>>>>     tests/rcutorture.c          |  3 ++-
>>>>>>>>     tests/test-aio.c            |  2 +-
>>>>>>>>     tests/test-rcu-list.c       |  3 ++-
>>>>>>>>     ui/vnc-jobs.c               | 17 +++++++++-----
>>>>>>>>     ui/vnc-jobs.h               |  2 +-
>>>>>>>>     ui/vnc.c                    |  4 +++-
>>>>>>>>     util/compatfd.c             | 12 ++++++++--
>>>>>>>>     util/oslib-posix.c          | 17 ++++++++++----
>>>>>>>>     util/qemu-thread-posix.c    | 24 +++++++++++++-------
>>>>>>>>     util/qemu-thread-win32.c    | 16 ++++++++++----
>>>>>>>>     util/rcu.c                  |  3 ++-
>>>>>>>>     util/thread-pool.c          |  4 +++-
>>>>>>>>     28 files changed, 243 insertions(+), 101 deletions(-)
>>>>>>>>
>>> ...snip, and only leave the three uncertain small topics...
>>>>>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>>>>>> index 658dfa88a3..6e0cccf066 100644
>>>>>>>> --- a/migration/ram.c
>>>>>>>> +++ b/migration/ram.c
>>>>>>>> @@ -473,6 +473,7 @@ static void compress_threads_save_cleanup(void)
>>>>>>>>     static int compress_threads_save_setup(void)
>>>>>>>>     {
>>>>>>>>         int i, thread_count;
>>>>>>>> +    Error *local_err = NULL;
>>>>>>>>
>>>>>>>>         if (!migrate_use_compression()) {
>>>>>>>>             return 0;
>>>>>>>> @@ -502,9 +503,12 @@ static int compress_threads_save_setup(void)
>>>>>>>>             comp_param[i].quit = false;
>>>>>>>>             qemu_mutex_init(&comp_param[i].mutex);
>>>>>>>>             qemu_cond_init(&comp_param[i].cond);
>>>>>>>> -        qemu_thread_create(compress_threads + i, "compress",
>>>>>>>> -                           do_data_compress, comp_param + i,
>>>>>>>> -                           QEMU_THREAD_JOINABLE);
>>>>>>>> +        if (!qemu_thread_create(compress_threads + i, "compress",
>>>>>>>> +                                do_data_compress, comp_param + i,
>>>>>>>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>>>>>>>> +            error_reportf_err(local_err, "failed to create do_data_compress: ");
>>>>>>>> +            goto exit;
>> [1]
>>
>>>>>>>> +        }
>>>>>>>>         }
>>>>>>>>         return 0;
>>>>>>> Reviewing the migration changes is getting tiresome...
>>>>>> Yes, indeed, the migration involves a lot! Thanks so much for helping
>>>>>> to review!
>>>>>>>     Is reporting the
>>>>>>> error appropriate here, and why?
>>>>>> I think the qemu monitor should display the obvious and exact failing
>>>>>> reason for administrators, especially considering that
>>>>>> qemu_thread_create() itself does not print any message, so we have no
>>>>>> idea which direct function fails if gdb is not enabled.
>>>>>> IOW, I think David's answer to that ppc's error_reportf_err() also
>>>>>> applies here:
>>>>>>
>>>>>> "The error returns are for the guest, the reported errors are for the
>>>>>> guest administrator or management layers."
>>>>> There could well be an issue with the "management layers" part.  Should
>>>>> this error be sent to the management layer via QMP somehow? Migration
>>>>> maintainers should be able to assist with this question.
>>> Kindly ping migration maintainers. :)
>> I think both the maintainers are on holiday so possibly there won't be
>> any reply from them this week... :)
>>
>> Regarding error reports of migration via the QMP layer, please have a
>> look at d59ce6f344 ("migration: add reporting of errors for outgoing
>> migration", 2016-05-26).  Though I see that even
>> qemu_savevm_state_setup() is not capturing the error for the management
>> layer, so if you want to pass this thread creation error upward you'll
>> possibly need to work on that as well.
> Thanks for the useful commit. :) I guess "the client app" mentioned there
> is not qemu but something in an upper layer, maybe something inside
> openstack? I have to say that I can see the error message (I mean the
> above error_reportf_err(...)) printed to the screen when I use the qemu
> command line via hmp to do the migration.
>
> For qemu_savevm_state_setup(), I see it sets f->last_error (instead of
> s->error) to indicate whether to stop the migration once control is back
> in migration_thread(), in migration_detect_error(s). And no matter whether
> qemu_savevm_state_setup() succeeds, the current code continues to set the
> migration state to ACTIVE. Emm, I am wondering whether this is on purpose..
>> Though here note that when you "goto exit" at [1] you probably also
>> need to touch up the cleanup part since otherwise the join() could be
>> with an invalid thread ID, so you'll possibly need to check the thread
>> ID validity before doing the join() of the compression thread.
> Thanks for pointing this out. I think my last patch fixes this problem,
> that is, it adds a check in qemu_thread_join():
> +    if (!thread->thread) {
> +        return NULL;
> +    }
> Correct me if this is not the proper solution. :)
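The concern about joining a thread that was never created can be illustrated with a small simulation. This is only a sketch under stated assumptions: DemoWorker, demo_join(), demo_cleanup() and demo_setup_workers() are hypothetical names, no real threads are started, and the started flag stands in for "the thread ID is valid" (the role played by the !thread->thread check quoted above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WORKERS 4

/* Hypothetical worker slot; 'started' stands in for a valid thread ID. */
typedef struct DemoWorker {
    bool started;
    bool joined;
} DemoWorker;

/* Join one worker, skipping slots whose thread was never created. */
static bool demo_join(DemoWorker *w)
{
    if (!w->started) {
        return false;           /* nothing to join */
    }
    w->started = false;
    w->joined = true;
    return true;
}

/* Visit every slot and return how many workers were actually joined. */
static int demo_cleanup(DemoWorker *workers, size_t n)
{
    int joined = 0;

    for (size_t i = 0; i < n; i++) {
        if (demo_join(&workers[i])) {
            joined++;
        }
    }
    return joined;
}

/*
 * Start workers in order. Creation is made to fail at index 'fail_at'
 * (pass n for "no failure"). On failure, clean up the workers that did
 * start and return -1; *joined_out reports how many were joined.
 */
static int demo_setup_workers(DemoWorker *workers, size_t n, size_t fail_at,
                              int *joined_out)
{
    for (size_t i = 0; i < n; i++) {
        workers[i].started = false;
        workers[i].joined = false;
    }
    for (size_t i = 0; i < n; i++) {
        if (i == fail_at) {
            goto exit;          /* simulated creation failure */
        }
        workers[i].started = true;
    }
    *joined_out = 0;
    return 0;

exit:
    *joined_out = demo_cleanup(workers, n);
    return -1;
}
```

Whether the guard belongs in the join helper itself or in each cleanup routine is the design question raised in the thread; the sketch puts it in the join helper, as the quoted check does.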
>
> Have a nice day, thanks :)
> Fei
>>
>> Regards,
>>
>
>
>
>
