From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48430)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fxVrf-0002ox-1m
	for qemu-devel@nongnu.org; Wed, 05 Sep 2018 07:20:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1fxVra-0002MI-Vz
	for qemu-devel@nongnu.org; Wed, 05 Sep 2018 07:20:51 -0400
Received: from mx2.suse.de ([195.135.220.15]:53990 helo=mx1.suse.de)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
	(envelope-from ) id 1fxVra-0002GV-Lg
	for qemu-devel@nongnu.org; Wed, 05 Sep 2018 07:20:46 -0400
References: <20180904110822.12863-1-fli@suse.com>
	<20180904110822.12863-2-fli@suse.com>
	<20180904112620.GG22349@redhat.com>
	<0831de15-95cb-0774-10f9-8b03f4141c10@suse.com>
	<20180905083641.GD3026@redhat.com>
From: Fei Li
Message-ID:
Date: Wed, 5 Sep 2018 19:20:39 +0800
MIME-Version: 1.0
In-Reply-To: <20180905083641.GD3026@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH 1/5] Fix segmentation fault when qemu_signal_init fails
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Daniel P. Berrangé"
Cc: qemu-devel@nongnu.org

On 09/05/2018 04:36 PM, Daniel P. Berrangé wrote:
> On Wed, Sep 05, 2018 at 12:17:24PM +0800, Fei Li wrote:
>> Thanks for the review! :)
>>
>>
>> On 09/04/2018 07:26 PM, Daniel P. Berrangé wrote:
>>> On Tue, Sep 04, 2018 at 07:08:18PM +0800, Fei Li wrote:
>>> ... snip ...
>>>>           free(info);
>>>>           return -1;
>>>>       }
>>>> @@ -94,17 +97,21 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>>>>       return fds[0];
>>>>   }
>>>> -int qemu_signalfd(const sigset_t *mask)
>>>> +int qemu_signalfd(const sigset_t *mask, Error **errp)
>>>>   {
>>>> -#if defined(CONFIG_SIGNALFD)
>>>>       int ret;
>>>> +    Error *local_err = NULL;
>>>> +#if defined(CONFIG_SIGNALFD)
>>>>       ret = syscall(SYS_signalfd, -1, mask, _NSIG / 8);
>>>>       if (ret != -1) {
>>>>           qemu_set_cloexec(ret);
>>>>           return ret;
>>>>       }
>>>>   #endif
>>>> -
>>>> -    return qemu_signalfd_compat(mask);
>>>> +    ret = qemu_signalfd_compat(mask, &local_err);
>>>> +    if (local_err) {
>>>> +        error_propagate(errp, local_err);
>>>> +    }
>>> Using a local_err is not required - you can just pass errp straight
>>> to qemu_signalfd_compat() and then check
>>>
>>>     if (ret < 0)
>> I'd like to explain the use of a local error object and the
>> error_propagate() call here. :)
>> In our code, the initial caller passes one of two kinds of Error down
>> the call chain: either something like &error_abort or &error_fatal,
>> or NULL.
>>
>> For the former, the exit() occurs in the function where
>> error_handle_fatal() is called (e.g. via error_propagate(),
>> error_setg(), ...). Patch 3 (qemu_init_vcpu) is such a case: the system
>> will exit in the final callee, qemu_thread_create(), instead of in the
>> initial caller, pc_new_cpu(). In such a case, I think propagating seems
>> more reasonable.
> I don't really agree. It is preferable to abort immediately at the deepest
> place which raises the error. The stack trace will thus show the full call
> chain leading up to the problem.
Sorry, the above example is not exactly correct: in the patch 3 case, if we
pass errp directly, the system will exit in device_set_realized(), where the
first error_propagate() is called, not in the final callee. Sorry for the
misleading example.
Here is another example, with its call trace:

qemu_thread_create(, NULL)
 <= iothread_complete(, NULL)
 <== user_creatable_complete(, NULL)
 <=== object_new_with_propv(, errp)
 <==== object_new_with_props(, errp) {... error_propagate(errp, local_err); ...}
 <===== iothread_create(, &error_abort)

The exit occurs in object_new_with_props(), where the first
error_propagate() is called.

Both device_set_realized() and object_new_with_props() are intermediate
callers, so we can only see the top half of the stack trace, up to the point
where error_handle_fatal() is called.
In other words, the exit() occurs neither in the final callee nor in the
initial caller. Sorry again for the misleading example.
>
>> What do you think about passing errp directly for the latter case, and
>> using a local error object & error_propagate() for the former case?
>> This treats the two cases differently, but would shorten the code.
> It is inappropriate to second-guess whether the caller is passing in
> NULL or &error_abort, or another Error object. What is passed in can
> change at any time in the future.
ok.
>
> We should only ever use a local error where the local method has a need
> to look at the error contents before returning to the caller. Any other
> case should just use the errp directly.
>
> Regards,
> Daniel
Have a nice day, thanks
Fei
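
For reference, here is a minimal sketch of the pattern Daniel describes: pass errp straight through to the callee and decide on the return value, with no local error object. Everything in it is a simplified, hypothetical stand-in (the Error struct, demo_error_set(), demo_signalfd_compat() and demo_signalfd() are made up for illustration); it is not QEMU's real qapi/error.h API and it is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error object (assumption, not the real type). */
typedef struct Error {
    const char *msg;
} Error;

/*
 * Hypothetical helper loosely modelled on error_setg(): store an error
 * for the caller, unless the caller passed NULL to say it does not care.
 */
void demo_error_set(Error **errp, const char *msg)
{
    if (errp == NULL) {
        return;                     /* caller ignores error details */
    }
    assert(*errp == NULL);          /* an earlier error must not be overwritten */
    *errp = malloc(sizeof(**errp));
    if (*errp == NULL) {
        abort();
    }
    (*errp)->msg = msg;
}

/* Stand-in for the compat path: fails when 'fail' is non-zero. */
int demo_signalfd_compat(int fail, Error **errp)
{
    if (fail) {
        demo_error_set(errp, "failed to set up the compat signalfd");
        return -1;
    }
    return 3;                       /* pretend file descriptor */
}

/*
 * errp goes straight to the callee. Failure is detected from the return
 * value, so this works the same whether errp is NULL or points at an Error.
 */
int demo_signalfd(int fail, Error **errp)
{
    int ret = demo_signalfd_compat(fail, errp);

    if (ret < 0) {
        return -1;                  /* error, if requested, is already in *errp */
    }
    return ret;
}
```

Because the wrapper never looks at the error contents, it has no need for a local error object; a local error would only be needed if the function had to inspect the error before returning.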