From: Markus Armbruster
Date: Wed, 03 Feb 2016 19:06:10 +0100
Subject: Re: [Qemu-devel] [PATCH v6 1/5] util: Introduce error reporting functions with fatal/abort
Message-ID: <8737t9is8d.fsf@blackfin.pond.sub.org>
In-Reply-To: <87io25bzhc.fsf@fimbulvetr.bsc.es> ("Lluís Vilanova"'s message of "Wed, 03 Feb 2016 16:11:27 +0100")
To: Thomas Huth
Cc: Stefan Hajnoczi, qemu-devel@nongnu.org, "Dr.
 David Alan Gilbert", David Gibson

Lluís Vilanova writes:

> Markus Armbruster writes:
>
>> Lluís Vilanova writes:
>>> Markus Armbruster writes:
>>>
>>>> Thomas Huth writes:
>>>>> On 03.02.2016 10:48, Markus Armbruster wrote:
>>>>>> David Gibson writes:
>>>>>>
>>>>>>> On Tue, Feb 02, 2016 at 10:47:35PM +0100, Thomas Huth wrote:
>>>>>>>> On 02.02.2016 19:53, Markus Armbruster wrote:
>>>>>>>>> Lluís Vilanova writes:
>>>>>>>> ...
>>>>>>>>
>>>>>>>>>> diff --git a/include/qemu/error-report.h b/include/qemu/error-report.h
>>>>>>>>>> index 7ab2355..6c2f142 100644
>>>>>>>>>> --- a/include/qemu/error-report.h
>>>>>>>>>> +++ b/include/qemu/error-report.h
>>>>>>>>>> @@ -43,4 +43,23 @@ void error_report(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>>>>  const char *error_get_progname(void);
>>>>>>>>>>  extern bool enable_timestamp_msg;
>>>>>>>>>>
>>>>>>>>>> +/* Report message and exit with error */
>>>>>>>>>> +void QEMU_NORETURN error_vreport_fatal(const char *fmt, va_list ap) GCC_FMT_ATTR(1, 0);
>>>>>>>>>> +void QEMU_NORETURN error_report_fatal(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
>>>>>>>>>
>>>>>>>>> This lets people write things like
>>>>>>>>>
>>>>>>>>>     error_report_fatal("The sky is falling");
>>>>>>>>>
>>>>>>>>> instead of
>>>>>>>>>
>>>>>>>>>     error_report("The sky is falling");
>>>>>>>>>     exit(1);
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>>     fprintf(stderr, "The sky is falling\n");
>>>>>>>>>     exit(1);
>>>>>>>>>
>>>>>>>>> I don't think that's an improvement in clarity.
>>>>>>>>
>>>>>>>> The problem is not the existing code, but that in a couple of new
>>>>>>>> patches, I've now already seen that people are trying to use
>>>>>>>>
>>>>>>>>     error_setg(&error_fatal, ... );
>>>>>>>
>>>>>>> So, I don't actually see any real advantage to error_report_fatal(...)
>>>>>>> over error_setg(&error_fatal, ...).
>>>>>>
>>>>>> I do. Compare:
>>>>>>
>>>>>>     (a) error_report(...);
>>>>>>         exit(1);
>>>>>>
>>>>>>     (b) error_report_fatal(...);
>>>>>>
>>>>>>     (c) error_setg(&error_fatal, ...);
>>>>>>
>>>>>> In my opinion, (a) is clearest: even a relatively clueless reader will
>>>>>> know what exit(1) does, can guess what error_report() approximately
>>>>>> does, and doesn't need to know what it does exactly. (b) is slightly
>>>>>> less obvious, and (c) is positively opaque.
>>>>>>
>>>>>> Let's stick to the obvious (a) and be done with it.
>>>>>
>>>>> Ok, (a) is fine for me too, as long as we avoid (c). Lluís, could you
>>>>> maybe add that information to your patch that updates the HACKING text?
>>>
>>>> I feel such detailed advice belongs in error.h. Sketch appended.
>>>
>>>> If that doesn't succeed in keeping (c) out, make checkpatch flag it.
>>>
>>>>> (and sorry for the fuss with error_report_fatal() ... I thought it would
>>>>> be a good solution to avoid (c), but if (a) is preferred instead, then
>>>>> we should go with that solution instead).
>>>
>>> I can easily change that, no problem. I'm just happy consensus is landing on
>>> this subject.
>>>
>>>
>>>>> And, by the way, what about the spots that currently already use
>>>>> error_setg(&error_abort, ....) ? Should they be turned into
>>>>> error_report() + abort() instead? Or only abort(), without error
>>>>> message, since abort() is only about programming errors?
>>>
>>>> As I wrote in my first reply to this thread, I'd like them to be cleaned
>>>> up to just abort() or assert().
>>>
>>>> I like assert(), because it gives me exactly what I can use to debug the
>>>> programming error: a core dump (if enabled) and a source location
>>>> (useful when no core dump). I never bought the argument that we should
>>>> use abort() instead of assert(0) because "what if NDEBUG?!?". If you
>>>> define NDEBUG, our 600+ abort()s won't save you from our 4000+
>>>> assert()s.
>>>
>>> Sorry, but I don't buy the argument of, "I prefer assert() because there's
>>> already lots of them". To me, there's a semantic difference between debug builds
>>> and regular ones (aka, assert vs abort).
>
>> That's not what I said :)
>
>> In the past, people have argued in favor of abort() by pointing to
>> NDEBUG. I don't buy that argument, but me not buying it is not why I
>> prefer assert(). I do because it prints additional information that's
>> occasionally useful.
>
>>> Also, I think it adds to the confusion
>>> that assert and abort seem to be used interchangeably in the code.
>
>> For better or worse, we overwhelmingly use abort() instead of assert(0),
>> but don't use if (!good) abort() instead of assert(good). Doesn't make
>> sense to me, but my appetite for tree-wide changes and the debates that
>> go with them has limits.
>
>>> What about this definition?
>>>
>>> * exit(): user-triggered errors
>>> * abort(): general programming errors
>>> * assert(): additional sanity/consistency checks against programming errors
>>>
>>> Now, abort & assert have an overlap. Should we discourage one in favour of the
>>> other?
>
>> I can't see how to decide whether a programming error is "general" or
>> "additional", or why an "additional" one deserves a message
>> pointing to source code, but a "general" one does not.
>
>>> Also:
>>>
>>> * error_report_fatal ensures the same exit code is always used (otherwise it can
>>>   fail with inconsistent error codes)
>
>> What if you *want* to use a different exit code?
>
>> But I grant you that we should almost always use exit(1) for fatal
>> errors. And in fact we do! There are a bunch of misguided exit(-1) in
>> the code, but git-log -S'exit(-1)' finds only half a dozen offending
>> commits since 2013, and none since 2015, so preventing more seems to be
>> a mostly solved problem.
>
>>> * error_report_abort brings the code information of assert into abort
>
>> If you want your crashes to print source location information, don't
>> reinvent the wheel, just use assert().
>
>> &error_abort can't because the interesting spot isn't where we decide to
>> abort, but where the error got created.
>
> Fair enough. I don't want a flame on style either, although I might look like
> wanting one :)

I think we're having a civil, constructive discussion on error handling
and reporting that happens to include stylistic aspects :)

>>> But of course, I'm happy either way :)
>>>
>>>
>>>> diff --git a/include/qapi/error.h b/include/qapi/error.h
>>>> index 45d6c72..ea7e74f 100644
>>>> --- a/include/qapi/error.h
>>>> +++ b/include/qapi/error.h
>>>> @@ -162,6 +162,9 @@ ErrorClass error_get_class(const Error *err);
>>>>   * human-readable error message is made from printf-style @fmt, ...
>>>>   * The resulting message should be a single phrase, with no newline or
>>>>   * trailing punctuation.
>>>> + * Please don't error_setg(&error_fatal, ...), use error_report() and
>>>> + * exit(), because that's more obvious.
>>>> + * Likewise, don't error_setg(&error_abort, ...), use assert().
>>>>   */
>>>>  #define error_setg(errp, fmt, ...) \
>>>>      error_setg_internal((errp), __FILE__, __LINE__, __func__, \
>>>> @@ -213,6 +216,8 @@ void error_setg_win32_internal(Error **errp,
>>>>   * the error object.
>>>>   * Else, move the error object from @local_err to *@dst_errp.
>>>>   * On return, @local_err is invalid.
>>>> + * Please don't error_propagate(&error_fatal, ...), use
>>>> + * error_report_err() and exit(), because that's more obvious.
>>>>   */
>>>>  void error_propagate(Error **dst_errp, Error *local_err);
>>>
>>>> @@ -291,12 +296,14 @@ void error_set_internal(Error **errp,
>>>>      GCC_FMT_ATTR(6, 7);
>>>
>>>>  /*
>>>> - * Pass to error_setg() & friends to abort() on error.
>>>> + * Special error destination to abort on error.
>>>> + * See error_setg() and error_propagate() for details.
>>>>   */
>>>>  extern Error *error_abort;
>>>
>>>>  /*
>>>> - * Pass to error_setg() & friends to exit(1) on error.
>>>> + * Special error destination to exit(1) on error.
>>>> + * See error_setg() and error_propagate() for details.
>>>>   */
>>>>  extern Error *error_fatal;
>>>
>>> I see, this will make it clearer for people looking for functions without
>>> reading HACKING. I can add this and reference it from the document.
>
>> If you like, I can post it as a formal patch you can then include in
>> your series.
>
> That'd be great. Please cc me when you send it.

Done: [PATCH 0/2] error: Documentation updates
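
To make the three termination styles from this thread concrete, here is a
minimal, self-contained sketch. It is not QEMU code: report_error() is a
hypothetical stand-in for error_report(), and run_in_child() is a helper
invented for this illustration. It assumes a POSIX system and a build
without NDEBUG. Each style runs in a forked child so the parent can see
how the child terminated.

```c
#include <assert.h>
#include <signal.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for error_report(): message plus newline on stderr. */
void report_error(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}

/* Style (a): user-triggered error, report it and exit(1). */
void fail_user_error(void)
{
    report_error("The sky is falling");
    exit(1);
}

/* The misguided variant: the parent only sees the low 8 bits of -1. */
void fail_exit_minus_one(void)
{
    report_error("The sky is falling");
    exit(-1);
}

/* Programming error: assert() prints a source location, then aborts. */
void fail_programming_error(void)
{
    int good = 0;

    assert(good);
}

/*
 * Run fn in a child process.  Returns the child's exit status (0..255)
 * if it exited, the negated signal number if a signal killed it, and
 * -1000 if fork() or waitpid() failed.
 */
int run_in_child(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1000;
    }
    if (pid == 0) {
        fn();
        _exit(0);               /* only reached if fn returns */
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1000;
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        return -WTERMSIG(status);
    }
    return -1000;
}
```

Calling run_in_child(fail_user_error) yields 1, while
run_in_child(fail_exit_minus_one) yields 255 rather than -1, which is one
reason exit(-1) is called misguided above. The assert() variant is killed
by SIGABRT after printing the file, line and failed expression on stderr.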