All of lore.kernel.org
 help / color / mirror / Atom feed
* Questionable aspects of QEMU Error's design
@ 2020-04-01  9:02 Markus Armbruster
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
                   ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-01  9:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé

QEMU's Error was patterned after GLib's GError.  Differences include:

* &error_fatal, &error_abort for convenience

* Error can optionally store hints

* Pointlessly different names: error_prepend() vs. g_error_prefix() and
  so forth *shrug*

* Propagating errors

  Thanks to Vladimir, we'll soon have "auto propagation", which is less
  verbose and less error-prone.

* Accumulating errors

  error_propagate() permits it, g_propagate_error() does not[*].

  I believe this feature is used rarely.  Perhaps we'd be better off
  without it.  The problem is identifying its uses.  If I remember
  correctly, Vladimir struggled with that for his "auto propagation"
  work.

  Perhaps "auto propagation" will reduce the number of manual
  error_propagate() to the point where we can identify accumulations.
  Removing the feature would become feasible then.

* Distinguishing different errors

  Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
  of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
  discouraged.  When we need callers to distinguish errors, we return
  suitable error codes separately.

* Return value conventions

  Common: non-void functions return a distinct error value on failure
  when such a value can be defined.  Patterns:

  - Functions returning non-null pointers on success return null pointer
    on failure.

  - Functions returning non-negative integers on success return a
    negative error code on failure.

  Different: GLib discourages void functions, because these lead to
  awkward error checking code.  We have tons of them, and tons of
  awkward error checking code:

    Error *err = NULL;
    frobnicate(arg, &err);
    if (err) {
        ... recover ...
        error_propagate(errp, err);
    }

  instead of

    if (!frobnicate(arg, errp))
        ... recover ...
    }

  Can also lead to pointless creation of Error objects.

  I consider this a design mistake.  Can we still fix it?  We have more
  than 2000 void functions taking an Error ** parameter...

  Transforming code that receives and checks for errors with Coccinelle
  shouldn't be hard.  Transforming code that returns errors seems more
  difficult.  We need to transform explicit and implicit return to
  either return true or return false, depending on what we did to the
  @errp parameter on the way to the return.  Hmm.


[*] According to documentation; the code merely calls g_warning() then,
in typical GLib fashion.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01  9:02 Questionable aspects of QEMU Error's design Markus Armbruster
@ 2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
  2020-04-01 12:14   ` Vladimir Sementsov-Ogievskiy
                     ` (2 more replies)
  2020-04-01 12:44 ` Daniel P. Berrangé
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-01 12:10 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel; +Cc: Philippe Mathieu-Daudé

01.04.2020 12:02, Markus Armbruster wrote:
> QEMU's Error was patterned after GLib's GError.  Differences include:
> 
> * &error_fatal, &error_abort for convenience
> 
> * Error can optionally store hints
> 
> * Pointlessly different names: error_prepend() vs. g_error_prefix() and
>    so forth *shrug*
> 
> * Propagating errors
> 
>    Thanks to Vladimir, we'll soon have "auto propagation", which is less
>    verbose and less error-prone.
> 
> * Accumulating errors
> 
>    error_propagate() permits it, g_propagate_error() does not[*].
> 
>    I believe this feature is used rarely.  Perhaps we'd be better off
>    without it.  The problem is identifying its uses.  If I remember
>    correctly, Vladimir struggled with that for his "auto propagation"
>    work.
> 
>    Perhaps "auto propagation" will reduce the number of manual
>    error_propagate() to the point where we can identify accumulations.
>    Removing the feature would become feasible then.
> 
> * Distinguishing different errors
> 
>    Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
>    of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>    discouraged.  When we need callers to distinguish errors, we return
>    suitable error codes separately.
> 
> * Return value conventions
> 
>    Common: non-void functions return a distinct error value on failure
>    when such a value can be defined.  Patterns:
> 
>    - Functions returning non-null pointers on success return null pointer
>      on failure.
> 
>    - Functions returning non-negative integers on success return a
>      negative error code on failure.
> 
>    Different: GLib discourages void functions, because these lead to
>    awkward error checking code.  We have tons of them, and tons of
>    awkward error checking code:
> 
>      Error *err = NULL;
>      frobnicate(arg, &err);
>      if (err) {
>          ... recover ...
>          error_propagate(errp, err);
>      }
> 
>    instead of
> 
>      if (!frobnicate(arg, errp))
>          ... recover ...
>      }
> 
>    Can also lead to pointless creation of Error objects.
> 
>    I consider this a design mistake.  Can we still fix it?  We have more
>    than 2000 void functions taking an Error ** parameter...
> 
>    Transforming code that receives and checks for errors with Coccinelle
>    shouldn't be hard.  Transforming code that returns errors seems more
>    difficult.  We need to transform explicit and implicit return to
>    either return true or return false, depending on what we did to the
>    @errp parameter on the way to the return.  Hmm.
> 
> 
> [*] According to documentation; the code merely calls g_warning() then,
> in typical GLib fashion.
> 


Side question:

Can we somehow implement a possibility to reliably identify file and line number
where error is set by error message?

It's where debug of error-bugs always starts: try to imagine which parts of the error
message are "%s", and how to grep for it in the code, keeping in mind also,
that error massage may be split into several lines..

Put file:line into each error? Seems too noisy for users.. A lot of errors are not
bugs: use do something wrong and see the error, and understands what he is doing
wrong.. It's not usual practice to print file:line into each message for user.


But what if we do some kind of mapping file:line <-> error code, so user will see
something like:


    Error 12345: Device drive-scsi0-0-0-0 is not found

....

Hmm, maybe, just add one more argument to error_setg:

error_setg(errp, 12345, "Device %s is not found", device_name);

- it's enough grep-able.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
@ 2020-04-01 12:14   ` Vladimir Sementsov-Ogievskiy
  2020-04-01 14:01   ` Alex Bennée
  2020-04-01 15:05   ` Markus Armbruster
  2 siblings, 0 replies; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-01 12:14 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel; +Cc: Philippe Mathieu-Daudé

01.04.2020 15:10, Vladimir Sementsov-Ogievskiy wrote:
> 01.04.2020 12:02, Markus Armbruster wrote:
>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>
>> * &error_fatal, &error_abort for convenience
>>
>> * Error can optionally store hints
>>
>> * Pointlessly different names: error_prepend() vs. g_error_prefix() and
>>    so forth *shrug*
>>
>> * Propagating errors
>>
>>    Thanks to Vladimir, we'll soon have "auto propagation", which is less
>>    verbose and less error-prone.
>>
>> * Accumulating errors
>>
>>    error_propagate() permits it, g_propagate_error() does not[*].
>>
>>    I believe this feature is used rarely.  Perhaps we'd be better off
>>    without it.  The problem is identifying its uses.  If I remember
>>    correctly, Vladimir struggled with that for his "auto propagation"
>>    work.
>>
>>    Perhaps "auto propagation" will reduce the number of manual
>>    error_propagate() to the point where we can identify accumulations.
>>    Removing the feature would become feasible then.
>>
>> * Distinguishing different errors
>>
>>    Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
>>    of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>>    discouraged.  When we need callers to distinguish errors, we return
>>    suitable error codes separately.
>>
>> * Return value conventions
>>
>>    Common: non-void functions return a distinct error value on failure
>>    when such a value can be defined.  Patterns:
>>
>>    - Functions returning non-null pointers on success return null pointer
>>      on failure.
>>
>>    - Functions returning non-negative integers on success return a
>>      negative error code on failure.
>>
>>    Different: GLib discourages void functions, because these lead to
>>    awkward error checking code.  We have tons of them, and tons of
>>    awkward error checking code:
>>
>>      Error *err = NULL;
>>      frobnicate(arg, &err);
>>      if (err) {
>>          ... recover ...
>>          error_propagate(errp, err);
>>      }
>>
>>    instead of
>>
>>      if (!frobnicate(arg, errp))
>>          ... recover ...
>>      }
>>
>>    Can also lead to pointless creation of Error objects.
>>
>>    I consider this a design mistake.  Can we still fix it?  We have more
>>    than 2000 void functions taking an Error ** parameter...
>>
>>    Transforming code that receives and checks for errors with Coccinelle
>>    shouldn't be hard.  Transforming code that returns errors seems more
>>    difficult.  We need to transform explicit and implicit return to
>>    either return true or return false, depending on what we did to the
>>    @errp parameter on the way to the return.  Hmm.
>>
>>
>> [*] According to documentation; the code merely calls g_warning() then,
>> in typical GLib fashion.
>>
> 
> 
> Side question:
> 
> Can we somehow implement a possibility to reliably identify file and line number
> where error is set by error message?
> 
> It's where debug of error-bugs always starts: try to imagine which parts of the error
> message are "%s", and how to grep for it in the code, keeping in mind also,
> that error massage may be split into several lines..
> 
> Put file:line into each error? Seems too noisy for users.. A lot of errors are not
> bugs: use do something wrong and see the error, and understands what he is doing
> wrong.. It's not usual practice to print file:line into each message for user.
> 
> 
> But what if we do some kind of mapping file:line <-> error code, so user will see
> something like:
> 
> 
>     Error 12345: Device drive-scsi0-0-0-0 is not found
> 
> ....
> 
> Hmm, maybe, just add one more argument to error_setg:
> 
> error_setg(errp, 12345, "Device %s is not found", device_name);
> 
> - it's enough grep-able.
> 


Or, maybe, go the hard way, add a script, which smartly search for error message in code-base,
correctly handling "%s" and multi-line messages.. And add to checkpatch a check
that new added error messages doesn't match to any of existing.

Hmm. I think, I'll try to do something like this.



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01  9:02 Questionable aspects of QEMU Error's design Markus Armbruster
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
@ 2020-04-01 12:44 ` Daniel P. Berrangé
  2020-04-01 12:47   ` Vladimir Sementsov-Ogievskiy
  2020-04-01 15:34   ` Markus Armbruster
  2020-04-01 20:15 ` Peter Maydell
  2020-04-04  7:59 ` Markus Armbruster
  3 siblings, 2 replies; 41+ messages in thread
From: Daniel P. Berrangé @ 2020-04-01 12:44 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé, qemu-devel

On Wed, Apr 01, 2020 at 11:02:11AM +0200, Markus Armbruster wrote:
> QEMU's Error was patterned after GLib's GError.  Differences include:
> 
> * &error_fatal, &error_abort for convenience

I think this doesn't really need to exist, and is an artifact
of the later point "return values" where we commonly make methds
return void.  If we adopted a non-void return value, then these
are no longer so compelling.

Consider if we didn't have &error_fatal right now, then we would
need to

   Error *local_err = NULL;
   qemu_boot_set(boot_once, &local_err)
   if (*local_err)
      abort();

This is tedious, so we invented &error_abort to make our lives
better

   qemu_boot_set(boot_once, &error_abort)


If we had a "bool" return value though, we would probably have just
ended up doing:

   assert(qemu_boot_set(boot_once, NULL));

or

   if (!qemu_boot_set(boot_once, NULL))
       abort()

and would never have invented &error_fatal.

> * Distinguishing different errors
> 
>   Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
>   of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>   discouraged.  When we need callers to distinguish errors, we return
>   suitable error codes separately.

The GQuark is just a static string, and in most cases this ends up being
defined per-file, or sometimes per functional group. So essentially you
can consider it to approximately a source file in most cases. The code
is a constant of some arbitrary type that is generally considered to be
scoped within the context of the GQuark domain.

> * Return value conventions
> 
>   Common: non-void functions return a distinct error value on failure
>   when such a value can be defined.  Patterns:
> 
>   - Functions returning non-null pointers on success return null pointer
>     on failure.
> 
>   - Functions returning non-negative integers on success return a
>     negative error code on failure.
> 
>   Different: GLib discourages void functions, because these lead to
>   awkward error checking code.  We have tons of them, and tons of
>   awkward error checking code:
> 
>     Error *err = NULL;
>     frobnicate(arg, &err);
>     if (err) {
>         ... recover ...
>         error_propagate(errp, err);
>     }

Yeah, I really dislike this verbose style...

> 
>   instead of
> 
>     if (!frobnicate(arg, errp))
>         ... recover ...
>     }

...so I've followed this style for any code I've written in QEMU
where possible.

> 
>   Can also lead to pointless creation of Error objects.
> 
>   I consider this a design mistake.  Can we still fix it?  We have more
>   than 2000 void functions taking an Error ** parameter...

Even if we don't do full conversion, we can at least encourage the
simpler style - previously reviewers have told me to rewrite code
to use the more verbose style, which I resisted. So at the very
least setting the expectations for preferred style is useful.

>   Transforming code that receives and checks for errors with Coccinelle
>   shouldn't be hard.  Transforming code that returns errors seems more
>   difficult.  We need to transform explicit and implicit return to
>   either return true or return false, depending on what we did to the
>   @errp parameter on the way to the return.  Hmm.

Even if we only converted methods which are currently void, that
would be a notable benefit I think.

It is a shame we didn't just use GError from the start, but I guess
its probably too late to consider changing that now.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 12:44 ` Daniel P. Berrangé
@ 2020-04-01 12:47   ` Vladimir Sementsov-Ogievskiy
  2020-04-01 15:34   ` Markus Armbruster
  1 sibling, 0 replies; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-01 12:47 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: Philippe Mathieu-Daudé, qemu-devel

01.04.2020 15:44, Daniel P. Berrangé wrote:
> On Wed, Apr 01, 2020 at 11:02:11AM +0200, Markus Armbruster wrote:
>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>
>> * &error_fatal, &error_abort for convenience
> 
> I think this doesn't really need to exist, and is an artifact
> of the later point "return values" where we commonly make methds
> return void.  If we adopted a non-void return value, then these
> are no longer so compelling.
> 
> Consider if we didn't have &error_fatal right now, then we would
> need to
> 
>     Error *local_err = NULL;
>     qemu_boot_set(boot_once, &local_err)
>     if (*local_err)
>        abort();
> 
> This is tedious, so we invented &error_abort to make our lives
> better
> 
>     qemu_boot_set(boot_once, &error_abort)
> 
> 
> If we had a "bool" return value though, we would probably have just
> ended up doing:
> 
>     assert(qemu_boot_set(boot_once, NULL));

But error_abort is better: it crashes where error fired and backtrace
in core file is backtrace of and error.

> 
> or
> 
>     if (!qemu_boot_set(boot_once, NULL))
>         abort()
> 
> and would never have invented &error_fatal.
> 
>> * Distinguishing different errors
>>
>>    Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
>>    of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>>    discouraged.  When we need callers to distinguish errors, we return
>>    suitable error codes separately.
> 
> The GQuark is just a static string, and in most cases this ends up being
> defined per-file, or sometimes per functional group. So essentially you
> can consider it to approximately a source file in most cases. The code
> is a constant of some arbitrary type that is generally considered to be
> scoped within the context of the GQuark domain.
> 
>> * Return value conventions
>>
>>    Common: non-void functions return a distinct error value on failure
>>    when such a value can be defined.  Patterns:
>>
>>    - Functions returning non-null pointers on success return null pointer
>>      on failure.
>>
>>    - Functions returning non-negative integers on success return a
>>      negative error code on failure.
>>
>>    Different: GLib discourages void functions, because these lead to
>>    awkward error checking code.  We have tons of them, and tons of
>>    awkward error checking code:
>>
>>      Error *err = NULL;
>>      frobnicate(arg, &err);
>>      if (err) {
>>          ... recover ...
>>          error_propagate(errp, err);
>>      }
> 
> Yeah, I really dislike this verbose style...
> 
>>
>>    instead of
>>
>>      if (!frobnicate(arg, errp))
>>          ... recover ...
>>      }
> 
> ...so I've followed this style for any code I've written in QEMU
> where possible.
> 
>>
>>    Can also lead to pointless creation of Error objects.
>>
>>    I consider this a design mistake.  Can we still fix it?  We have more
>>    than 2000 void functions taking an Error ** parameter...
> 
> Even if we don't do full conversion, we can at least encourage the
> simpler style - previously reviewers have told me to rewrite code
> to use the more verbose style, which I resisted. So at the very
> least setting the expectations for preferred style is useful.
> 
>>    Transforming code that receives and checks for errors with Coccinelle
>>    shouldn't be hard.  Transforming code that returns errors seems more
>>    difficult.  We need to transform explicit and implicit return to
>>    either return true or return false, depending on what we did to the
>>    @errp parameter on the way to the return.  Hmm.
> 
> Even if we only converted methods which are currently void, that
> would be a notable benefit I think.
> 
> It is a shame we didn't just use GError from the start, but I guess
> its probably too late to consider changing that now.
> 
> Regards,
> Daniel
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
  2020-04-01 12:14   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-01 14:01   ` Alex Bennée
  2020-04-01 15:49     ` Markus Armbruster
  2020-04-01 15:05   ` Markus Armbruster
  2 siblings, 1 reply; 41+ messages in thread
From: Alex Bennée @ 2020-04-01 14:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Philippe Mathieu-Daudé, Markus Armbruster, qemu-devel


Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 01.04.2020 12:02, Markus Armbruster wrote:
>> QEMU's Error was patterned after GLib's GError.  Differences include:
>> * &error_fatal, &error_abort for convenience
>> * Error can optionally store hints
>> * Pointlessly different names: error_prepend() vs. g_error_prefix()
>> and
>>    so forth *shrug*
>> * Propagating errors
>>    Thanks to Vladimir, we'll soon have "auto propagation", which is
>> less
>>    verbose and less error-prone.
>> * Accumulating errors
>>    error_propagate() permits it, g_propagate_error() does not[*].
>>    I believe this feature is used rarely.  Perhaps we'd be better
>> off
>>    without it.  The problem is identifying its uses.  If I remember
>>    correctly, Vladimir struggled with that for his "auto propagation"
>>    work.
>>    Perhaps "auto propagation" will reduce the number of manual
>>    error_propagate() to the point where we can identify accumulations.
>>    Removing the feature would become feasible then.
>> * Distinguishing different errors
>>    Where Error has ErrorClass, GError has Gquark domain, gint code.
>> Use
>>    of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>>    discouraged.  When we need callers to distinguish errors, we return
>>    suitable error codes separately.
>> * Return value conventions
>>    Common: non-void functions return a distinct error value on
>> failure
>>    when such a value can be defined.  Patterns:
>>    - Functions returning non-null pointers on success return null
>> pointer
>>      on failure.
>>    - Functions returning non-negative integers on success return a
>>      negative error code on failure.
>>    Different: GLib discourages void functions, because these lead to
>>    awkward error checking code.  We have tons of them, and tons of
>>    awkward error checking code:
>>      Error *err = NULL;
>>      frobnicate(arg, &err);
>>      if (err) {
>>          ... recover ...
>>          error_propagate(errp, err);
>>      }
>>    instead of
>>      if (!frobnicate(arg, errp))
>>          ... recover ...
>>      }
>>    Can also lead to pointless creation of Error objects.
>>    I consider this a design mistake.  Can we still fix it?  We have
>> more
>>    than 2000 void functions taking an Error ** parameter...
>>    Transforming code that receives and checks for errors with
>> Coccinelle
>>    shouldn't be hard.  Transforming code that returns errors seems more
>>    difficult.  We need to transform explicit and implicit return to
>>    either return true or return false, depending on what we did to the
>>    @errp parameter on the way to the return.  Hmm.
>> 
>> [*] According to documentation; the code merely calls g_warning() then,
>> in typical GLib fashion.
>> 
>
>
> Side question:
>
> Can we somehow implement a possibility to reliably identify file and line number
> where error is set by error message?
>
> It's where debug of error-bugs always starts: try to imagine which parts of the error
> message are "%s", and how to grep for it in the code, keeping in mind also,
> that error massage may be split into several lines..
>
> Put file:line into each error? Seems too noisy for users.. A lot of errors are not
> bugs: use do something wrong and see the error, and understands what he is doing
> wrong.. It's not usual practice to print file:line into each message
> for user.

I tend to use __func__ for these things as the result is usually easily
grep-able.


>
>
> But what if we do some kind of mapping file:line <-> error code, so user will see
> something like:
>
>
>    Error 12345: Device drive-scsi0-0-0-0 is not found
>
> ....
>
> Hmm, maybe, just add one more argument to error_setg:
>
> error_setg(errp, 12345, "Device %s is not found", device_name);
>
> - it's enough grep-able.


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
  2020-04-01 12:14   ` Vladimir Sementsov-Ogievskiy
  2020-04-01 14:01   ` Alex Bennée
@ 2020-04-01 15:05   ` Markus Armbruster
  2 siblings, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-01 15:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy; +Cc: Philippe Mathieu-Daudé, qemu-devel

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> Side question:
>
> Can we somehow implement a possibility to reliably identify file and line number
> where error is set by error message?
>
> It's where debug of error-bugs always starts: try to imagine which parts of the error
> message are "%s", and how to grep for it in the code, keeping in mind also,
> that error massage may be split into several lines..
>
> Put file:line into each error? Seems too noisy for users.. A lot of errors are not
> bugs: use do something wrong and see the error, and understands what he is doing
> wrong.. It's not usual practice to print file:line into each message for user.
>
>
> But what if we do some kind of mapping file:line <-> error code, so user will see
> something like:
>
>
>    Error 12345: Device drive-scsi0-0-0-0 is not found
>
> ....
>
> Hmm, maybe, just add one more argument to error_setg:
>
> error_setg(errp, 12345, "Device %s is not found", device_name);
>
> - it's enough grep-able.

error_setg() already records source file and line number in the Error
object, so that error_handle_fatal(&error_abort, err) can report them.

Making the programmer pick and pass an error ID at every call site is
onerous.  More so when the error ID should be globally unique.

With GLib's domain, code, the code needs only be unique within the
domain, but you still have to define globally unique domains.
Differently onerous.

We could have -msg,debug=on make error_report_err() report the Error
object's source file and line number.  Doesn't help for the many direct
uses of error_report().  To cover those, we'd have to turn
error_report() into a macro, similar to how error_setg() works.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 12:44 ` Daniel P. Berrangé
  2020-04-01 12:47   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-01 15:34   ` Markus Armbruster
  1 sibling, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-01 15:34 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé, qemu-devel

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Apr 01, 2020 at 11:02:11AM +0200, Markus Armbruster wrote:
>> QEMU's Error was patterned after GLib's GError.  Differences include:
>> 
>> * &error_fatal, &error_abort for convenience
>
> I think this doesn't really need to exist, and is an artifact
> of the later point "return values" where we commonly make methds
> return void.  If we adopted a non-void return value, then these
> are no longer so compelling.
>
> Consider if we didn't have &error_fatal right now, then we would
> need to
>
>    Error *local_err = NULL;
>    qemu_boot_set(boot_once, &local_err)
>    if (*local_err)
>       abort();
>
> This is tedious, so we invented &error_abort to make our lives
> better
>
>    qemu_boot_set(boot_once, &error_abort)
>
>
> If we had a "bool" return value though, we would probably have just
> ended up doing:
>
>    assert(qemu_boot_set(boot_once, NULL));

This assumes !defined(NDEBUG).

> or
>
>    if (!qemu_boot_set(boot_once, NULL))
>        abort()
>
> and would never have invented &error_fatal.

Yes, the readability improvement of &error_abort over this is only
marginal.

However, &error_abort also results in more useful stack backtraces, as
Vladimir already pointed out.

Our use of error_propagate() sabotages this advantage.  Vladimir's auto
propagation work stops that.

>> * Distinguishing different errors
>> 
>>   Where Error has ErrorClass, GError has Gquark domain, gint code.  Use
>>   of ErrorClass other than ERROR_CLASS_GENERIC_ERROR is strongly
>>   discouraged.  When we need callers to distinguish errors, we return
>>   suitable error codes separately.
>
> The GQuark is just a static string, and in most cases this ends up being
> defined per-file, or sometimes per functional group. So essentially you
> can consider it to approximately a source file in most cases. The code
> is a constant of some arbitrary type that is generally considered to be
> scoped within the context of the GQuark domain.
>
>> * Return value conventions
>> 
>>   Common: non-void functions return a distinct error value on failure
>>   when such a value can be defined.  Patterns:
>> 
>>   - Functions returning non-null pointers on success return null pointer
>>     on failure.
>> 
>>   - Functions returning non-negative integers on success return a
>>     negative error code on failure.
>> 
>>   Different: GLib discourages void functions, because these lead to
>>   awkward error checking code.  We have tons of them, and tons of
>>   awkward error checking code:
>> 
>>     Error *err = NULL;
>>     frobnicate(arg, &err);
>>     if (err) {
>>         ... recover ...
>>         error_propagate(errp, err);
>>     }
>
> Yeah, I really dislike this verbose style...
>
>> 
>>   instead of
>> 
>>     if (!frobnicate(arg, errp))
>>         ... recover ...
>>     }
>
> ...so I've followed this style for any code I've written in QEMU
> where possible.
>
>> 
>>   Can also lead to pointless creation of Error objects.
>> 
>>   I consider this a design mistake.  Can we still fix it?  We have more
>>   than 2000 void functions taking an Error ** parameter...
>
> Even if we don't do full conversion, we can at least encourage the
> simpler style - previously reviewers have told me to rewrite code
> to use the more verbose style, which I resisted. So at the very
> least setting the expectations for preferred style is useful.

It's a matter of patching the big comment in error.h.

Of course, the non-preferred style will still be copied from bad
examples until we get rid of them.

>>   Transforming code that receives and checks for errors with Coccinelle
>>   shouldn't be hard.  Transforming code that returns errors seems more
>>   difficult.  We need to transform explicit and implicit return to
>>   either return true or return false, depending on what we did to the
>>   @errp parameter on the way to the return.  Hmm.
>
> Even if we only converted methods which are currently void, that
> would be a notable benefit I think.

Manual conversion of a modest set of frequently used functions with
automated conversion of its calls should be feasible.

For more, I believe we need to figure out how to automatically transform
code that returns errors.

> It is a shame we didn't just use GError from the start, but I guess
> its probably too late to consider changing that now.

If I remember correctly, error.h predates our adoption of GLib.  Water
under the bridge now.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 14:01   ` Alex Bennée
@ 2020-04-01 15:49     ` Markus Armbruster
  0 siblings, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-01 15:49 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	Markus Armbruster, qemu-devel

Alex Bennée <alex.bennee@linaro.org> writes:

> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>
>> Side question:
>>
>> Can we somehow implement a possibility to reliably identify file and line number
>> where error is set by error message?
>>
>> It's where debug of error-bugs always starts: try to imagine which parts of the error
>> message are "%s", and how to grep for it in the code, keeping in mind also,
>> that error massage may be split into several lines..
>>
>> Put file:line into each error? Seems too noisy for users.. A lot of errors are not
>> bugs: use do something wrong and see the error, and understands what he is doing
>> wrong.. It's not usual practice to print file:line into each message
>> for user.
>
> I tend to use __func__ for these things as the result is usually easily
> grep-able.

Putting __func__ in error messages makes them both greppable and ugly.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01  9:02 Questionable aspects of QEMU Error's design Markus Armbruster
  2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
  2020-04-01 12:44 ` Daniel P. Berrangé
@ 2020-04-01 20:15 ` Peter Maydell
  2020-04-02  5:31   ` Vladimir Sementsov-Ogievskiy
  2020-04-02  5:54   ` Markus Armbruster
  2020-04-04  7:59 ` Markus Armbruster
  3 siblings, 2 replies; 41+ messages in thread
From: Peter Maydell @ 2020-04-01 20:15 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	QEMU Developers

On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>
> QEMU's Error was patterned after GLib's GError.  Differences include:

From my POV the major problem with Error as we have it today
is that it makes the simple process of writing code like
device realize functions horrifically boilerplate heavy;
for instance this is from hw/arm/armsse.c:

        object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                 "memory", &err);
        if (err) {
            error_propagate(errp, err);
            return;
        }
        object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
        if (err) {
            error_propagate(errp, err);
            return;
        }
        object_property_set_bool(cpuobj, true, "realized", &err);
        if (err) {
            error_propagate(errp, err);
            return;
        }

16 lines of code just to set 2 properties on an object
and realize it. It's a lot of boilerplate and as
a result we frequently get it wrong or take shortcuts
(eg forgetting the error-handling entirely, calling
error_propagate just once for a whole sequence of
calls, taking the lazy approach and using err_abort
or err_fatal when we ought really to be propagating
an error, etc). I haven't looked at 'auto propagation'
yet, hopefully it will help?

thanks
-- PMM


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 20:15 ` Peter Maydell
@ 2020-04-02  5:31   ` Vladimir Sementsov-Ogievskiy
  2020-04-02  9:36     ` BALATON Zoltan
  2020-04-02  5:54   ` Markus Armbruster
  1 sibling, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-02  5:31 UTC (permalink / raw)
  To: Peter Maydell, Markus Armbruster
  Cc: Philippe Mathieu-Daudé, QEMU Developers

01.04.2020 23:15, Peter Maydell wrote:
> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> QEMU's Error was patterned after GLib's GError.  Differences include:
> 
>  From my POV the major problem with Error as we have it today
> is that it makes the simple process of writing code like
> device realize functions horrifically boilerplate heavy;
> for instance this is from hw/arm/armsse.c:
> 
>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                   "memory", &err);
>          if (err) {
>              error_propagate(errp, err);
>              return;
>          }
>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>          if (err) {
>              error_propagate(errp, err);
>              return;
>          }
>          object_property_set_bool(cpuobj, true, "realized", &err);
>          if (err) {
>              error_propagate(errp, err);
>              return;
>          }
> 
> 16 lines of code just to set 2 properties on an object
> and realize it. It's a lot of boilerplate and as
> a result we frequently get it wrong or take shortcuts
> (eg forgetting the error-handling entirely, calling
> error_propagate just once for a whole sequence of
> calls, taking the lazy approach and using err_abort
> or err_fatal when we ought really to be propagating
> an error, etc). I haven't looked at 'auto propagation'
> yet, hopefully it will help?

Yes, after it the code above will look like this:

... some_func(..., errp)
{
     ERRP_AUTO_PROPAGATE(); # magic macro at function start, and no "Error *err" definition

...
           object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                    "memory", errp);
           if (*errp) {
               return;
           }
           object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
           if (*errp) {
               return;
           }
           object_property_set_bool(cpuobj, true, "realized", errp);
           if (*errp) {
               return;
           }
...
}

  - propagation is automatic, errp is used directly and may be safely dereferenced.



-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01 20:15 ` Peter Maydell
  2020-04-02  5:31   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02  5:54   ` Markus Armbruster
  2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
                       ` (2 more replies)
  1 sibling, 3 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-02  5:54 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	QEMU Developers

Peter Maydell <peter.maydell@linaro.org> writes:

> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> QEMU's Error was patterned after GLib's GError.  Differences include:
>
> From my POV the major problem with Error as we have it today
> is that it makes the simple process of writing code like
> device realize functions horrifically boilerplate heavy;
> for instance this is from hw/arm/armsse.c:
>
>         object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                  "memory", &err);
>         if (err) {
>             error_propagate(errp, err);
>             return;
>         }
>         object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>         if (err) {
>             error_propagate(errp, err);
>             return;
>         }
>         object_property_set_bool(cpuobj, true, "realized", &err);
>         if (err) {
>             error_propagate(errp, err);
>             return;
>         }
>
> 16 lines of code just to set 2 properties on an object
> and realize it. It's a lot of boilerplate and as
> a result we frequently get it wrong or take shortcuts
> (eg forgetting the error-handling entirely, calling
> error_propagate just once for a whole sequence of
> calls, taking the lazy approach and using err_abort
> or err_fatal when we ought really to be propagating
> an error, etc). I haven't looked at 'auto propagation'
> yet, hopefully it will help?

With that, you can have

        object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                 "memory", errp);
        if (*errp) {
            return;
        }
        object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
        if (*errp) {
            return;
        }
        object_property_set_bool(cpuobj, true, "realized", errp);
        if (*errp) {
            return;
        }

but you have to add

        ERRP_AUTO_PROPAGATE();

right at the beginning of the function.

It's a small improvement.  A bigger one is

        if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                     "memory", errp)) {
            return;
        }
        if (object_property_set_link(cpuobj, OBJECT(s), "idau", errp)) {
            return;
        }
        if (object_property_set_bool(cpuobj, true, "realized", errp)) {
            return;
        }

This is item "Return value conventions" in the message you replied to.

Elsewhere in this thread, I discussed the difficulties of automating the
conversion to this style.  I think I know how to automate converting the
calls to use the bool return value, but converting the functions to
return it looks hard.  We could do that manually for a modest set of
frequently used functions.  object.h would top my list.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  5:54   ` Markus Armbruster
@ 2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
  2020-04-02  8:11       ` Peter Maydell
  2020-04-02  8:47     ` Daniel P. Berrangé
  2020-04-02 14:33     ` Eric Blake
  2 siblings, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-02  6:11 UTC (permalink / raw)
  To: Markus Armbruster, Peter Maydell
  Cc: Philippe Mathieu-Daudé, QEMU Developers

02.04.2020 8:54, Markus Armbruster wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
>> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>
>>  From my POV the major problem with Error as we have it today
>> is that it makes the simple process of writing code like
>> device realize functions horrifically boilerplate heavy;
>> for instance this is from hw/arm/armsse.c:
>>
>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                   "memory", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>>          object_property_set_bool(cpuobj, true, "realized", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>>
>> 16 lines of code just to set 2 properties on an object
>> and realize it. It's a lot of boilerplate and as
>> a result we frequently get it wrong or take shortcuts
>> (eg forgetting the error-handling entirely, calling
>> error_propagate just once for a whole sequence of
>> calls, taking the lazy approach and using err_abort
>> or err_fatal when we ought really to be propagating
>> an error, etc). I haven't looked at 'auto propagation'
>> yet, hopefully it will help?
> 
> With that, you can have
> 
>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                   "memory", errp);
>          if (*errp) {
>              return;
>          }
>          object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>          if (*errp) {
>              return;
>          }
>          object_property_set_bool(cpuobj, true, "realized", errp);
>          if (*errp) {
>              return;
>          }
> 
> but you have to add
> 
>          ERRP_AUTO_PROPAGATE();
> 
> right at the beginning of the function.
> 
> It's a small improvement.  A bigger one is
> 
>          if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                       "memory", errp)) {
>              return;

return false; you mean, assuming we converting the caller too...

>          }
>          if (object_property_set_link(cpuobj, OBJECT(s), "idau", errp)) {
>              return;
>          }
>          if (object_property_set_bool(cpuobj, true, "realized", errp)) {
>              return;
>          }

Hmm..

Somehow, in general, especially with long function names and long parameter lists I prefer

ret = func(..);
if (ret < 0) {
     return ret;
}

notation to moving the entire function call into if condition.. Extra level of parentheses complicates the view of the code.

But, yes, may be for boolean functions it looks a bit strange:

ok = func(..);
if (!ok) {
     return false;
}


> 
> This is item "Return value conventions" in the message you replied to.
> 
> Elsewhere in this thread, I discussed the difficulties of automating the
> conversion to this style.  I think I know how to automate converting the
> calls to use the bool return value, but converting the functions to
> return it looks hard.  We could do that manually for a modest set of
> frequently used functions.  object.h would top my list.
> 

Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.

Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02  8:11       ` Peter Maydell
  2020-04-02  8:49         ` Daniel P. Berrangé
  2020-04-02  8:55         ` Markus Armbruster
  0 siblings, 2 replies; 41+ messages in thread
From: Peter Maydell @ 2020-04-02  8:11 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Philippe Mathieu-Daudé, Markus Armbruster, QEMU Developers

On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
<vsementsov@virtuozzo.com> wrote:
> Somehow, in general, especially with long function names and long parameter lists I prefer
>
> ret = func(..);
> if (ret < 0) {
>      return ret;
> }

Personally I prefer the other approach -- this one has an extra line
in the source and
needs an extra local variable.

> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>
> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.

When would we want to return an errno? I thought the whole point of the
Error* was that that was where information about the error was returned.
If all your callsites are just going to do "if (ret < 0) { ... } then having
the functions pick an errno value to return is just extra work.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  5:54   ` Markus Armbruster
  2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02  8:47     ` Daniel P. Berrangé
  2020-04-02  9:19       ` Alex Bennée
  2020-04-02 14:33     ` Eric Blake
  2 siblings, 1 reply; 41+ messages in thread
From: Daniel P. Berrangé @ 2020-04-02  8:47 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Philippe Mathieu-Daudé,
	QEMU Developers

On Thu, Apr 02, 2020 at 07:54:11AM +0200, Markus Armbruster wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
> > On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
> >>
> >> QEMU's Error was patterned after GLib's GError.  Differences include:
> >
> > From my POV the major problem with Error as we have it today
> > is that it makes the simple process of writing code like
> > device realize functions horrifically boilerplate heavy;
> > for instance this is from hw/arm/armsse.c:
> >
> >         object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
> >                                  "memory", &err);
> >         if (err) {
> >             error_propagate(errp, err);
> >             return;
> >         }
> >         object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
> >         if (err) {
> >             error_propagate(errp, err);
> >             return;
> >         }
> >         object_property_set_bool(cpuobj, true, "realized", &err);
> >         if (err) {
> >             error_propagate(errp, err);
> >             return;
> >         }
> >
> > 16 lines of code just to set 2 properties on an object
> > and realize it. It's a lot of boilerplate and as
> > a result we frequently get it wrong or take shortcuts
> > (eg forgetting the error-handling entirely, calling
> > error_propagate just once for a whole sequence of
> > calls, taking the lazy approach and using err_abort
> > or err_fatal when we ought really to be propagating
> > an error, etc). I haven't looked at 'auto propagation'
> > yet, hopefully it will help?
> 
> With that, you can have
> 
>         object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                  "memory", errp);
>         if (*errp) {
>             return;
>         }
>         object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>         if (*errp) {
>             return;
>         }
>         object_property_set_bool(cpuobj, true, "realized", errp);
>         if (*errp) {
>             return;
>         }
> 
> but you have to add
> 
>         ERRP_AUTO_PROPAGATE();
> 
> right at the beginning of the function.
> 
> It's a small improvement.  A bigger one is
> 
>         if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                      "memory", errp)) {
>             return;
>         }
>         if (object_property_set_link(cpuobj, OBJECT(s), "idau", errp)) {
>             return;
>         }
>         if (object_property_set_bool(cpuobj, true, "realized", errp)) {
>             return;
>         }
> 
> This is item "Return value conventions" in the message you replied to.

Even better, we can then string the checks together

        if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                      "memory", errp) ||
            object_property_set_link(cpuobj, OBJECT(s), "idau", errp) ||
            object_property_set_bool(cpuobj, true, "realized", errp)) {
             return;
        }
 
Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  8:11       ` Peter Maydell
@ 2020-04-02  8:49         ` Daniel P. Berrangé
  2020-04-02  8:55         ` Markus Armbruster
  1 sibling, 0 replies; 41+ messages in thread
From: Daniel P. Berrangé @ 2020-04-02  8:49 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	Markus Armbruster, QEMU Developers

On Thu, Apr 02, 2020 at 08:11:11AM +0000, Peter Maydell wrote:
> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
> <vsementsov@virtuozzo.com> wrote:
> > Somehow, in general, especially with long function names and long parameter lists I prefer
> >
> > ret = func(..);
> > if (ret < 0) {
> >      return ret;
> > }
> 
> Personally I prefer the other approach -- this one has an extra line
> in the source and
> needs an extra local variable.
> 
> > Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
> >
> > Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
> 
> When would we want to return an errno? I thought the whole point of the
> Error* was that that was where information about the error was returned.
> If all your callsites are just going to do "if (ret < 0) { ... } then having
> the functions pick an errno value to return is just extra work.

IMHO errno should only be returned if the callers are likely to have 
a genuine need to take action based on the errno in a way that isn't
possible from the Error alone. I expect very few scenarios need that.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  8:11       ` Peter Maydell
  2020-04-02  8:49         ` Daniel P. Berrangé
@ 2020-04-02  8:55         ` Markus Armbruster
  2020-04-02 14:35           ` Vladimir Sementsov-Ogievskiy
  2020-04-02 18:57           ` Paolo Bonzini
  1 sibling, 2 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-02  8:55 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	QEMU Developers

Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
> <vsementsov@virtuozzo.com> wrote:
>> Somehow, in general, especially with long function names and long parameter lists I prefer
>>
>> ret = func(..);
>> if (ret < 0) {
>>      return ret;
>> }
>
> Personally I prefer the other approach -- this one has an extra line
> in the source and
> needs an extra local variable.

Me too, except when func(...) is so long that

    if (func(...) < 0) {

becomes illegible like

    if (func(... yadda, yadda,
             yadda, ...) < 0) {

Then an extra variable can improve things.

>> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>>
>> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
>
> When would we want to return an errno? I thought the whole point of the
> Error* was that that was where information about the error was returned.
> If all your callsites are just going to do "if (ret < 0) { ... } then having
> the functions pick an errno value to return is just extra work.

0/-1 vs. true/false is a matter of convention.  Lacking convention, it's
a matter of taste.

0/-1 vs. 0/-errno depends on the function and its callers.  When -errno
enables callers to distinguish failures in a sane and simple way, use
it.  When -errno feels "natural", I'd say feel free to use it even when
all existing callers only check < 0.

When you return non-null/null or true/false on success/error, neglecting
to document that in a function contract can perhaps be tolerated; you're
using the return type the common, obvious way.  But when you return 0/-1
or 0/-errno, please spell it out.  I've seen too many "Operation not
permitted" that were actually -1 mistaken for -EPERM.  Also too many
functions that return -1 for some failures and -errno for others.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  8:47     ` Daniel P. Berrangé
@ 2020-04-02  9:19       ` Alex Bennée
  0 siblings, 0 replies; 41+ messages in thread
From: Alex Bennée @ 2020-04-02  9:19 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Philippe Mathieu-Daudé,
	Markus Armbruster, qemu-devel


Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Apr 02, 2020 at 07:54:11AM +0200, Markus Armbruster wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> 
>> > On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>> >>
>> >> QEMU's Error was patterned after GLib's GError.  Differences include:
>> >
>> > From my POV the major problem with Error as we have it today
>> > is that it makes the simple process of writing code like
>> > device realize functions horrifically boilerplate heavy;
>> > for instance this is from hw/arm/armsse.c:
>> >
>> >         object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>> >                                  "memory", &err);
>> >         if (err) {
>> >             error_propagate(errp, err);
>> >             return;
>> >         }
>> >         object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>> >         if (err) {
>> >             error_propagate(errp, err);
>> >             return;
>> >         }
>> >         object_property_set_bool(cpuobj, true, "realized", &err);
>> >         if (err) {
>> >             error_propagate(errp, err);
>> >             return;
>> >         }
>> >
>> > 16 lines of code just to set 2 properties on an object
>> > and realize it. It's a lot of boilerplate and as
>> > a result we frequently get it wrong or take shortcuts
>> > (eg forgetting the error-handling entirely, calling
>> > error_propagate just once for a whole sequence of
>> > calls, taking the lazy approach and using err_abort
>> > or err_fatal when we ought really to be propagating
>> > an error, etc). I haven't looked at 'auto propagation'
>> > yet, hopefully it will help?
>> 
>> With that, you can have
>> 
>>         object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                  "memory", errp);
>>         if (*errp) {
>>             return;
>>         }
>>         object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>>         if (*errp) {
>>             return;
>>         }
>>         object_property_set_bool(cpuobj, true, "realized", errp);
>>         if (*errp) {
>>             return;
>>         }
>> 
>> but you have to add
>> 
>>         ERRP_AUTO_PROPAGATE();
>> 
>> right at the beginning of the function.
>> 
>> It's a small improvement.  A bigger one is
>> 
>>         if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                      "memory", errp)) {
>>             return;
>>         }
>>         if (object_property_set_link(cpuobj, OBJECT(s), "idau", errp)) {
>>             return;
>>         }
>>         if (object_property_set_bool(cpuobj, true, "realized", errp)) {
>>             return;
>>         }
>> 
>> This is item "Return value conventions" in the message you replied to.
>
> Even better, we can then string the checks together
>
>         if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                       "memory", errp) ||
>             object_property_set_link(cpuobj, OBJECT(s), "idau", errp) ||
>             object_property_set_bool(cpuobj, true, "realized", errp)) {
>              return;
>         }

You know at this point I wonder if we can't come up with some data table
that describes all these object interactions and a helper function that
processes it and tells us if it worked or not?

We are essentially just filling out an data structure anyway with all
this stuff.

>  
> Regards,
> Daniel


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  5:31   ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02  9:36     ` BALATON Zoltan
  2020-04-02 14:11       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 41+ messages in thread
From: BALATON Zoltan @ 2020-04-02  9:36 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	Markus Armbruster, QEMU Developers

On Thu, 2 Apr 2020, Vladimir Sementsov-Ogievskiy wrote:
> 01.04.2020 23:15, Peter Maydell wrote:
>> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>> 
>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>
>>  From my POV the major problem with Error as we have it today
>> is that it makes the simple process of writing code like
>> device realize functions horrifically boilerplate heavy;
>> for instance this is from hw/arm/armsse.c:
>>
>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                   "memory", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>>          object_property_set_bool(cpuobj, true, "realized", &err);
>>          if (err) {
>>              error_propagate(errp, err);
>>              return;
>>          }
>> 
>> 16 lines of code just to set 2 properties on an object
>> and realize it. It's a lot of boilerplate and as
>> a result we frequently get it wrong or take shortcuts
>> (eg forgetting the error-handling entirely, calling
>> error_propagate just once for a whole sequence of
>> calls, taking the lazy approach and using err_abort
>> or err_fatal when we ought really to be propagating
>> an error, etc). I haven't looked at 'auto propagation'
>> yet, hopefully it will help?
>
> Yes, after it the code above will look like this:
>
> ... some_func(..., errp)
> {
>    ERRP_AUTO_PROPAGATE(); # magic macro at function start, and no "Error 
> *err" definition
>
> ...
>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                   "memory", errp);
>          if (*errp) {
>              return;
>          }
>          object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>          if (*errp) {
>              return;
>          }
>          object_property_set_bool(cpuobj, true, "realized", errp);
>          if (*errp) {
>              return;
>          }
> ...
> }
>
> - propagation is automatic, errp is used directly and may be safely 
> dereferenced.

Not much better. Could it be something like:

     ERRP_RET(object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                       "memory", errp));
     ERRP_RET(object_property_set_link(cpuobj, OBJECT(s), "idau", errp));
     ERRP_RET(object_property_set_bool(cpuobj, true, "realized", errp));

Regards,
BALATON Zoltan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  9:36     ` BALATON Zoltan
@ 2020-04-02 14:11       ` Vladimir Sementsov-Ogievskiy
  2020-04-02 14:34         ` Markus Armbruster
  0 siblings, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-02 14:11 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: Peter Maydell, Philippe Mathieu-Daudé,
	Markus Armbruster, QEMU Developers

02.04.2020 12:36, BALATON Zoltan wrote:
> On Thu, 2 Apr 2020, Vladimir Sementsov-Ogievskiy wrote:
>> 01.04.2020 23:15, Peter Maydell wrote:
>>> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>>
>>>  From my POV the major problem with Error as we have it today
>>> is that it makes the simple process of writing code like
>>> device realize functions horrifically boilerplate heavy;
>>> for instance this is from hw/arm/armsse.c:
>>>
>>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>                                   "memory", &err);
>>>          if (err) {
>>>              error_propagate(errp, err);
>>>              return;
>>>          }
>>>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>>>          if (err) {
>>>              error_propagate(errp, err);
>>>              return;
>>>          }
>>>          object_property_set_bool(cpuobj, true, "realized", &err);
>>>          if (err) {
>>>              error_propagate(errp, err);
>>>              return;
>>>          }
>>>
>>> 16 lines of code just to set 2 properties on an object
>>> and realize it. It's a lot of boilerplate and as
>>> a result we frequently get it wrong or take shortcuts
>>> (eg forgetting the error-handling entirely, calling
>>> error_propagate just once for a whole sequence of
>>> calls, taking the lazy approach and using err_abort
>>> or err_fatal when we ought really to be propagating
>>> an error, etc). I haven't looked at 'auto propagation'
>>> yet, hopefully it will help?
>>
>> Yes, after it the code above will look like this:
>>
>> ... some_func(..., errp)
>> {
>>    ERRP_AUTO_PROPAGATE(); # magic macro at function start, and no "Error *err" definition
>>
>> ...
>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                   "memory", errp);
>>          if (*errp) {
>>              return;
>>          }
>>          object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>>          if (*errp) {
>>              return;
>>          }
>>          object_property_set_bool(cpuobj, true, "realized", errp);
>>          if (*errp) {
>>              return;
>>          }
>> ...
>> }
>>
>> - propagation is automatic, errp is used directly and may be safely dereferenced.
> 
> Not much better. Could it be something like:

Actually, much better, as it solves some real problems around error propagation.

> 
>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                        "memory", errp));
>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(s), "idau", errp));
>      ERRP_RET(object_property_set_bool(cpuobj, true, "realized", errp));
> 

and turn all

ret = func(...);
if (ret < 0) {
     return ret;
}

into

FAIL_RET(func(...))

?

Not a problem to make such macro.. But I think it's a bad idea to turn all the code
into sequence of macro invocations. It's hard to debug and follow.


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  5:54   ` Markus Armbruster
  2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
  2020-04-02  8:47     ` Daniel P. Berrangé
@ 2020-04-02 14:33     ` Eric Blake
  2 siblings, 0 replies; 41+ messages in thread
From: Eric Blake @ 2020-04-02 14:33 UTC (permalink / raw)
  To: Markus Armbruster, Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	QEMU Developers

On 4/2/20 12:54 AM, Markus Armbruster wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:

> It's a small improvement.  A bigger one is
> 
>          if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>                                       "memory", errp)) {
>              return;
>          }
>          if (object_property_set_link(cpuobj, OBJECT(s), "idau", errp)) {
>              return;
>          }
>          if (object_property_set_bool(cpuobj, true, "realized", errp)) {
>              return;
>          }

Or even:

if (object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
                                       "memory", errp) ||
     object_property_set_link(cpuobj, OBJECT(s), "idau", errp) ||
     object_property_set_bool(cpuobj, true, "realized", errp)) {
         return;
}

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 14:11       ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02 14:34         ` Markus Armbruster
  2020-04-02 15:28           ` BALATON Zoltan
  0 siblings, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2020-04-02 14:34 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé, QEMU Developers

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 02.04.2020 12:36, BALATON Zoltan wrote:
>> On Thu, 2 Apr 2020, Vladimir Sementsov-Ogievskiy wrote:
>>> 01.04.2020 23:15, Peter Maydell wrote:
>>>> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>
>>>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>>>
>>>>  From my POV the major problem with Error as we have it today
>>>> is that it makes the simple process of writing code like
>>>> device realize functions horrifically boilerplate heavy;
>>>> for instance this is from hw/arm/armsse.c:
>>>>
>>>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>>                                   "memory", &err);
>>>>          if (err) {
>>>>              error_propagate(errp, err);
>>>>              return;
>>>>          }
>>>>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>>>>          if (err) {
>>>>              error_propagate(errp, err);
>>>>              return;
>>>>          }
>>>>          object_property_set_bool(cpuobj, true, "realized", &err);
>>>>          if (err) {
>>>>              error_propagate(errp, err);
>>>>              return;
>>>>          }
>>>>
>>>> 16 lines of code just to set 2 properties on an object
>>>> and realize it. It's a lot of boilerplate and as
>>>> a result we frequently get it wrong or take shortcuts
>>>> (eg forgetting the error-handling entirely, calling
>>>> error_propagate just once for a whole sequence of
>>>> calls, taking the lazy approach and using err_abort
>>>> or err_fatal when we ought really to be propagating
>>>> an error, etc). I haven't looked at 'auto propagation'
>>>> yet, hopefully it will help?
>>>
>>> Yes, after it the code above will look like this:
>>>
>>> ... some_func(..., errp)
>>> {
>>>    ERRP_AUTO_PROPAGATE(); # magic macro at function start, and no "Error *err" definition
>>>
>>> ...
>>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>                                   "memory", errp);
>>>          if (*errp) {
>>>              return;
>>>          }
>>>          object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>>>          if (*errp) {
>>>              return;
>>>          }
>>>          object_property_set_bool(cpuobj, true, "realized", errp);
>>>          if (*errp) {
>>>              return;
>>>          }
>>> ...
>>> }
>>>
>>> - propagation is automatic, errp is used directly and may be safely dereferenced.
>>
>> Not much better. Could it be something like:
>
> Actually, much better, as it solves some real problems around error propagation.

The auto propagation patches' stated aim is to fix &error_fatal not to
eat hints, and to provide more useful stack backtraces with
&error_abort.  The slight shrinking of boilerplate is a welcome bonus.

For a bigger improvement, have the functions return a useful value, as
discussed elsewhere in this thread.

>>
>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>                                        "memory", errp));
>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(s), "idau", errp));
>>      ERRP_RET(object_property_set_bool(cpuobj, true, "realized", errp));
>>
>
> and turn all
>
> ret = func(...);
> if (ret < 0) {
>     return ret;
> }
>
> into
>
> FAIL_RET(func(...))
>
> ?
>
> Not a problem to make such macro.. But I think it's a bad idea to turn all the code
> into sequence of macro invocations. It's hard to debug and follow.

Yes.  Hiding control flow in macros is almost always too much magic.
There are exceptions, but this doesn't look like one.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  8:55         ` Markus Armbruster
@ 2020-04-02 14:35           ` Vladimir Sementsov-Ogievskiy
  2020-04-02 15:06             ` Markus Armbruster
  2020-04-02 18:57           ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-02 14:35 UTC (permalink / raw)
  To: Markus Armbruster, Peter Maydell
  Cc: Philippe Mathieu-Daudé, QEMU Developers

02.04.2020 11:55, Markus Armbruster wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
>> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
>> <vsementsov@virtuozzo.com> wrote:
>>> Somehow, in general, especially with long function names and long parameter lists I prefer
>>>
>>> ret = func(..);
>>> if (ret < 0) {
>>>       return ret;
>>> }
>>
>> Personally I prefer the other approach -- this one has an extra line
>> in the source and
>> needs an extra local variable.
> 
> Me too, except when func(...) is so long that
> 
>      if (func(...) < 0) {
> 
> becomes illegible like
> 
>      if (func(... yadda, yadda,
>               yadda, ...) < 0) {
> 
> Then an extra variable can improve things.
> 
>>> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>>>
>>> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
>>
>> When would we want to return an errno? I thought the whole point of the
>> Error* was that that was where information about the error was returned.
>> If all your callsites are just going to do "if (ret < 0) { ... } then having
>> the functions pick an errno value to return is just extra work.
> 
> 0/-1 vs. true/false is a matter of convention.  Lacking convention, it's
> a matter of taste. >
> 0/-1 vs. 0/-errno depends on the function and its callers.  When -errno
> enables callers to distinguish failures in a sane and simple way, use
> it.  When -errno feels "natural", I'd say feel free to use it even when
> all existing callers only check < 0.
> 
> When you return non-null/null or true/false on success/error, neglecting
> to document that in a function contract can perhaps be tolerated; you're
> using the return type the common, obvious way.  But when you return 0/-1
> or 0/-errno, please spell it out.  I've seen too many "Operation not
> permitted" that were actually -1 mistaken for -EPERM.  Also too many
> functions that return -1 for some failures and -errno for others.
> 

I just want to add one note:

OK, you like the pattern

   if (func()) {
       <handle error>
   }

, I can live with it.

I believe, we have a lot of such patterns already in code.

Now, we are going to add a lot of functions, returning true on success and false on failure, so add a lot of patterns

   if (!func()) {
       <handle error>
   }

---

After it, looking at something like

   if (!func()) {} / if (func()) {}

I'll have to always jump to function definition, to check is it int or bool function, to understand what exactly is meant and is there a mistake in the code..
So, I'm afraid that such conversion will not help reviewing/understanding the code. I'd prefer to avoid using two opposite conventions in on project.

I can also imagine combining different function types (int/bool) in if conditions o_O, what will save us from it?

And don't forget about bool functions, which just check something, and false is not an error, but just negative answer on some question.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 14:35           ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02 15:06             ` Markus Armbruster
  2020-04-02 17:17               ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2020-04-02 15:06 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé, QEMU Developers

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 02.04.2020 11:55, Markus Armbruster wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
>>> <vsementsov@virtuozzo.com> wrote:
>>>> Somehow, in general, especially with long function names and long parameter lists I prefer
>>>>
>>>> ret = func(..);
>>>> if (ret < 0) {
>>>>       return ret;
>>>> }
>>>
>>> Personally I prefer the other approach -- this one has an extra line
>>> in the source and
>>> needs an extra local variable.
>>
>> Me too, except when func(...) is so long that
>>
>>      if (func(...) < 0) {
>>
>> becomes illegible like
>>
>>      if (func(... yadda, yadda,
>>               yadda, ...) < 0) {
>>
>> Then an extra variable can improve things.
>>
>>>> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>>>>
>>>> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
>>>
>>> When would we want to return an errno? I thought the whole point of the
>>> Error* was that that was where information about the error was returned.
>>> If all your callsites are just going to do "if (ret < 0) { ... } then having
>>> the functions pick an errno value to return is just extra work.
>>
>> 0/-1 vs. true/false is a matter of convention.  Lacking convention, it's
>> a matter of taste. >
>> 0/-1 vs. 0/-errno depends on the function and its callers.  When -errno
>> enables callers to distinguish failures in a sane and simple way, use
>> it.  When -errno feels "natural", I'd say feel free to use it even when
>> all existing callers only check < 0.
>>
>> When you return non-null/null or true/false on success/error, neglecting
>> to document that in a function contract can perhaps be tolerated; you're
>> using the return type the common, obvious way.  But when you return 0/-1
>> or 0/-errno, please spell it out.  I've seen too many "Operation not
>> permitted" that were actually -1 mistaken for -EPERM.  Also too many
>> functions that return -1 for some failures and -errno for others.
>>
>
> I just want to add one note:
>
> OK, you like the pattern
>
>   if (func()) {
>       <handle error>
>   }
>
> , I can live with it.
>
> I believe, we have a lot of such patterns already in code.
>
> Now, we are going to add a lot of functions, returning true on success and false on failure, so add a lot of patterns
>
>   if (!func()) {
>       <handle error>
>   }

We already have this pattern all over the place with functions returning
non-null pointers on success, null pointer on failure.

> ---
>
> After it, looking at something like
>
>   if (!func()) {} / if (func()) {}
>
> I'll have to always jump to function definition, to check is it int or bool function, to understand what exactly is meant and is there a mistake in the code..
> So, I'm afraid that such conversion will not help reviewing/understanding the code. I'd prefer to avoid using two opposite conventions in on project.

C sucks :)

Conventions help mitigate.  Here's one: when fun() returns
non-negative/negative on success/error, always use

    fun(...) < 0

or

    fun(...) >= 0

to check.  I dislike the latter.

When returns 0/negative, always use

    fun(...) < 0

Avoid

    fun(...) >= 0

because that suggests it could return a positive value, which is wrong.

Avoid

    fun(...)

because that requires the reader to know the return type.

When its returns non-null/null or true/false on success/failure, always
use

    !fun(...)

Avoid

    fun(...)

because that requires the reader to know the return type.

Note that we have a usable check for failure in all cases.  Goes well
with the habit to handle errors like this whenever practical

    if (failed) {
        handle failure
        bail out
    }
    handle success

> I can also imagine combining different function types (int/bool) in if conditions o_O, what will save us from it?

Can you give an example?

> And don't forget about bool functions, which just check something, and false is not an error, but just negative answer on some question.

For what it's worth, GLib specifically advises against such function
contracts:

    By convention, if you return a boolean value indicating success then
    TRUE means success and FALSE means failure.  Avoid creating
    functions which have a boolean return value and a GError parameter,
    but where the boolean does something other than signal whether the
    GError is set.  Among other problems, it requires C callers to
    allocate a temporary error.  Instead, provide a "gboolean *" out
    parameter.  There are functions in GLib itself such as
    g_key_file_has_key() that are deprecated because of this.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 14:34         ` Markus Armbruster
@ 2020-04-02 15:28           ` BALATON Zoltan
  2020-04-03  7:09             ` Markus Armbruster
  0 siblings, 1 reply; 41+ messages in thread
From: BALATON Zoltan @ 2020-04-02 15:28 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Philippe Mathieu-Daudé,
	QEMU Developers

[-- Attachment #1: Type: text/plain, Size: 4603 bytes --]

On Thu, 2 Apr 2020, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>> 02.04.2020 12:36, BALATON Zoltan wrote:
>>> On Thu, 2 Apr 2020, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.04.2020 23:15, Peter Maydell wrote:
>>>>> On Wed, 1 Apr 2020 at 10:03, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>>
>>>>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>>>>
>>>>>  From my POV the major problem with Error as we have it today
>>>>> is that it makes the simple process of writing code like
>>>>> device realize functions horrifically boilerplate heavy;
>>>>> for instance this is from hw/arm/armsse.c:
>>>>>
>>>>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>>>                                   "memory", &err);
>>>>>          if (err) {
>>>>>              error_propagate(errp, err);
>>>>>              return;
>>>>>          }
>>>>>          object_property_set_link(cpuobj, OBJECT(s), "idau", &err);
>>>>>          if (err) {
>>>>>              error_propagate(errp, err);
>>>>>              return;
>>>>>          }
>>>>>          object_property_set_bool(cpuobj, true, "realized", &err);
>>>>>          if (err) {
>>>>>              error_propagate(errp, err);
>>>>>              return;
>>>>>          }
>>>>>
>>>>> 16 lines of code just to set 2 properties on an object
>>>>> and realize it. It's a lot of boilerplate and as
>>>>> a result we frequently get it wrong or take shortcuts
>>>>> (eg forgetting the error-handling entirely, calling
>>>>> error_propagate just once for a whole sequence of
>>>>> calls, taking the lazy approach and using err_abort
>>>>> or err_fatal when we ought really to be propagating
>>>>> an error, etc). I haven't looked at 'auto propagation'
>>>>> yet, hopefully it will help?
>>>>
>>>> Yes, after it the code above will look like this:
>>>>
>>>> ... some_func(..., errp)
>>>> {
>>>>    ERRP_AUTO_PROPAGATE(); # magic macro at function start, and no "Error *err" definition
>>>>
>>>> ...
>>>>          object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>>                                   "memory", errp);
>>>>          if (*errp) {
>>>>              return;
>>>>          }
>>>>          object_property_set_link(cpuobj, OBJECT(s), "idau", errp);
>>>>          if (*errp) {
>>>>              return;
>>>>          }
>>>>          object_property_set_bool(cpuobj, true, "realized", errp);
>>>>          if (*errp) {
>>>>              return;
>>>>          }
>>>> ...
>>>> }
>>>>
>>>> - propagation is automatic, errp is used directly and may be safely dereferenced.
>>>
>>> Not much better. Could it be something like:
>>
>> Actually, much better, as it solves some real problems around error propagation.
>
> The auto propagation patches' stated aim is to fix &error_fatal not to
> eat hints, and to provide more useful stack backtraces with
> &error_abort.  The slight shrinking of boilerplate is a welcome bonus.
>
> For a bigger improvement, have the functions return a useful value, as
> discussed elsewhere in this thread.
>
>>>
>>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>                                        "memory", errp));
>>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(s), "idau", errp));
>>>      ERRP_RET(object_property_set_bool(cpuobj, true, "realized", errp));
>>>
>>
>> and turn all
>>
>> ret = func(...);
>> if (ret < 0) {
>>     return ret;
>> }
>>
>> into
>>
>> FAIL_RET(func(...))
>>
>> ?
>>
>> Not a problem to make such macro.. But I think it's a bad idea to turn all the code
>> into sequence of macro invocations. It's hard to debug and follow.
>
> Yes.  Hiding control flow in macros is almost always too much magic.
> There are exceptions, but this doesn't look like one.

I did't like this idea of mine too much either so I agree but I see no 
other easy way to simplify this. If you propose changing function return 
values maybe these should return errp instead of passing it as a func 
parameter? Could that result in simpler code and less macro magic needed?

Regards,
BALATON Zoltan

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 15:06             ` Markus Armbruster
@ 2020-04-02 17:17               ` Vladimir Sementsov-Ogievskiy
  2020-04-03  7:48                 ` Markus Armbruster
  0 siblings, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-02 17:17 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Philippe Mathieu-Daudé, QEMU Developers

02.04.2020 18:06, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
> 
>> 02.04.2020 11:55, Markus Armbruster wrote:
>>> Peter Maydell <peter.maydell@linaro.org> writes:
>>>
>>>> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
>>>> <vsementsov@virtuozzo.com> wrote:
>>>>> Somehow, in general, especially with long function names and long parameter lists I prefer
>>>>>
>>>>> ret = func(..);
>>>>> if (ret < 0) {
>>>>>        return ret;
>>>>> }
>>>>
>>>> Personally I prefer the other approach -- this one has an extra line
>>>> in the source and
>>>> needs an extra local variable.
>>>
>>> Me too, except when func(...) is so long that
>>>
>>>       if (func(...) < 0) {
>>>
>>> becomes illegible like
>>>
>>>       if (func(... yadda, yadda,
>>>                yadda, ...) < 0) {
>>>
>>> Then an extra variable can improve things.
>>>
>>>>> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>>>>>
>>>>> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
>>>>
>>>> When would we want to return an errno? I thought the whole point of the
>>>> Error* was that that was where information about the error was returned.
>>>> If all your callsites are just going to do "if (ret < 0) { ... } then having
>>>> the functions pick an errno value to return is just extra work.
>>>
>>> 0/-1 vs. true/false is a matter of convention.  Lacking convention, it's
>>> a matter of taste. >
>>> 0/-1 vs. 0/-errno depends on the function and its callers.  When -errno
>>> enables callers to distinguish failures in a sane and simple way, use
>>> it.  When -errno feels "natural", I'd say feel free to use it even when
>>> all existing callers only check < 0.
>>>
>>> When you return non-null/null or true/false on success/error, neglecting
>>> to document that in a function contract can perhaps be tolerated; you're
>>> using the return type the common, obvious way.  But when you return 0/-1
>>> or 0/-errno, please spell it out.  I've seen too many "Operation not
>>> permitted" that were actually -1 mistaken for -EPERM.  Also too many
>>> functions that return -1 for some failures and -errno for others.
>>>
>>
>> I just want to add one note:
>>
>> OK, you like the pattern
>>
>>    if (func()) {
>>        <handle error>
>>    }
>>
>> , I can live with it.
>>
>> I believe, we have a lot of such patterns already in code.
>>
>> Now, we are going to add a lot of functions, returning true on success and false on failure, so add a lot of patterns
>>
>>    if (!func()) {
>>        <handle error>
>>    }
> 
> We already have this pattern all over the place with functions returning
> non-null pointers on success, null pointer on failure.

Functions returning pointer are simpler to distinguish by name.

Hmm, strange. But this pattern lose the pointer.. You mean

ptr = func();
if (!ptr) {
   <handle error>
}

this is much more understandable. Usually ptr variable name and function name - all will help to understand that it's all about pointer.


> 
>> ---
>>
>> After it, looking at something like
>>
>>    if (!func()) {} / if (func()) {}
>>
>> I'll have to always jump to function definition, to check is it int or bool function, to understand what exactly is meant and is there a mistake in the code..
>> So, I'm afraid that such conversion will not help reviewing/understanding the code. I'd prefer to avoid using two opposite conventions in on project.
> 
> C sucks :)
> 
> Conventions help mitigate.  Here's one: when fun() returns
> non-negative/negative on success/error, always use
> 
>      fun(...) < 0

This reduces chances that it fit in one line..

But yes, if all use this convention, it makes it obvious what happening.

> 
> or
> 
>      fun(...) >= 0
> 
> to check.  I dislike the latter.
> 
> When returns 0/negative, always use
> 
>      fun(...) < 0
> 
> Avoid
> 
>      fun(...) >= 0
> 
> because that suggests it could return a positive value, which is wrong.
> 
> Avoid
> 
>      fun(...)
> 
> because that requires the reader to know the return type.

Exactly the problem I mention. To follow your suggestion, we'll have to update
the whole code base.. However, why not.

> 
> When its returns non-null/null or true/false on success/failure, always
> use
> 
>      !fun(...)
> 
> Avoid
> 
>      fun(...)
> 
> because that requires the reader to know the return type.
> 
> Note that we have a usable check for failure in all cases.  Goes well
> with the habit to handle errors like this whenever practical
> 
>      if (failed) {
>          handle failure
>          bail out
>      }
>      handle success
> 

OK. If convert everything to your suggestion it looks good.

The only possible problem is boolean functions, which just check something, not returning the error..

With a function like is_x_in_set(x, set), it's native to write

if (is_x_in_set(x, set)) {

    ...

}

which is a bit in conflict with your suggestion. Still, such functions should be simply distinguished by name.

>> I can also imagine combining different function types (int/bool) in if conditions o_O, what will save us from it?
> 
> Can you give an example?

I just meant something like if (f1() || !f2()) { < error > }, nothing special.

> 
>> And don't forget about bool functions, which just check something, and false is not an error, but just negative answer on some question.
> 
> For what it's worth, GLib specifically advises against such function
> contracts:
> 
>      By convention, if you return a boolean value indicating success then
>      TRUE means success and FALSE means failure.  Avoid creating
>      functions which have a boolean return value and a GError parameter,
>      but where the boolean does something other than signal whether the
>      GError is set.  Among other problems, it requires C callers to
>      allocate a temporary error.  Instead, provide a "gboolean *" out
>      parameter.  There are functions in GLib itself such as
>      g_key_file_has_key() that are deprecated because of this.
> 

Sounds good. But we are speaking about all functions, not only with errp parameter.. Or not?

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02  8:55         ` Markus Armbruster
  2020-04-02 14:35           ` Vladimir Sementsov-Ogievskiy
@ 2020-04-02 18:57           ` Paolo Bonzini
  1 sibling, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2020-04-02 18:57 UTC (permalink / raw)
  To: Markus Armbruster, Peter Maydell
  Cc: Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé,
	QEMU Developers

On 02/04/20 10:55, Markus Armbruster wrote:
> 
> When you return non-null/null or true/false on success/error, neglecting
> to document that in a function contract can perhaps be tolerated; you're
> using the return type the common, obvious way.  But when you return 0/-1
> or 0/-errno, please spell it out.  I've seen too many "Operation not
> permitted" that were actually -1 mistaken for -EPERM.

Hopefully that would be avoided by the usage of Error itself.

Paolo

> Also too many
> functions that return -1 for some failures and -errno for others.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 15:28           ` BALATON Zoltan
@ 2020-04-03  7:09             ` Markus Armbruster
  0 siblings, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-03  7:09 UTC (permalink / raw)
  To: BALATON Zoltan
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Philippe Mathieu-Daudé,
	QEMU Developers

BALATON Zoltan <balaton@eik.bme.hu> writes:

> On Thu, 2 Apr 2020, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>>> 02.04.2020 12:36, BALATON Zoltan wrote:
[...]
>>>> Not much better. Could it be something like:
[...]
>>>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(&s->cpu_container[i]),
>>>>                                        "memory", errp));
>>>>      ERRP_RET(object_property_set_link(cpuobj, OBJECT(s), "idau", errp));
>>>>      ERRP_RET(object_property_set_bool(cpuobj, true, "realized", errp));
>>>>
>>>
>>> and turn all
>>>
>>> ret = func(...);
>>> if (ret < 0) {
>>>     return ret;
>>> }
>>>
>>> into
>>>
>>> FAIL_RET(func(...))
>>>
>>> ?
>>>
>>> Not a problem to make such macro.. But I think it's a bad idea to turn all the code
>>> into sequence of macro invocations. It's hard to debug and follow.
>>
>> Yes.  Hiding control flow in macros is almost always too much magic.
>> There are exceptions, but this doesn't look like one.
>
> I did't like this idea of mine too much either so I agree but I see no
> other easy way to simplify this. If you propose changing function
> return values maybe these should return errp instead of passing it as
> a func parameter? Could that result in simpler code and less macro
> magic needed?

I don't think so.  Let's compare a few error handling patterns from
error.h.

 * Call a function and receive an error from it:
 *     Error *err = NULL;
 *     foo(arg, &err);
 *     if (err) {
 *         handle the error...
 *     }

Making foo() return the error object instead of void looks like an
obvious win:

       if (foo(arg)) {
           handle the error...
       }

Except it does not work, because surely a use of @err is hiding in
"handle the error" (or else we'd leak it).  So you actually need

       err = foo(arg)
       if (err) {
           handle the error...
       }

All this saves you is initializing @err.

You can save a bit more

       if ((err = foo(arg))) {
           handle the error...
       }

We generally frown upon assignemnts within if controlling expressions.

Next:

 * Receive an error and pass it on to the caller:
 [...]
 * But when all you do with the error is pass it on, please use
 *     foo(arg, errp);
 * for readability.

The obvious conversion

       *errp = foo(arg);

is wrong for !errp, errp == &error_fatal, and errp == &error_abort.
You'd need

       err = foo(arg);
       error_propagate(errp, err);

or, if you're so inclined

       error_propagate(errp, foo(arg));

Less legible, and it creates and destroys error objects where the
current method doesn't.

Returning the error object is even less attractive for functions that
now return values.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-02 17:17               ` Vladimir Sementsov-Ogievskiy
@ 2020-04-03  7:48                 ` Markus Armbruster
  0 siblings, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-03  7:48 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé, QEMU Developers

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 02.04.2020 18:06, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>>
>>> 02.04.2020 11:55, Markus Armbruster wrote:
>>>> Peter Maydell <peter.maydell@linaro.org> writes:
>>>>
>>>>> On Thu, 2 Apr 2020 at 07:11, Vladimir Sementsov-Ogievskiy
>>>>> <vsementsov@virtuozzo.com> wrote:
>>>>>> Somehow, in general, especially with long function names and long parameter lists I prefer
>>>>>>
>>>>>> ret = func(..);
>>>>>> if (ret < 0) {
>>>>>>        return ret;
>>>>>> }
>>>>>
>>>>> Personally I prefer the other approach -- this one has an extra line
>>>>> in the source and
>>>>> needs an extra local variable.
>>>>
>>>> Me too, except when func(...) is so long that
>>>>
>>>>       if (func(...) < 0) {
>>>>
>>>> becomes illegible like
>>>>
>>>>       if (func(... yadda, yadda,
>>>>                yadda, ...) < 0) {
>>>>
>>>> Then an extra variable can improve things.
>>>>
>>>>>> Are you sure that adding a lot of boolean functions is a good idea? I somehow feel better with more usual int functions with -errno on failure.
>>>>>>
>>>>>> Bool is a good return value for functions which are boolean by nature: checks, is something correspond to some criteria. But for reporting an error I'd prefer -errno.
>>>>>
>>>>> When would we want to return an errno? I thought the whole point of the
>>>>> Error* was that that was where information about the error was returned.
>>>>> If all your callsites are just going to do "if (ret < 0) { ... } then having
>>>>> the functions pick an errno value to return is just extra work.
>>>>
>>>> 0/-1 vs. true/false is a matter of convention.  Lacking convention, it's
>>>> a matter of taste. >
>>>> 0/-1 vs. 0/-errno depends on the function and its callers.  When -errno
>>>> enables callers to distinguish failures in a sane and simple way, use
>>>> it.  When -errno feels "natural", I'd say feel free to use it even when
>>>> all existing callers only check < 0.
>>>>
>>>> When you return non-null/null or true/false on success/error, neglecting
>>>> to document that in a function contract can perhaps be tolerated; you're
>>>> using the return type the common, obvious way.  But when you return 0/-1
>>>> or 0/-errno, please spell it out.  I've seen too many "Operation not
>>>> permitted" that were actually -1 mistaken for -EPERM.  Also too many
>>>> functions that return -1 for some failures and -errno for others.
>>>>
>>>
>>> I just want to add one note:
>>>
>>> OK, you like the pattern
>>>
>>>    if (func()) {
>>>        <handle error>
>>>    }
>>>
>>> , I can live with it.
>>>
>>> I believe, we have a lot of such patterns already in code.
>>>
>>> Now, we are going to add a lot of functions, returning true on success and false on failure, so add a lot of patterns
>>>
>>>    if (!func()) {
>>>        <handle error>
>>>    }
>>
>> We already have this pattern all over the place with functions returning
>> non-null pointers on success, null pointer on failure.
>
> Functions returning pointer are simpler to distinguish by name.
>
> Hmm, strange. But this pattern lose the pointer.. You mean
>
> ptr = func();
> if (!ptr) {
>   <handle error>
> }

Yes, when you actually need the pointer.  Common, but not universal.

> this is much more understandable. Usually ptr variable name and function name - all will help to understand that it's all about pointer.
>
>
>>
>>> ---
>>>
>>> After it, looking at something like
>>>
>>>    if (!func()) {} / if (func()) {}
>>>
>>> I'll have to always jump to function definition, to check is it int or bool function, to understand what exactly is meant and is there a mistake in the code..
>>> So, I'm afraid that such conversion will not help reviewing/understanding the code. I'd prefer to avoid using two opposite conventions in on project.
>>
>> C sucks :)
>>
>> Conventions help mitigate.  Here's one: when fun() returns
>> non-negative/negative on success/error, always use
>>
>>      fun(...) < 0
>
> This reduces chances that it fit in one line..

Yes, that's a drawback.

> But yes, if all use this convention, it makes it obvious what happening.
>
>>
>> or
>>
>>      fun(...) >= 0
>>
>> to check.  I dislike the latter.
>>
>> When returns 0/negative, always use
>>
>>      fun(...) < 0
>>
>> Avoid
>>
>>      fun(...) >= 0
>>
>> because that suggests it could return a positive value, which is wrong.
>>
>> Avoid
>>
>>      fun(...)
>>
>> because that requires the reader to know the return type.
>
> Exactly the problem I mention. To follow your suggestion, we'll have to update
> the whole code base.. However, why not.

Only if we value the consistency more than the update labor and churn.

>>
>> When its returns non-null/null or true/false on success/failure, always
>> use
>>
>>      !fun(...)
>>
>> Avoid
>>
>>      fun(...)
>>
>> because that requires the reader to know the return type.
>>
>> Note that we have a usable check for failure in all cases.  Goes well
>> with the habit to handle errors like this whenever practical
>>
>>      if (failed) {
>>          handle failure
>>          bail out
>>      }
>>      handle success
>>
>
> OK. If convert everything to your suggestion it looks good.
>
> The only possible problem is boolean functions, which just check something, not returning the error..
>
> With a function like is_x_in_set(x, set), it's native to write
>
> if (is_x_in_set(x, set)) {
>
>    ...
>
> }
>
> which is a bit in conflict with your suggestion. Still, such functions should be simply distinguished by name.

The conventions I described only apply to error checking.  I certainly
don't mean to discourage things like

    while (*p && qemu_isspace(*p)) {
        p++;
    }

Also, they're *conventions*, not law.  The purpose of conventions is to
help us write clearer code.  Don't make code less clear just to conform
to a convention.  Use your judgement.

>>> I can also imagine combining different function types (int/bool) in if conditions o_O, what will save us from it?
>>
>> Can you give an example?
>
> I just meant something like if (f1() || !f2()) { < error > }, nothing special.

Perfectly consistent error checking style, idiomatic C, pick one.

>>> And don't forget about bool functions, which just check something, and false is not an error, but just negative answer on some question.
>>
>> For what it's worth, GLib specifically advises against such function
>> contracts:
>>
>>      By convention, if you return a boolean value indicating success then
>>      TRUE means success and FALSE means failure.  Avoid creating
>>      functions which have a boolean return value and a GError parameter,
>>      but where the boolean does something other than signal whether the
>>      GError is set.  Among other problems, it requires C callers to
>>      allocate a temporary error.  Instead, provide a "gboolean *" out
>>      parameter.  There are functions in GLib itself such as
>>      g_key_file_has_key() that are deprecated because of this.
>>
>
> Sounds good. But we are speaking about all functions, not only with errp parameter.. Or not?

The conventions I described apply to error checking, regardless of Error
object use.  For instance, they'd apply to error-checking
remove("some-file").



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-01  9:02 Questionable aspects of QEMU Error's design Markus Armbruster
                   ` (2 preceding siblings ...)
  2020-04-01 20:15 ` Peter Maydell
@ 2020-04-04  7:59 ` Markus Armbruster
  2020-04-04 10:59   ` Markus Armbruster
                     ` (2 more replies)
  3 siblings, 3 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-04  7:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé

Markus Armbruster <armbru@redhat.com> writes:

> QEMU's Error was patterned after GLib's GError.  Differences include:
[...]
> * Return value conventions
>
>   Common: non-void functions return a distinct error value on failure
>   when such a value can be defined.  Patterns:
>
>   - Functions returning non-null pointers on success return null pointer
>     on failure.
>
>   - Functions returning non-negative integers on success return a
>     negative error code on failure.
>
>   Different: GLib discourages void functions, because these lead to
>   awkward error checking code.  We have tons of them, and tons of
>   awkward error checking code:
>
>     Error *err = NULL;
>     frobnicate(arg, &err);
>     if (err) {
>         ... recover ...
>         error_propagate(errp, err);
>     }
>
>   instead of
>
>     if (!frobnicate(arg, errp))
>         ... recover ...
>     }
>
>   Can also lead to pointless creation of Error objects.
>
>   I consider this a design mistake.  Can we still fix it?  We have more
>   than 2000 void functions taking an Error ** parameter...
>
>   Transforming code that receives and checks for errors with Coccinelle
>   shouldn't be hard.  Transforming code that returns errors seems more
>   difficult.  We need to transform explicit and implicit return to
>   either return true or return false, depending on what we did to the
>   @errp parameter on the way to the return.  Hmm.
[...]

To figure out what functions with an Error ** parameter return, I used
Coccinelle to find such function definitions and print the return types.
Summary of results:

   2155 void
    873 signed integer
    494 pointer
    153 bool
     33 unsigned integer
      6 enum
   ---------------------
   3714 total

I then used Coccinelle to find checked calls of void functions (passing
&error_fatal or &error_abort is not considered "checking" here).  These
calls become simpler if we make the functions return a useful value.  I
found a bit under 600 direct calls, and some 50 indirect calls.

Most frequent direct calls:

    127 object_property_set_bool
     27 qemu_opts_absorb_qdict
     16 visit_type_str
     14 visit_type_int
     10 visit_type_uint32

Let's have a closer look at object_property_set() & friends.  Out of
almost 1000 calls, some 150 are checked.  While I'm sure many of the
unchecked calls can't actually fail, I am concerned some unchecked calls
can.

If we adopt the convention to return a value that indicates success /
failure, we should consider converting object.h to it sooner rather than
later.

Please understand these are rough numbers from quick & dirty scripts.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-04  7:59 ` Markus Armbruster
@ 2020-04-04 10:59   ` Markus Armbruster
  2020-04-06 14:05     ` Eduardo Habkost
  2020-04-06 14:10     ` Daniel P. Berrangé
  2020-04-27 15:36   ` Markus Armbruster
  2020-07-03 12:21   ` Markus Armbruster
  2 siblings, 2 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-04-04 10:59 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Daniel P. Berrangé,
	Eduardo Habkost, qemu-devel, Philippe Mathieu-Daudé

Markus Armbruster <armbru@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> QEMU's Error was patterned after GLib's GError.  Differences include:
> [...]
>> * Return value conventions
>>
>>   Common: non-void functions return a distinct error value on failure
>>   when such a value can be defined.  Patterns:
>>
>>   - Functions returning non-null pointers on success return null pointer
>>     on failure.
>>
>>   - Functions returning non-negative integers on success return a
>>     negative error code on failure.
>>
>>   Different: GLib discourages void functions, because these lead to
>>   awkward error checking code.  We have tons of them, and tons of
>>   awkward error checking code:
>>
>>     Error *err = NULL;
>>     frobnicate(arg, &err);
>>     if (err) {
>>         ... recover ...
>>         error_propagate(errp, err);
>>     }
>>
>>   instead of
>>
>>     if (!frobnicate(arg, errp))
>>         ... recover ...
>>     }
>>
>>   Can also lead to pointless creation of Error objects.
>>
>>   I consider this a design mistake.  Can we still fix it?  We have more
>>   than 2000 void functions taking an Error ** parameter...
>>
>>   Transforming code that receives and checks for errors with Coccinelle
>>   shouldn't be hard.  Transforming code that returns errors seems more
>>   difficult.  We need to transform explicit and implicit return to
>>   either return true or return false, depending on what we did to the
>>   @errp parameter on the way to the return.  Hmm.
> [...]
>
> To figure out what functions with an Error ** parameter return, I used
> Coccinelle to find such function definitions and print the return types.
> Summary of results:
>
>    2155 void
>     873 signed integer
>     494 pointer
>     153 bool
>      33 unsigned integer
>       6 enum
>    ---------------------
>    3714 total
>
> I then used Coccinelle to find checked calls of void functions (passing
> &error_fatal or &error_abort is not considered "checking" here).  These
> calls become simpler if we make the functions return a useful value.  I
> found a bit under 600 direct calls, and some 50 indirect calls.
>
> Most frequent direct calls:
>
>     127 object_property_set_bool
>      27 qemu_opts_absorb_qdict
>      16 visit_type_str
>      14 visit_type_int
>      10 visit_type_uint32
>
> Let's have a closer look at object_property_set() & friends.  Out of
> almost 1000 calls, some 150 are checked.  While I'm sure many of the
> unchecked calls can't actually fail, I am concerned some unchecked calls
> can.
>
> If we adopt the convention to return a value that indicates success /
> failure, we should consider converting object.h to it sooner rather than
> later.
>
> Please understand these are rough numbers from quick & dirty scripts.

Paolo, Daniel, Eduardo,

Please pick one for QOM:

* Do nothing.  Looks like

     object_property_set_bool(..., &err);
     if (err) {
         error_propagate(errp, err);
         return;
     }

* Return true on success, false on error.  Looks like

     if (!object_property_set_bool(..., errp)) {
         return;
     }

* Return 0 on success, -1 on error.  Looks like

     if (object_property_set_bool(..., errp) < 0) {
         return;
     }

  This is slightly more likely to require line wrapping than the
  previous one.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-04 10:59   ` Markus Armbruster
@ 2020-04-06 14:05     ` Eduardo Habkost
  2020-04-06 14:38       ` Eduardo Habkost
  2020-04-06 14:10     ` Daniel P. Berrangé
  1 sibling, 1 reply; 41+ messages in thread
From: Eduardo Habkost @ 2020-04-06 14:05 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Daniel P. Berrangé,
	qemu-devel, Paolo Bonzini, Philippe Mathieu-Daudé

On Sat, Apr 04, 2020 at 12:59:27PM +0200, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Markus Armbruster <armbru@redhat.com> writes:
> >
> >> QEMU's Error was patterned after GLib's GError.  Differences include:
> > [...]
> >> * Return value conventions
> >>
> >>   Common: non-void functions return a distinct error value on failure
> >>   when such a value can be defined.  Patterns:
> >>
> >>   - Functions returning non-null pointers on success return null pointer
> >>     on failure.
> >>
> >>   - Functions returning non-negative integers on success return a
> >>     negative error code on failure.
> >>
> >>   Different: GLib discourages void functions, because these lead to
> >>   awkward error checking code.  We have tons of them, and tons of
> >>   awkward error checking code:
> >>
> >>     Error *err = NULL;
> >>     frobnicate(arg, &err);
> >>     if (err) {
> >>         ... recover ...
> >>         error_propagate(errp, err);
> >>     }
> >>
> >>   instead of
> >>
> >>     if (!frobnicate(arg, errp))
> >>         ... recover ...
> >>     }
> >>
> >>   Can also lead to pointless creation of Error objects.
> >>
> >>   I consider this a design mistake.  Can we still fix it?  We have more
> >>   than 2000 void functions taking an Error ** parameter...
> >>
> >>   Transforming code that receives and checks for errors with Coccinelle
> >>   shouldn't be hard.  Transforming code that returns errors seems more
> >>   difficult.  We need to transform explicit and implicit return to
> >>   either return true or return false, depending on what we did to the
> >>   @errp parameter on the way to the return.  Hmm.
> > [...]
> >
> > To figure out what functions with an Error ** parameter return, I used
> > Coccinelle to find such function definitions and print the return types.
> > Summary of results:
> >
> >    2155 void
> >     873 signed integer
> >     494 pointer
> >     153 bool
> >      33 unsigned integer
> >       6 enum
> >    ---------------------
> >    3714 total
> >
> > I then used Coccinelle to find checked calls of void functions (passing
> > &error_fatal or &error_abort is not considered "checking" here).  These
> > calls become simpler if we make the functions return a useful value.  I
> > found a bit under 600 direct calls, and some 50 indirect calls.
> >
> > Most frequent direct calls:
> >
> >     127 object_property_set_bool
> >      27 qemu_opts_absorb_qdict
> >      16 visit_type_str
> >      14 visit_type_int
> >      10 visit_type_uint32
> >
> > Let's have a closer look at object_property_set() & friends.  Out of
> > almost 1000 calls, some 150 are checked.  While I'm sure many of the
> > unchecked calls can't actually fail, I am concerned some unchecked calls
> > can.
> >
> > If we adopt the convention to return a value that indicates success /
> > failure, we should consider converting object.h to it sooner rather than
> > later.
> >
> > Please understand these are rough numbers from quick & dirty scripts.
> 
> Paolo, Daniel, Eduardo,
> 
> Please pick one for QOM:

Replying this without reading the whole discussion and context:

> 
> * Do nothing.  Looks like
> 
>      object_property_set_bool(..., &err);
>      if (err) {
>          error_propagate(errp, err);
>          return;
>      }
> 
> * Return true on success, false on error.  Looks like
> 

I prefer this one.

>      if (!object_property_set_bool(..., errp)) {
>          return;
>      }
> 
> * Return 0 on success, -1 on error.  Looks like
> 
>      if (object_property_set_bool(..., errp) < 0) {
>          return;
>      }
> 
>   This is slightly more likely to require line wrapping than the
>   previous one.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-04 10:59   ` Markus Armbruster
  2020-04-06 14:05     ` Eduardo Habkost
@ 2020-04-06 14:10     ` Daniel P. Berrangé
  1 sibling, 0 replies; 41+ messages in thread
From: Daniel P. Berrangé @ 2020-04-06 14:10 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy, Eduardo Habkost,
	qemu-devel, Paolo Bonzini, Philippe Mathieu-Daudé

On Sat, Apr 04, 2020 at 12:59:27PM +0200, Markus Armbruster wrote:
> Paolo, Daniel, Eduardo,
> 
> Please pick one for QOM:

> * Return true on success, false on error.  Looks like
> 
>      if (!object_property_set_bool(..., errp)) {
>          return;
>      }

My preference is this one, since it aligns with GLib recommendations
for error reporting & is simple & concise.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-06 14:05     ` Eduardo Habkost
@ 2020-04-06 14:38       ` Eduardo Habkost
  0 siblings, 0 replies; 41+ messages in thread
From: Eduardo Habkost @ 2020-04-06 14:38 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy,
	Daniel P. Berrangé,
	qemu-devel, Paolo Bonzini, Philippe Mathieu-Daudé

On Mon, Apr 06, 2020 at 11:05:36AM -0300, Eduardo Habkost wrote:
> On Sat, Apr 04, 2020 at 12:59:27PM +0200, Markus Armbruster wrote:
[...]
> > Paolo, Daniel, Eduardo,
> > 
> > Please pick one for QOM:
> 
> Replying this without reading the whole discussion and context:
> 
> > 
> > * Do nothing.  Looks like
> > 
> >      object_property_set_bool(..., &err);
> >      if (err) {
> >          error_propagate(errp, err);
> >          return;
> >      }
> > 
> > * Return true on success, false on error.  Looks like
> > 
> 
> I prefer this one.

My weird quoting choice was probably confusing, so for the sake
of clarity: I prefer the "Return true on success, false on error"
option.

> 
> >      if (!object_property_set_bool(..., errp)) {
> >          return;
> >      }
> > 
> > * Return 0 on success, -1 on error.  Looks like
> > 
> >      if (object_property_set_bool(..., errp) < 0) {
> >          return;
> >      }
> > 
> >   This is slightly more likely to require line wrapping than the
> >   previous one.
> 
> -- 
> Eduardo

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-04  7:59 ` Markus Armbruster
  2020-04-04 10:59   ` Markus Armbruster
@ 2020-04-27 15:36   ` Markus Armbruster
  2020-04-28  5:20     ` Vladimir Sementsov-Ogievskiy
  2020-07-03 12:21   ` Markus Armbruster
  2 siblings, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2020-04-27 15:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé

Markus Armbruster <armbru@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> QEMU's Error was patterned after GLib's GError.  Differences include:
> [...]
>> * Return value conventions
>>
>>   Common: non-void functions return a distinct error value on failure
>>   when such a value can be defined.  Patterns:
>>
>>   - Functions returning non-null pointers on success return null pointer
>>     on failure.
>>
>>   - Functions returning non-negative integers on success return a
>>     negative error code on failure.
>>
>>   Different: GLib discourages void functions, because these lead to
>>   awkward error checking code.  We have tons of them, and tons of
>>   awkward error checking code:
>>
>>     Error *err = NULL;
>>     frobnicate(arg, &err);
>>     if (err) {
>>         ... recover ...
>>         error_propagate(errp, err);
>>     }
>>
>>   instead of
>>
>>     if (!frobnicate(arg, errp))
>>         ... recover ...
>>     }
>>
>>   Can also lead to pointless creation of Error objects.
>>
>>   I consider this a design mistake.  Can we still fix it?  We have more
>>   than 2000 void functions taking an Error ** parameter...
>>
>>   Transforming code that receives and checks for errors with Coccinelle
>>   shouldn't be hard.  Transforming code that returns errors seems more
>>   difficult.  We need to transform explicit and implicit return to
>>   either return true or return false, depending on what we did to the
>>   @errp parameter on the way to the return.  Hmm.
> [...]
>
> To figure out what functions with an Error ** parameter return, I used
> Coccinelle to find such function definitions and print the return types.
> Summary of results:
>
>    2155 void
>     873 signed integer
>     494 pointer
>     153 bool
>      33 unsigned integer
>       6 enum
>    ---------------------
>    3714 total
>
> I then used Coccinelle to find checked calls of void functions (passing
> &error_fatal or &error_abort is not considered "checking" here).  These
> calls become simpler if we make the functions return a useful value.  I
> found a bit under 600 direct calls, and some 50 indirect calls.
>
> Most frequent direct calls:
>
>     127 object_property_set_bool
>      27 qemu_opts_absorb_qdict
>      16 visit_type_str
>      14 visit_type_int
>      10 visit_type_uint32
>
> Let's have a closer look at object_property_set() & friends.  Out of
> almost 1000 calls, some 150 are checked.  While I'm sure many of the
> unchecked calls can't actually fail, I am concerned some unchecked calls
> can.
>
> If we adopt the convention to return a value that indicates success /
> failure, we should consider converting object.h to it sooner rather than
> later.
>
> Please understand these are rough numbers from quick & dirty scripts.

FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
running into bugs.  So far:

    [PATCH v2 for-5.1 0/9] qemu-option: Fix corner cases and clean up
    [PATCH for-5.1 0/5] qobject: Minor spring cleaning
    [PATCH v2 00/14] Miscellaneous error handling fixes
    [PATCH 0/4] Subject: [PATCH 0/4] smbus: SPD fixes
    [PATCH 0/3] fuzz: Probably there is a better way to do this
    [PATCH v2 00/15] qapi: Spring cleaning
    [PATCH 00/11] More miscellaneous error handling fixes

I got another one coming for QOM and qdev before I can post the
conversion.

Vladimir, since the conversion will mess with error_propagate(), I'd
like to get it in before your auto-propagation work.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-27 15:36   ` Markus Armbruster
@ 2020-04-28  5:20     ` Vladimir Sementsov-Ogievskiy
  2020-05-14  7:59       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-04-28  5:20 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel; +Cc: Peter Maydell, Philippe Mathieu-Daudé

27.04.2020 18:36, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
>> Markus Armbruster <armbru@redhat.com> writes:
>>
>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>> [...]
>>> * Return value conventions
>>>
>>>    Common: non-void functions return a distinct error value on failure
>>>    when such a value can be defined.  Patterns:
>>>
>>>    - Functions returning non-null pointers on success return null pointer
>>>      on failure.
>>>
>>>    - Functions returning non-negative integers on success return a
>>>      negative error code on failure.
>>>
>>>    Different: GLib discourages void functions, because these lead to
>>>    awkward error checking code.  We have tons of them, and tons of
>>>    awkward error checking code:
>>>
>>>      Error *err = NULL;
>>>      frobnicate(arg, &err);
>>>      if (err) {
>>>          ... recover ...
>>>          error_propagate(errp, err);
>>>      }
>>>
>>>    instead of
>>>
>>>      if (!frobnicate(arg, errp))
>>>          ... recover ...
>>>      }
>>>
>>>    Can also lead to pointless creation of Error objects.
>>>
>>>    I consider this a design mistake.  Can we still fix it?  We have more
>>>    than 2000 void functions taking an Error ** parameter...
>>>
>>>    Transforming code that receives and checks for errors with Coccinelle
>>>    shouldn't be hard.  Transforming code that returns errors seems more
>>>    difficult.  We need to transform explicit and implicit return to
>>>    either return true or return false, depending on what we did to the
>>>    @errp parameter on the way to the return.  Hmm.
>> [...]
>>
>> To figure out what functions with an Error ** parameter return, I used
>> Coccinelle to find such function definitions and print the return types.
>> Summary of results:
>>
>>     2155 void
>>      873 signed integer
>>      494 pointer
>>      153 bool
>>       33 unsigned integer
>>        6 enum
>>     ---------------------
>>     3714 total
>>
>> I then used Coccinelle to find checked calls of void functions (passing
>> &error_fatal or &error_abort is not considered "checking" here).  These
>> calls become simpler if we make the functions return a useful value.  I
>> found a bit under 600 direct calls, and some 50 indirect calls.
>>
>> Most frequent direct calls:
>>
>>      127 object_property_set_bool
>>       27 qemu_opts_absorb_qdict
>>       16 visit_type_str
>>       14 visit_type_int
>>       10 visit_type_uint32
>>
>> Let's have a closer look at object_property_set() & friends.  Out of
>> almost 1000 calls, some 150 are checked.  While I'm sure many of the
>> unchecked calls can't actually fail, I am concerned some unchecked calls
>> can.
>>
>> If we adopt the convention to return a value that indicates success /
>> failure, we should consider converting object.h to it sooner rather than
>> later.
>>
>> Please understand these are rough numbers from quick & dirty scripts.
> 
> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
> running into bugs.  So far:
> 
>      [PATCH v2 for-5.1 0/9] qemu-option: Fix corner cases and clean up
>      [PATCH for-5.1 0/5] qobject: Minor spring cleaning
>      [PATCH v2 00/14] Miscellaneous error handling fixes
>      [PATCH 0/4] Subject: [PATCH 0/4] smbus: SPD fixes
>      [PATCH 0/3] fuzz: Probably there is a better way to do this
>      [PATCH v2 00/15] qapi: Spring cleaning
>      [PATCH 00/11] More miscellaneous error handling fixes
> 
> I got another one coming for QOM and qdev before I can post the
> conversion.
> 
> Vladimir, since the conversion will mess with error_propagate(), I'd
> like to get it in before your auto-propagation work.
> 

OK, just let me know when to regenerate the series, it's not hard.

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-28  5:20     ` Vladimir Sementsov-Ogievskiy
@ 2020-05-14  7:59       ` Vladimir Sementsov-Ogievskiy
  2020-05-15  4:28         ` Markus Armbruster
  0 siblings, 1 reply; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-05-14  7:59 UTC (permalink / raw)
  To: Markus Armbruster, qemu-devel; +Cc: Peter Maydell, Philippe Mathieu-Daudé

28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:
> 27.04.2020 18:36, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>>
>>> Markus Armbruster <armbru@redhat.com> writes:
>>>
>>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>> [...]
>>>> * Return value conventions
>>>>
>>>>    Common: non-void functions return a distinct error value on failure
>>>>    when such a value can be defined.  Patterns:
>>>>
>>>>    - Functions returning non-null pointers on success return null pointer
>>>>      on failure.
>>>>
>>>>    - Functions returning non-negative integers on success return a
>>>>      negative error code on failure.
>>>>
>>>>    Different: GLib discourages void functions, because these lead to
>>>>    awkward error checking code.  We have tons of them, and tons of
>>>>    awkward error checking code:
>>>>
>>>>      Error *err = NULL;
>>>>      frobnicate(arg, &err);
>>>>      if (err) {
>>>>          ... recover ...
>>>>          error_propagate(errp, err);
>>>>      }
>>>>
>>>>    instead of
>>>>
>>>>      if (!frobnicate(arg, errp))
>>>>          ... recover ...
>>>>      }
>>>>
>>>>    Can also lead to pointless creation of Error objects.
>>>>
>>>>    I consider this a design mistake.  Can we still fix it?  We have more
>>>>    than 2000 void functions taking an Error ** parameter...
>>>>
>>>>    Transforming code that receives and checks for errors with Coccinelle
>>>>    shouldn't be hard.  Transforming code that returns errors seems more
>>>>    difficult.  We need to transform explicit and implicit return to
>>>>    either return true or return false, depending on what we did to the
>>>>    @errp parameter on the way to the return.  Hmm.
>>> [...]
>>>
>>> To figure out what functions with an Error ** parameter return, I used
>>> Coccinelle to find such function definitions and print the return types.
>>> Summary of results:
>>>
>>>     2155 void
>>>      873 signed integer
>>>      494 pointer
>>>      153 bool
>>>       33 unsigned integer
>>>        6 enum
>>>     ---------------------
>>>     3714 total
>>>
>>> I then used Coccinelle to find checked calls of void functions (passing
>>> &error_fatal or &error_abort is not considered "checking" here).  These
>>> calls become simpler if we make the functions return a useful value.  I
>>> found a bit under 600 direct calls, and some 50 indirect calls.
>>>
>>> Most frequent direct calls:
>>>
>>>      127 object_property_set_bool
>>>       27 qemu_opts_absorb_qdict
>>>       16 visit_type_str
>>>       14 visit_type_int
>>>       10 visit_type_uint32
>>>
>>> Let's have a closer look at object_property_set() & friends.  Out of
>>> almost 1000 calls, some 150 are checked.  While I'm sure many of the
>>> unchecked calls can't actually fail, I am concerned some unchecked calls
>>> can.
>>>
>>> If we adopt the convention to return a value that indicates success /
>>> failure, we should consider converting object.h to it sooner rather than
>>> later.
>>>
>>> Please understand these are rough numbers from quick & dirty scripts.
>>
>> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
>> running into bugs.  So far:
>>
>>      [PATCH v2 for-5.1 0/9] qemu-option: Fix corner cases and clean up
>>      [PATCH for-5.1 0/5] qobject: Minor spring cleaning
>>      [PATCH v2 00/14] Miscellaneous error handling fixes
>>      [PATCH 0/4] Subject: [PATCH 0/4] smbus: SPD fixes
>>      [PATCH 0/3] fuzz: Probably there is a better way to do this
>>      [PATCH v2 00/15] qapi: Spring cleaning
>>      [PATCH 00/11] More miscellaneous error handling fixes
>>
>> I got another one coming for QOM and qdev before I can post the
>> conversion.
>>
>> Vladimir, since the conversion will mess with error_propagate(), I'd
>> like to get it in before your auto-propagation work.
>>
> 
> OK, just let me know when to regenerate the series, it's not hard.
> 

Hi! Is all that merged? Should I resend now?

-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-05-14  7:59       ` Vladimir Sementsov-Ogievskiy
@ 2020-05-15  4:28         ` Markus Armbruster
  2020-07-03  7:38           ` Markus Armbruster
  0 siblings, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2020-05-15  4:28 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé, qemu-devel

Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:

> 28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:
>> 27.04.2020 18:36, Markus Armbruster wrote:
>>> Markus Armbruster <armbru@redhat.com> writes:
>>>
>>>> Markus Armbruster <armbru@redhat.com> writes:
>>>>
>>>>> QEMU's Error was patterned after GLib's GError.  Differences include:
>>>> [...]
>>>>> * Return value conventions
>>>>>
>>>>>    Common: non-void functions return a distinct error value on failure
>>>>>    when such a value can be defined.  Patterns:
>>>>>
>>>>>    - Functions returning non-null pointers on success return null pointer
>>>>>      on failure.
>>>>>
>>>>>    - Functions returning non-negative integers on success return a
>>>>>      negative error code on failure.
>>>>>
>>>>>    Different: GLib discourages void functions, because these lead to
>>>>>    awkward error checking code.  We have tons of them, and tons of
>>>>>    awkward error checking code:
>>>>>
>>>>>      Error *err = NULL;
>>>>>      frobnicate(arg, &err);
>>>>>      if (err) {
>>>>>          ... recover ...
>>>>>          error_propagate(errp, err);
>>>>>      }
>>>>>
>>>>>    instead of
>>>>>
>>>>>      if (!frobnicate(arg, errp))
>>>>>          ... recover ...
>>>>>      }
>>>>>
>>>>>    Can also lead to pointless creation of Error objects.
>>>>>
>>>>>    I consider this a design mistake.  Can we still fix it?  We have more
>>>>>    than 2000 void functions taking an Error ** parameter...
>>>>>
>>>>>    Transforming code that receives and checks for errors with Coccinelle
>>>>>    shouldn't be hard.  Transforming code that returns errors seems more
>>>>>    difficult.  We need to transform explicit and implicit return to
>>>>>    either return true or return false, depending on what we did to the
>>>>>    @errp parameter on the way to the return.  Hmm.
>>>> [...]
>>>>
>>>> To figure out what functions with an Error ** parameter return, I used
>>>> Coccinelle to find such function definitions and print the return types.
>>>> Summary of results:
>>>>
>>>>     2155 void
>>>>      873 signed integer
>>>>      494 pointer
>>>>      153 bool
>>>>       33 unsigned integer
>>>>        6 enum
>>>>     ---------------------
>>>>     3714 total
>>>>
>>>> I then used Coccinelle to find checked calls of void functions (passing
>>>> &error_fatal or &error_abort is not considered "checking" here).  These
>>>> calls become simpler if we make the functions return a useful value.  I
>>>> found a bit under 600 direct calls, and some 50 indirect calls.
>>>>
>>>> Most frequent direct calls:
>>>>
>>>>      127 object_property_set_bool
>>>>       27 qemu_opts_absorb_qdict
>>>>       16 visit_type_str
>>>>       14 visit_type_int
>>>>       10 visit_type_uint32
>>>>
>>>> Let's have a closer look at object_property_set() & friends.  Out of
>>>> almost 1000 calls, some 150 are checked.  While I'm sure many of the
>>>> unchecked calls can't actually fail, I am concerned some unchecked calls
>>>> can.
>>>>
>>>> If we adopt the convention to return a value that indicates success /
>>>> failure, we should consider converting object.h to it sooner rather than
>>>> later.
>>>>
>>>> Please understand these are rough numbers from quick & dirty scripts.
>>>
>>> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
>>> running into bugs.  So far:
>>>
>>>      [PATCH v2 for-5.1 0/9] qemu-option: Fix corner cases and clean up
>>>      [PATCH for-5.1 0/5] qobject: Minor spring cleaning
>>>      [PATCH v2 00/14] Miscellaneous error handling fixes
>>>      [PATCH 0/4] Subject: [PATCH 0/4] smbus: SPD fixes
>>>      [PATCH 0/3] fuzz: Probably there is a better way to do this
>>>      [PATCH v2 00/15] qapi: Spring cleaning
>>>      [PATCH 00/11] More miscellaneous error handling fixes
>>>
>>> I got another one coming for QOM and qdev before I can post the
>>> conversion.
>>>
>>> Vladimir, since the conversion will mess with error_propagate(), I'd
>>> like to get it in before your auto-propagation work.
>>>
>>
>> OK, just let me know when to regenerate the series, it's not hard.
>>
>
> Hi! Is all that merged? Should I resend now?

I ran into many bugs and fell into a few rabbit holes.  I'm busy
finishing and flushing the patches.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-05-15  4:28         ` Markus Armbruster
@ 2020-07-03  7:38           ` Markus Armbruster
  2020-07-03  9:07             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2020-07-03  7:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: Peter Maydell, Philippe Mathieu-Daudé, qemu-devel

Markus Armbruster <armbru@redhat.com> writes:

> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>
>> 28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:
>>> 27.04.2020 18:36, Markus Armbruster wrote:
>>>> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
>>>> running into bugs.  So far:
[...]
>>>> I got another one coming for QOM and qdev before I can post the
>>>> conversion.
>>>>
>>>> Vladimir, since the conversion will mess with error_propagate(), I'd
>>>> like to get it in before your auto-propagation work.
>>>>
>>>
>>> OK, just let me know when to regenerate the series, it's not hard.
>>>
>>
>> Hi! Is all that merged? Should I resend now?
>
> I ran into many bugs and fell into a few rabbit holes.  I'm busy
> finishing and flushing the patches.

All merged except for the final series "[PATCH v2 00/44] Less clumsy
error checking".  v2 has a lot of change within the series, but in
aggregate it's really close to v1.  This makes be optimistic it can
serve as a base for your auto-propagation work.  To get it into 5.1, we
need a respin, a re-review, and a pull request.  Time is awfully short.
Sorry for taking so long!  If you want to try, I can give it priority on
my side.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-07-03  7:38           ` Markus Armbruster
@ 2020-07-03  9:07             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 41+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2020-07-03  9:07 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Peter Maydell, Philippe Mathieu-Daudé, qemu-devel

03.07.2020 10:38, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
>> Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> writes:
>>
>>> 28.04.2020 08:20, Vladimir Sementsov-Ogievskiy wrote:
>>>> 27.04.2020 18:36, Markus Armbruster wrote:
>>>>> FYI, I'm working on converting QemuOpts, QAPI visitors and QOM.  I keep
>>>>> running into bugs.  So far:
> [...]
>>>>> I got another one coming for QOM and qdev before I can post the
>>>>> conversion.
>>>>>
>>>>> Vladimir, since the conversion will mess with error_propagate(), I'd
>>>>> like to get it in before your auto-propagation work.
>>>>>
>>>>
>>>> OK, just let me know when to regenerate the series, it's not hard.
>>>>
>>>
>>> Hi! Is all that merged? Should I resend now?
>>
>> I ran into many bugs and fell into a few rabbit holes.  I'm busy
>> finishing and flushing the patches.
> 
> All merged except for the final series "[PATCH v2 00/44] Less clumsy
> error checking".  v2 has a lot of change within the series, but in
> aggregate it's really close to v1.  This makes be optimistic it can
> serve as a base for your auto-propagation work.  To get it into 5.1, we
> need a respin, a re-review, and a pull request.  Time is awfully short.
> Sorry for taking so long!  If you want to try, I can give it priority on
> my side.
> 

Of course let's try)


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: Questionable aspects of QEMU Error's design
  2020-04-04  7:59 ` Markus Armbruster
  2020-04-04 10:59   ` Markus Armbruster
  2020-04-27 15:36   ` Markus Armbruster
@ 2020-07-03 12:21   ` Markus Armbruster
  2 siblings, 0 replies; 41+ messages in thread
From: Markus Armbruster @ 2020-07-03 12:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Vladimir Sementsov-Ogievskiy, Philippe Mathieu-Daudé

Markus Armbruster <armbru@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> QEMU's Error was patterned after GLib's GError.  Differences include:
> [...]
>> * Return value conventions
>>
>>   Common: non-void functions return a distinct error value on failure
>>   when such a value can be defined.  Patterns:
>>
>>   - Functions returning non-null pointers on success return null pointer
>>     on failure.
>>
>>   - Functions returning non-negative integers on success return a
>>     negative error code on failure.
>>
>>   Different: GLib discourages void functions, because these lead to
>>   awkward error checking code.  We have tons of them, and tons of
>>   awkward error checking code:
>>
>>     Error *err = NULL;
>>     frobnicate(arg, &err);
>>     if (err) {
>>         ... recover ...
>>         error_propagate(errp, err);
>>     }
>>
>>   instead of
>>
>>     if (!frobnicate(arg, errp))
>>         ... recover ...
>>     }
>>
>>   Can also lead to pointless creation of Error objects.
>>
>>   I consider this a design mistake.  Can we still fix it?  We have more
>>   than 2000 void functions taking an Error ** parameter...
>>
>>   Transforming code that receives and checks for errors with Coccinelle
>>   shouldn't be hard.  Transforming code that returns errors seems more
>>   difficult.  We need to transform explicit and implicit return to
>>   either return true or return false, depending on what we did to the
>>   @errp parameter on the way to the return.  Hmm.
> [...]
>
> To figure out what functions with an Error ** parameter return, I used
> Coccinelle to find such function definitions and print the return types.
> Summary of results:
>
>    2155 void
>     873 signed integer
>     494 pointer
>     153 bool
>      33 unsigned integer
>       6 enum
>    ---------------------
>    3714 total

With my "[PATCH v2 00/44] Less clumsy error checking" applied, I now count

     1946 void
      879 signed integer
      484 pointer
      301 bool
       33 unsigned integer
        3 GuestFsfreezeStatus
        1 gnutls_x509_crt_t
        1 QCryptoCipherAlgorithm
        1 COLOMessage
        1 BlockdevDetectZeroesOptions
     ---------------------
     3650 total

About 7% complete for function definitions.

> I then used Coccinelle to find checked calls of void functions (passing
> &error_fatal or &error_abort is not considered "checking" here).  These
> calls become simpler if we make the functions return a useful value.  I
> found a bit under 600 direct calls, and some 50 indirect calls.

Different method this time: I count any direct function call that takes
&err other than &error_abort, &error_fatal, and whose value, if any, is
not used.

Current master: 1050

With my "[PATCH v2 00/44] Less clumsy error checking" applied: 649

About 38% complete for function calls.

> Most frequent direct calls:
>
>     127 object_property_set_bool
>      27 qemu_opts_absorb_qdict
>      16 visit_type_str
>      14 visit_type_int
>      10 visit_type_uint32

Top scorers master:

    151 sysbus_realize()
     34 qemu_opts_absorb_qdict()
     29 visit_type_int()
     24 visit_type_str()
     23 cpu_exec_realizefn()
     19 visit_type_size()
     16 qdev_realize()
     14 visit_type_bool()
     12 visit_type_uint32()
     11 visit_type_int32()
     11 object_property_set_bool()
     10 object_property_set_uint()
     10 object_property_set_int()
    +420 functions with fewer than 10 calls

Top scorers with my patches applied:

     23 cpu_exec_realizefn()
     15 visit_type_int()
     10 visit_type_size()
    +387 functions with fewer than 10 calls

Looks like this is going to be a long slog.

With functions into buckets by same name prefix up to the first '_':

     63 visit
     57 qmp
     33 bdrv
     29 cpu
     26 xen
     25 memory
   +113 buckets with fewer than 25 calls

[...]
>
> Please understand these are rough numbers from quick & dirty scripts.

Still are.



^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2020-07-03 12:22 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01  9:02 Questionable aspects of QEMU Error's design Markus Armbruster
2020-04-01 12:10 ` Vladimir Sementsov-Ogievskiy
2020-04-01 12:14   ` Vladimir Sementsov-Ogievskiy
2020-04-01 14:01   ` Alex Bennée
2020-04-01 15:49     ` Markus Armbruster
2020-04-01 15:05   ` Markus Armbruster
2020-04-01 12:44 ` Daniel P. Berrangé
2020-04-01 12:47   ` Vladimir Sementsov-Ogievskiy
2020-04-01 15:34   ` Markus Armbruster
2020-04-01 20:15 ` Peter Maydell
2020-04-02  5:31   ` Vladimir Sementsov-Ogievskiy
2020-04-02  9:36     ` BALATON Zoltan
2020-04-02 14:11       ` Vladimir Sementsov-Ogievskiy
2020-04-02 14:34         ` Markus Armbruster
2020-04-02 15:28           ` BALATON Zoltan
2020-04-03  7:09             ` Markus Armbruster
2020-04-02  5:54   ` Markus Armbruster
2020-04-02  6:11     ` Vladimir Sementsov-Ogievskiy
2020-04-02  8:11       ` Peter Maydell
2020-04-02  8:49         ` Daniel P. Berrangé
2020-04-02  8:55         ` Markus Armbruster
2020-04-02 14:35           ` Vladimir Sementsov-Ogievskiy
2020-04-02 15:06             ` Markus Armbruster
2020-04-02 17:17               ` Vladimir Sementsov-Ogievskiy
2020-04-03  7:48                 ` Markus Armbruster
2020-04-02 18:57           ` Paolo Bonzini
2020-04-02  8:47     ` Daniel P. Berrangé
2020-04-02  9:19       ` Alex Bennée
2020-04-02 14:33     ` Eric Blake
2020-04-04  7:59 ` Markus Armbruster
2020-04-04 10:59   ` Markus Armbruster
2020-04-06 14:05     ` Eduardo Habkost
2020-04-06 14:38       ` Eduardo Habkost
2020-04-06 14:10     ` Daniel P. Berrangé
2020-04-27 15:36   ` Markus Armbruster
2020-04-28  5:20     ` Vladimir Sementsov-Ogievskiy
2020-05-14  7:59       ` Vladimir Sementsov-Ogievskiy
2020-05-15  4:28         ` Markus Armbruster
2020-07-03  7:38           ` Markus Armbruster
2020-07-03  9:07             ` Vladimir Sementsov-Ogievskiy
2020-07-03 12:21   ` Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.