From: Kevin Wolf <kwolf@redhat.com>
To: Claudio Fontana <cfontana@suse.de>
Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	qemu-devel@nongnu.org, dinechin@redhat.com,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH v6 5/5] accel: abort if we fail to load the accelerator plugin
Date: Mon, 26 Sep 2022 12:56:39 +0200
Message-ID: <YzGFZ8A1InmPkNb/@redhat.com>
In-Reply-To: <0e32098d-4f76-8a40-3214-98fb58dd4192@suse.de>

On 26.09.2022 at 09:58, Claudio Fontana wrote:
> On 9/24/22 14:35, Philippe Mathieu-Daudé wrote:
> > On 24/9/22 01:21, Claudio Fontana wrote:
> >> if QEMU is configured with modules enabled, it is possible that the
> >> load of an accelerator module will fail.
> >> Abort in this case, relying on module_object_class_by_name to report
> >> the specific load error if any.
> >>
> >> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> >> ---
> >>   accel/accel-softmmu.c | 8 +++++++-
> >>   1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
> >> index 67276e4f52..9fa4849f2c 100644
> >> --- a/accel/accel-softmmu.c
> >> +++ b/accel/accel-softmmu.c
> >> @@ -66,6 +66,7 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>   {
> >>       const char *ac_name;
> >>       char *ops_name;
> >> +    ObjectClass *oc;
> >>       AccelOpsClass *ops;
> >>   
> >>       ac_name = object_class_get_name(OBJECT_CLASS(ac));
> >> @@ -73,8 +74,13 @@ void accel_init_ops_interfaces(AccelClass *ac)
> >>   
> >>       ops_name = g_strdup_printf("%s" ACCEL_OPS_SUFFIX, ac_name);
> >>       ops = ACCEL_OPS_CLASS(module_object_class_by_name(ops_name));
> >> +    oc = module_object_class_by_name(ops_name);
> >> +    if (!oc) {
> >> +        error_report("fatal: could not load module for type '%s'", ops_name);
> >> +        abort();
> > 
> > I still think a coredump won't help at all to figure the problem here: a 
> 
> I can change this from abort() to exit(1). The issue I am seeing is
> that when we fail to create or initialize objects, we usually seem to
> use abort(); the most prominent examples are in qom/object.c:
> 
> static TypeImpl *type_new(const TypeInfo *info)
> {
>     TypeImpl *ti = g_malloc0(sizeof(*ti));
>     int i;
> 
>     g_assert(info->name != NULL);
> 
>     if (type_table_lookup(info->name) != NULL) {
>         fprintf(stderr, "Registering `%s' which already exists\n", info->name);
>         abort();
>     }
> 
> ...
> 
> void object_initialize(void *data, size_t size, const char *typename)
> {
>     TypeImpl *type = type_get_by_name(typename);
> 
> #ifdef CONFIG_MODULES
>     if (!type) {
>         Error *local_err = NULL;
>         int rv = module_load_qom(typename, &local_err);
>         if (rv > 0) {
>             type = type_get_by_name(typename);
>         } else if (rv < 0) {
>             error_report_err(local_err);
>         }
>     }
> #endif
>     if (!type) {
>         error_report("missing object type '%s'", typename);
>         abort();
>     }
> 
>     object_initialize_with_type(data, size, type);
> }
> 
> 
> Do you propose to change only the abort() in accel_init_ops_interfaces
> to exit(1)?
> 
> Or the other case as well in the series? (ie hw/core/qdev.c qdev_new()
> ?)
> 
> Do you propose to change this consistently through the codebase
> including the object.c snippets above?

The difference with the snippets above (in the non-module case) is that
calling object_new() with a type that doesn't exist is a bug, i.e. a
programming error. Calling type_new() twice for the same TypeInfo, or for
two TypeInfos with the same name, is a programming error, too. abort() is
correct for situations that should never happen in a bug-free QEMU.

Not being able to load a module is generally not a bug in QEMU; it's an
error of external origin. So abort() is not appropriate here.
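
To illustrate the distinction, here is a minimal standalone sketch. It is
not QEMU code: demo_type_register() and demo_module_load() are hypothetical
names, and the fixed-size table only stands in for a real type registry.
The first function treats a duplicate registration as a programming error
and aborts; the second treats a file that cannot be opened as an external
error and hands the failure back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a type table; not the QOM implementation. */
#define DEMO_MAX_TYPES 8
static const char *demo_types[DEMO_MAX_TYPES];
static size_t demo_n_types;

/*
 * Registering the same name twice can only happen if the program itself
 * is wrong, so this is a programming error: abort() is appropriate.
 */
static void demo_type_register(const char *name)
{
    for (size_t i = 0; i < demo_n_types; i++) {
        if (strcmp(demo_types[i], name) == 0) {
            fprintf(stderr, "Registering '%s' which already exists\n", name);
            abort();
        }
    }
    assert(demo_n_types < DEMO_MAX_TYPES);
    demo_types[demo_n_types++] = name;
}

/*
 * A file that cannot be opened depends on the environment, not on the
 * correctness of the program: report it and return -1 so the caller can
 * decide whether to exit(1) or to fail only the current operation.
 */
static int demo_module_load(const char *path)
{
    FILE *f = fopen(path, "rb");

    if (!f) {
        fprintf(stderr, "could not load '%s': %s\n", path, strerror(errno));
        return -1;
    }
    fclose(f);
    return 0;
}
```

A caller at startup might reasonably turn the -1 into exit(1); a caller at
runtime would ideally just fail the request that needed the module.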

The CONFIG_MODULES code in object_initialize() is problematic because it
doesn't have a way to deal with an error case that can happen without a
bug in QEMU. Without changing the prototype of the function to actually
allow error returns (which I suspect might be a very invasive change),
maybe the best approach is to just make it a fatal error and leave the
code mostly as it is in current master:

#ifdef CONFIG_MODULES
    if (!type) {
        /* Assuming that module_load_qom_one() returns an error if the
         * module doesn't exist */
        module_load_qom_one(typename, &error_fatal);
        type = type_get_by_name(typename);
    }
#endif
    if (!type) {
        error_report("missing object type '%s'", typename);
        abort();
    }

    object_initialize_with_type(data, size, type);

This makes it print an error message and exit(). That is honestly not
great at runtime, because it doesn't properly shut down QEMU, let
alone just fail the operation and keep running, but it is at least
slightly better than abort().
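
For readers unfamiliar with the pattern, the following is a simplified
standalone sketch of how a "fatal" error destination can work. It is an
assumption-laden model, not QEMU's actual Error API: DemoError,
demo_error_fatal, demo_error_set() and demo_module_load() are all
hypothetical names. Passing the address of the sentinel means "print and
exit(1)"; passing the address of a local pointer lets the caller inspect
the error and keep running.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error object; much simpler than a real Error type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Sentinel: passing &demo_error_fatal requests "report and exit(1)". */
static DemoError *demo_error_fatal;

static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    char buf[128];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    if (errp == &demo_error_fatal) {
        /* External error the caller cannot handle: clean exit, no core dump. */
        fprintf(stderr, "%s\n", buf);
        exit(1);
    }
    if (errp) {
        DemoError *err = malloc(sizeof(*err));
        if (!err) {
            abort();
        }
        snprintf(err->msg, sizeof(err->msg), "%s", buf);
        *errp = err;
    }
}

/*
 * Pretend module loader: 'present' simulates whether the module file
 * exists. Returns 1 on success and -1 with *errp set on failure.
 */
static int demo_module_load(const char *name, int present, DemoError **errp)
{
    if (!present) {
        demo_error_set(errp, "module '%s' not found", name);
        return -1;
    }
    return 1;
}
```

With a local DemoError pointer the failure is recoverable, which is what a
function with an error-returning prototype would allow; with
&demo_error_fatal the same call site exits on failure and therefore never
needs to check the return value.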

> > module is missing, we know its name. Anyhow I don't mind much, and this
> > can be cleaned later, so:
> 
> Sure this could be fixed later with a series that tries to use exit()
> vs abort() consistently throughout the codebase when initializing and
> creating objects.

This should mean consistently distinguishing programming errors (i.e.
QEMU bugs) from errors of external origin.

Kevin


