All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] improve error handling for module load
@ 2022-09-08 18:30 Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 1/3] module: removed unused function argument "mayfail" Claudio Fontana
                   ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-08 18:30 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé,
	Claudio Fontana

CHANGELOG:

v3 -> v4: (Richard)

* module_object_class_by_name: return NULL immediately on load error.
* audio_driver_lookup: same.
* bdrv_find_format: same.

* dmg_open: handle optional compression submodules better: f.e.,
  if "dmg-bz2" is not present, continue but offer a warning.
  If "dmg-bz2" load fails with error, error out and return.

* module_load_dso: add newline to error_append_hint.

v2 -> v3:

* take the file existence check outside of module_load_file,
  rename module_load_file to module_load_dso, will be called only on
  an existing file. This will simplify the return value. (Richard)

* move exported function documentation into header files (Richard)

v1 -> v2:

* do not treat the display help text any differently and do report
  module load _errors_. If the module does not exist (ENOENT, ENOTDIR),
  no error will be produced.

DESCRIPTION:

while investigating a permission issue in accel, where accel-tcg-x86_64.so
was not accessible, I noticed that no errors were produced regarding the
module load failure.

This series attempts to improve module_load_one and module_load_qom_one
to handle the error cases better and produce some errors.

Patch 1 is already reviewed and is about removing an unused existing
argument "mayfail" from the call stack.

Patch 2 is the real meat, and that one I would say is RFC.
Will follow up with comments on the specific questions I have.

Patch 3 finally adds a simple check in accel/, aborting if a module
is not found, but relying on the existing error report from
module_load_qom_one.

Claudio Fontana (3):
  module: removed unused function argument "mayfail"
  module: add Error arguments to module_load_one and module_load_qom_one
  accel: abort if we fail to load the accelerator plugin

 accel/accel-softmmu.c |   8 ++-
 audio/audio.c         |   9 ++-
 block.c               |  15 ++++-
 block/dmg.c           |  18 +++++-
 hw/core/qdev.c        |  10 ++-
 include/qemu/module.h |  38 +++++++++--
 qom/object.c          |  18 +++++-
 softmmu/qtest.c       |   6 +-
 ui/console.c          |  18 +++++-
 util/module.c         | 142 ++++++++++++++++++++++++------------------
 10 files changed, 201 insertions(+), 81 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v4 1/3] module: removed unused function argument "mayfail"
  2022-09-08 18:30 [PATCH v4 0/3] improve error handling for module load Claudio Fontana
@ 2022-09-08 18:30 ` Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 3/3] accel: abort if we fail to load the accelerator plugin Claudio Fontana
  2 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-08 18:30 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé,
	Claudio Fontana

mayfail is always passed as false for every invocation throughout the program.
It controls whether to printf or not to printf an error on
g_module_open failure.

Remove this unused argument.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 include/qemu/module.h |  8 ++++----
 softmmu/qtest.c       |  2 +-
 util/module.c         | 20 +++++++++-----------
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/include/qemu/module.h b/include/qemu/module.h
index bd73607104..8c012bbe03 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -61,15 +61,15 @@ typedef enum {
 #define fuzz_target_init(function) module_init(function, \
                                                MODULE_INIT_FUZZ_TARGET)
 #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
-#define block_module_load_one(lib) module_load_one("block-", lib, false)
-#define ui_module_load_one(lib) module_load_one("ui-", lib, false)
-#define audio_module_load_one(lib) module_load_one("audio-", lib, false)
+#define block_module_load_one(lib) module_load_one("block-", lib)
+#define ui_module_load_one(lib) module_load_one("ui-", lib)
+#define audio_module_load_one(lib) module_load_one("audio-", lib)
 
 void register_module_init(void (*fn)(void), module_init_type type);
 void register_dso_module_init(void (*fn)(void), module_init_type type);
 
 void module_call_init(module_init_type type);
-bool module_load_one(const char *prefix, const char *lib_name, bool mayfail);
+bool module_load_one(const char *prefix, const char *lib_name);
 void module_load_qom_one(const char *type);
 void module_load_qom_all(void);
 void module_allow_arch(const char *arch);
diff --git a/softmmu/qtest.c b/softmmu/qtest.c
index f8acef2628..76eb7bac56 100644
--- a/softmmu/qtest.c
+++ b/softmmu/qtest.c
@@ -756,7 +756,7 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
         g_assert(words[1] && words[2]);
 
         qtest_send_prefix(chr);
-        if (module_load_one(words[1], words[2], false)) {
+        if (module_load_one(words[1], words[2])) {
             qtest_sendf(chr, "OK\n");
         } else {
             qtest_sendf(chr, "FAIL\n");
diff --git a/util/module.c b/util/module.c
index 8ddb0e18f5..8563edd626 100644
--- a/util/module.c
+++ b/util/module.c
@@ -144,7 +144,7 @@ static bool module_check_arch(const QemuModinfo *modinfo)
     return true;
 }
 
-static int module_load_file(const char *fname, bool mayfail, bool export_symbols)
+static int module_load_file(const char *fname, bool export_symbols)
 {
     GModule *g_module;
     void (*sym)(void);
@@ -172,10 +172,8 @@ static int module_load_file(const char *fname, bool mayfail, bool export_symbols
     }
     g_module = g_module_open(fname, flags);
     if (!g_module) {
-        if (!mayfail) {
-            fprintf(stderr, "Failed to open module: %s\n",
-                    g_module_error());
-        }
+        fprintf(stderr, "Failed to open module: %s\n",
+                g_module_error());
         ret = -EINVAL;
         goto out;
     }
@@ -208,7 +206,7 @@ out:
 }
 #endif
 
-bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
+bool module_load_one(const char *prefix, const char *lib_name)
 {
     bool success = false;
 
@@ -256,7 +254,7 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
             if (strcmp(modinfo->name, module_name) == 0) {
                 /* we depend on other module(s) */
                 for (sl = modinfo->deps; *sl != NULL; sl++) {
-                    module_load_one("", *sl, false);
+                    module_load_one("", *sl);
                 }
             } else {
                 for (sl = modinfo->deps; *sl != NULL; sl++) {
@@ -287,7 +285,7 @@ bool module_load_one(const char *prefix, const char *lib_name, bool mayfail)
     for (i = 0; i < n_dirs; i++) {
         fname = g_strdup_printf("%s/%s%s",
                 dirs[i], module_name, CONFIG_HOST_DSOSUF);
-        ret = module_load_file(fname, mayfail, export_symbols);
+        ret = module_load_file(fname, export_symbols);
         g_free(fname);
         fname = NULL;
         /* Try loading until loaded a module file */
@@ -333,7 +331,7 @@ void module_load_qom_one(const char *type)
         }
         for (sl = modinfo->objs; *sl != NULL; sl++) {
             if (strcmp(type, *sl) == 0) {
-                module_load_one("", modinfo->name, false);
+                module_load_one("", modinfo->name);
             }
         }
     }
@@ -354,7 +352,7 @@ void module_load_qom_all(void)
         if (!module_check_arch(modinfo)) {
             continue;
         }
-        module_load_one("", modinfo->name, false);
+        module_load_one("", modinfo->name);
     }
     module_loaded_qom_all = true;
 }
@@ -370,7 +368,7 @@ void qemu_load_module_for_opts(const char *group)
         }
         for (sl = modinfo->opts; *sl != NULL; sl++) {
             if (strcmp(group, *sl) == 0) {
-                module_load_one("", modinfo->name, false);
+                module_load_one("", modinfo->name);
             }
         }
     }
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-08 18:30 [PATCH v4 0/3] improve error handling for module load Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 1/3] module: removed unused function argument "mayfail" Claudio Fontana
@ 2022-09-08 18:30 ` Claudio Fontana
  2022-09-15  8:43   ` Claudio Fontana
                     ` (2 more replies)
  2022-09-08 18:30 ` [PATCH v4 3/3] accel: abort if we fail to load the accelerator plugin Claudio Fontana
  2 siblings, 3 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-08 18:30 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé,
	Claudio Fontana

improve error handling during module load, by changing:

bool module_load_one(const char *prefix, const char *lib_name);
void module_load_qom_one(const char *type);

to:

bool module_load_one(const char *prefix, const char *name, Error **errp);
bool module_load_qom_one(const char *type, Error **errp);

module_load_qom_one has been introduced in:

commit 28457744c345 ("module: qom module support"), which built on top of
module_load_one, but discarded the bool return value. Restore it.

Adapt all callers to emit errors, or ignore them, or fail hard,
as appropriate in each context.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
---
 audio/audio.c         |   9 ++-
 block.c               |  15 ++++-
 block/dmg.c           |  18 +++++-
 hw/core/qdev.c        |  10 ++-
 include/qemu/module.h |  38 ++++++++++--
 qom/object.c          |  18 +++++-
 softmmu/qtest.c       |   6 +-
 ui/console.c          |  18 +++++-
 util/module.c         | 140 ++++++++++++++++++++++++------------------
 9 files changed, 194 insertions(+), 78 deletions(-)

diff --git a/audio/audio.c b/audio/audio.c
index 76b8735b44..cff7464c07 100644
--- a/audio/audio.c
+++ b/audio/audio.c
@@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
 audio_driver *audio_driver_lookup(const char *name)
 {
     struct audio_driver *d;
+    Error *local_err = NULL;
 
     QLIST_FOREACH(d, &audio_drivers, next) {
         if (strcmp(name, d->name) == 0) {
@@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
         }
     }
 
-    audio_module_load_one(name);
+    if (!audio_module_load_one(name, &local_err)) {
+        if (local_err) {
+            error_report_err(local_err);
+        }
+        return NULL;
+    }
+
     QLIST_FOREACH(d, &audio_drivers, next) {
         if (strcmp(name, d->name) == 0) {
             return d;
diff --git a/block.c b/block.c
index bc85f46eed..8b610c6d95 100644
--- a/block.c
+++ b/block.c
@@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
     /* The driver isn't registered, maybe we need to load a module */
     for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
         if (!strcmp(block_driver_modules[i].format_name, format_name)) {
-            block_module_load_one(block_driver_modules[i].library_name);
+            Error *local_err = NULL;
+            if (!block_module_load_one(block_driver_modules[i].library_name,
+                                       &local_err)) {
+                if (local_err) {
+                    error_report_err(local_err);
+                }
+                return NULL;
+            }
             break;
         }
     }
@@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
     for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
         if (block_driver_modules[i].protocol_name &&
             !strcmp(block_driver_modules[i].protocol_name, protocol)) {
-            block_module_load_one(block_driver_modules[i].library_name);
+            Error *local_err = NULL;
+            if (!block_module_load_one(block_driver_modules[i].library_name,
+                                       &local_err) && local_err) {
+                error_report_err(local_err);
+            }
             break;
         }
     }
diff --git a/block/dmg.c b/block/dmg.c
index 98db18d82a..11d184d39c 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
     uint64_t plist_xml_offset, plist_xml_length;
     int64_t offset;
     int ret;
+    Error *local_err = NULL;
 
     ret = bdrv_apply_auto_read_only(bs, NULL, errp);
     if (ret < 0) {
@@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
-    block_module_load_one("dmg-bz2");
-    block_module_load_one("dmg-lzfse");
+    if (!block_module_load_one("dmg-bz2", &local_err)) {
+        if (local_err) {
+            error_report_err(local_err);
+            return -EINVAL;
+        }
+        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
+    }
+    local_err = NULL;
+    if (!block_module_load_one("dmg-lzfse", &local_err)) {
+        if (local_err) {
+            error_report_err(local_err);
+            return -EINVAL;
+        }
+        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
+    }
 
     s->n_chunks = 0;
     s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 0806d8fcaa..5902c59c94 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
 DeviceState *qdev_new(const char *name)
 {
     if (!object_class_by_name(name)) {
-        module_load_qom_one(name);
+        Error *local_err = NULL;
+        if (!module_load_qom_one(name, &local_err)) {
+            if (local_err) {
+                error_report_err(local_err);
+            } else {
+                error_report("could not find a module for type '%s'", name);
+            }
+            abort();
+        }
     }
     return DEVICE(object_new(name));
 }
diff --git a/include/qemu/module.h b/include/qemu/module.h
index 8c012bbe03..78d4c4de96 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -61,16 +61,44 @@ typedef enum {
 #define fuzz_target_init(function) module_init(function, \
                                                MODULE_INIT_FUZZ_TARGET)
 #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
-#define block_module_load_one(lib) module_load_one("block-", lib)
-#define ui_module_load_one(lib) module_load_one("ui-", lib)
-#define audio_module_load_one(lib) module_load_one("audio-", lib)
+#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
+#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
+#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
 
 void register_module_init(void (*fn)(void), module_init_type type);
 void register_dso_module_init(void (*fn)(void), module_init_type type);
 
 void module_call_init(module_init_type type);
-bool module_load_one(const char *prefix, const char *lib_name);
-void module_load_qom_one(const char *type);
+
+/*
+ * module_load_one: attempt to load a module from a set of directories
+ *
+ * directories searched are:
+ * - getenv("QEMU_MODULE_DIR")
+ * - get_relocated_path(CONFIG_QEMU_MODDIR);
+ * - /var/run/qemu/${version_dir}
+ *
+ * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
+ * name:           name of the module
+ * errp:           error to set in case the module is found, but load failed.
+ *
+ * Return value:   true on success (found and loaded);
+ *                 if module if found, but load failed, errp will be set.
+ *                 if module is not found, errp will not be set.
+ */
+bool module_load_one(const char *prefix, const char *name, Error **errp);
+
+/*
+ * module_load_qom_one: attempt to load a module to provide a QOM type
+ *
+ * type:           the type to be provided
+ * errp:           error to set.
+ *
+ * Return value:   true on success (found and loaded), false on failure.
+ *                 If a module is simply not found for the type,
+ *                 errp will not be set.
+ */
+bool module_load_qom_one(const char *type, Error **errp);
 void module_load_qom_all(void);
 void module_allow_arch(const char *arch);
 
diff --git a/qom/object.c b/qom/object.c
index d34608558e..493e5ce231 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -526,8 +526,14 @@ void object_initialize(void *data, size_t size, const char *typename)
 
 #ifdef CONFIG_MODULES
     if (!type) {
-        module_load_qom_one(typename);
-        type = type_get_by_name(typename);
+        Error *local_err = NULL;
+        if (!module_load_qom_one(typename, &local_err)) {
+            if (local_err) {
+                error_report_err(local_err);
+            }
+        } else {
+            type = type_get_by_name(typename);
+        }
     }
 #endif
     if (!type) {
@@ -1033,7 +1039,13 @@ ObjectClass *module_object_class_by_name(const char *typename)
     oc = object_class_by_name(typename);
 #ifdef CONFIG_MODULES
     if (!oc) {
-        module_load_qom_one(typename);
+        Error *local_err = NULL;
+        if (!module_load_qom_one(typename, &local_err)) {
+            if (local_err) {
+                error_report_err(local_err);
+            }
+            return NULL;
+        }
         oc = object_class_by_name(typename);
     }
 #endif
diff --git a/softmmu/qtest.c b/softmmu/qtest.c
index 76eb7bac56..bb83c7aae9 100644
--- a/softmmu/qtest.c
+++ b/softmmu/qtest.c
@@ -753,12 +753,16 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
         qtest_sendf(chr, "OK %"PRIi64"\n",
                     (int64_t)qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL));
     } else if (strcmp(words[0], "module_load") == 0) {
+        Error *local_err = NULL;
         g_assert(words[1] && words[2]);
 
         qtest_send_prefix(chr);
-        if (module_load_one(words[1], words[2])) {
+        if (module_load_one(words[1], words[2], &local_err)) {
             qtest_sendf(chr, "OK\n");
         } else {
+            if (local_err) {
+                error_report_err(local_err);
+            }
             qtest_sendf(chr, "FAIL\n");
         }
     } else if (qtest_enabled() && strcmp(words[0], "clock_set") == 0) {
diff --git a/ui/console.c b/ui/console.c
index 765892f84f..34dd206167 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2632,7 +2632,11 @@ bool qemu_display_find_default(DisplayOptions *opts)
 
     for (i = 0; i < (int)ARRAY_SIZE(prio); i++) {
         if (dpys[prio[i]] == NULL) {
-            ui_module_load_one(DisplayType_str(prio[i]));
+            Error *local_err = NULL;
+            if (!ui_module_load_one(DisplayType_str(prio[i]), &local_err)
+                && local_err) {
+                error_report_err(local_err);
+            }
         }
         if (dpys[prio[i]] == NULL) {
             continue;
@@ -2650,7 +2654,11 @@ void qemu_display_early_init(DisplayOptions *opts)
         return;
     }
     if (dpys[opts->type] == NULL) {
-        ui_module_load_one(DisplayType_str(opts->type));
+        Error *local_err = NULL;
+        if (!ui_module_load_one(DisplayType_str(opts->type), &local_err)
+            && local_err) {
+            error_report_err(local_err);
+        }
     }
     if (dpys[opts->type] == NULL) {
         error_report("Display '%s' is not available.",
@@ -2680,7 +2688,11 @@ void qemu_display_help(void)
     printf("none\n");
     for (idx = DISPLAY_TYPE_NONE; idx < DISPLAY_TYPE__MAX; idx++) {
         if (!dpys[idx]) {
-            ui_module_load_one(DisplayType_str(idx));
+            Error *local_err = NULL;
+            if (!ui_module_load_one(DisplayType_str(idx), &local_err)
+                && local_err) {
+                error_report_err(local_err);
+            }
         }
         if (dpys[idx]) {
             printf("%s\n",  DisplayType_str(dpys[idx]->type));
diff --git a/util/module.c b/util/module.c
index 8563edd626..90ea8adf7d 100644
--- a/util/module.c
+++ b/util/module.c
@@ -21,6 +21,7 @@
 #include "qemu/module.h"
 #include "qemu/cutils.h"
 #include "qemu/config-file.h"
+#include "qapi/error.h"
 #ifdef CONFIG_MODULE_UPGRADES
 #include "qemu-version.h"
 #endif
@@ -144,25 +145,22 @@ static bool module_check_arch(const QemuModinfo *modinfo)
     return true;
 }
 
-static int module_load_file(const char *fname, bool export_symbols)
+/*
+ * module_load_dso: attempt to load an existing dso file
+ *
+ * fname:          full pathname of the file to load
+ * export_symbols: if true, add the symbols to the global name space
+ * errp:           error to set.
+ *
+ * Return value:   true on success, false on error, and errp will be set.
+ */
+static bool module_load_dso(const char *fname, bool export_symbols,
+                            Error **errp)
 {
     GModule *g_module;
     void (*sym)(void);
-    const char *dsosuf = CONFIG_HOST_DSOSUF;
-    int len = strlen(fname);
-    int suf_len = strlen(dsosuf);
     ModuleEntry *e, *next;
-    int ret, flags;
-
-    if (len <= suf_len || strcmp(&fname[len - suf_len], dsosuf)) {
-        /* wrong suffix */
-        ret = -EINVAL;
-        goto out;
-    }
-    if (access(fname, F_OK)) {
-        ret = -ENOENT;
-        goto out;
-    }
+    int flags;
 
     assert(QTAILQ_EMPTY(&dso_init_list));
 
@@ -172,46 +170,38 @@ static int module_load_file(const char *fname, bool export_symbols)
     }
     g_module = g_module_open(fname, flags);
     if (!g_module) {
-        fprintf(stderr, "Failed to open module: %s\n",
-                g_module_error());
-        ret = -EINVAL;
-        goto out;
+        error_setg(errp, "failed to open module: %s", g_module_error());
+        return false;
     }
     if (!g_module_symbol(g_module, DSO_STAMP_FUN_STR, (gpointer *)&sym)) {
-        fprintf(stderr, "Failed to initialize module: %s\n",
-                fname);
-        /* Print some info if this is a QEMU module (but from different build),
-         * this will make debugging user problems easier. */
+        error_setg(errp, "failed to initialize module: %s", fname);
+        /*
+         * Print some info if this is a QEMU module (but from different build),
+         * this will make debugging user problems easier.
+         */
         if (g_module_symbol(g_module, "qemu_module_dummy", (gpointer *)&sym)) {
-            fprintf(stderr,
-                    "Note: only modules from the same build can be loaded.\n");
+            error_append_hint(errp,
+                "Only modules from the same build can be loaded.\n");
         }
         g_module_close(g_module);
-        ret = -EINVAL;
-    } else {
-        QTAILQ_FOREACH(e, &dso_init_list, node) {
-            e->init();
-            register_module_init(e->init, e->type);
-        }
-        ret = 0;
+        return false;
     }
 
+    QTAILQ_FOREACH(e, &dso_init_list, node) {
+        e->init();
+        register_module_init(e->init, e->type);
+    }
     trace_module_load_module(fname);
     QTAILQ_FOREACH_SAFE(e, &dso_init_list, node, next) {
         QTAILQ_REMOVE(&dso_init_list, e, node);
         g_free(e);
     }
-out:
-    return ret;
+    return true;
 }
-#endif
 
-bool module_load_one(const char *prefix, const char *lib_name)
+bool module_load_one(const char *prefix, const char *name, Error **errp)
 {
     bool success = false;
-
-#ifdef CONFIG_MODULES
-    char *fname = NULL;
 #ifdef CONFIG_MODULE_UPGRADES
     char *version_dir;
 #endif
@@ -219,14 +209,13 @@ bool module_load_one(const char *prefix, const char *lib_name)
     char *dirs[5];
     char *module_name;
     int i = 0, n_dirs = 0;
-    int ret;
     bool export_symbols = false;
     static GHashTable *loaded_modules;
     const QemuModinfo *modinfo;
     const char **sl;
 
     if (!g_module_supported()) {
-        fprintf(stderr, "Module is not supported by system.\n");
+        error_setg(errp, "%s", "this platform does not support GLib modules");
         return false;
     }
 
@@ -234,7 +223,7 @@ bool module_load_one(const char *prefix, const char *lib_name)
         loaded_modules = g_hash_table_new(g_str_hash, g_str_equal);
     }
 
-    module_name = g_strdup_printf("%s%s", prefix, lib_name);
+    module_name = g_strdup_printf("%s%s", prefix, name);
 
     if (g_hash_table_contains(loaded_modules, module_name)) {
         g_free(module_name);
@@ -246,6 +235,8 @@ bool module_load_one(const char *prefix, const char *lib_name)
         if (modinfo->arch) {
             if (strcmp(modinfo->name, module_name) == 0) {
                 if (!module_check_arch(modinfo)) {
+                    error_setg(errp, "module arch does not match: "
+                        "expected '%s', got '%s'", module_arch, modinfo->arch);
                     return false;
                 }
             }
@@ -254,7 +245,9 @@ bool module_load_one(const char *prefix, const char *lib_name)
             if (strcmp(modinfo->name, module_name) == 0) {
                 /* we depend on other module(s) */
                 for (sl = modinfo->deps; *sl != NULL; sl++) {
-                    module_load_one("", *sl);
+                    if (!(module_load_one("", *sl, errp))) {
+                        return false;
+                    }
                 }
             } else {
                 for (sl = modinfo->deps; *sl != NULL; sl++) {
@@ -283,16 +276,26 @@ bool module_load_one(const char *prefix, const char *lib_name)
     assert(n_dirs <= ARRAY_SIZE(dirs));
 
     for (i = 0; i < n_dirs; i++) {
-        fname = g_strdup_printf("%s/%s%s",
-                dirs[i], module_name, CONFIG_HOST_DSOSUF);
-        ret = module_load_file(fname, export_symbols);
-        g_free(fname);
-        fname = NULL;
-        /* Try loading until loaded a module file */
-        if (!ret) {
-            success = true;
+        char *fname = g_strdup_printf("%s/%s%s",
+                                      dirs[i], module_name, CONFIG_HOST_DSOSUF);
+        int ret = access(fname, F_OK);
+        if (ret != 0) {
+            if (errno == ENOENT || errno == ENOTDIR) {
+                /*
+                 * if we don't find the module in this dir, try the next one.
+                 * If we don't find it in any dir, that can be fine too: user
+                 * did not install the module. We will return false in this
+                 * case, with no error set.
+                 */
+                continue;
+            }
+            /* most common is EACCES here */
+            error_setg_errno(errp, errno, "error trying to access %s", fname);
             break;
         }
+        success = module_load_dso(fname, export_symbols, errp);
+        g_free(fname);
+        break;
     }
 
     if (!success) {
@@ -304,21 +307,20 @@ bool module_load_one(const char *prefix, const char *lib_name)
         g_free(dirs[i]);
     }
 
-#endif
     return success;
 }
 
-#ifdef CONFIG_MODULES
-
 static bool module_loaded_qom_all;
 
-void module_load_qom_one(const char *type)
+bool module_load_qom_one(const char *type, Error **errp)
 {
+    bool found = false;
     const QemuModinfo *modinfo;
     const char **sl;
 
     if (!type) {
-        return;
+        error_setg(errp, "%s", "type is NULL");
+        return false;
     }
 
     trace_module_lookup_object_type(type);
@@ -331,15 +333,26 @@ void module_load_qom_one(const char *type)
         }
         for (sl = modinfo->objs; *sl != NULL; sl++) {
             if (strcmp(type, *sl) == 0) {
-                module_load_one("", modinfo->name);
+                if (found) {
+                    error_setg(errp, "multiple modules providing '%s'", type);
+                    found = false;
+                    break;
+                }
+                found = module_load_one("", modinfo->name, errp);
+                if (!found) {
+                    /* errp optionally set in module_load_one */
+                    break;
+                }
             }
         }
     }
+    return found;
 }
 
 void module_load_qom_all(void)
 {
     const QemuModinfo *modinfo;
+    Error *local_err = NULL;
 
     if (module_loaded_qom_all) {
         return;
@@ -352,7 +365,9 @@ void module_load_qom_all(void)
         if (!module_check_arch(modinfo)) {
             continue;
         }
-        module_load_one("", modinfo->name);
+        if (!module_load_one("", modinfo->name, &local_err) && local_err) {
+            error_report_err(local_err);
+        }
     }
     module_loaded_qom_all = true;
 }
@@ -368,7 +383,11 @@ void qemu_load_module_for_opts(const char *group)
         }
         for (sl = modinfo->opts; *sl != NULL; sl++) {
             if (strcmp(group, *sl) == 0) {
-                module_load_one("", modinfo->name);
+                Error *local_err = NULL;
+                if (!module_load_one("", modinfo->name, &local_err)
+                    && local_err) {
+                    error_report_err(local_err);
+                }
             }
         }
     }
@@ -378,7 +397,8 @@ void qemu_load_module_for_opts(const char *group)
 
 void module_allow_arch(const char *arch) {}
 void qemu_load_module_for_opts(const char *group) {}
-void module_load_qom_one(const char *type) {}
+bool module_load_one(const char *prefix, const char *name, Error **errp) { return true; }
+bool module_load_qom_one(const char *type, Error **errp) { return true; }
 void module_load_qom_all(void) {}
 
 #endif
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v4 3/3] accel: abort if we fail to load the accelerator plugin
  2022-09-08 18:30 [PATCH v4 0/3] improve error handling for module load Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 1/3] module: removed unused function argument "mayfail" Claudio Fontana
  2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
@ 2022-09-08 18:30 ` Claudio Fontana
  2 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-08 18:30 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé,
	Claudio Fontana

if QEMU is configured with modules enabled, it is possible that the
load of an accelerator module will fail.
Abort in this case, relying on module_object_class_by_name to report
the specific load error if any.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/accel-softmmu.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/accel/accel-softmmu.c b/accel/accel-softmmu.c
index 67276e4f52..9fa4849f2c 100644
--- a/accel/accel-softmmu.c
+++ b/accel/accel-softmmu.c
@@ -66,6 +66,7 @@ void accel_init_ops_interfaces(AccelClass *ac)
 {
     const char *ac_name;
     char *ops_name;
+    ObjectClass *oc;
     AccelOpsClass *ops;
 
     ac_name = object_class_get_name(OBJECT_CLASS(ac));
@@ -73,8 +74,13 @@ void accel_init_ops_interfaces(AccelClass *ac)
 
     ops_name = g_strdup_printf("%s" ACCEL_OPS_SUFFIX, ac_name);
     ops = ACCEL_OPS_CLASS(module_object_class_by_name(ops_name));
+    oc = module_object_class_by_name(ops_name);
+    if (!oc) {
+        error_report("fatal: could not load module for type '%s'", ops_name);
+        abort();
+    }
     g_free(ops_name);
-
+    ops = ACCEL_OPS_CLASS(oc);
     /*
      * all accelerators need to define ops, providing at least a mandatory
      * non-NULL create_vcpu_thread operation.
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
@ 2022-09-15  8:43   ` Claudio Fontana
  2022-09-16  8:13   ` Richard Henderson
  2022-09-16  9:27   ` Markus Armbruster
  2 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-15  8:43 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé

On 9/8/22 20:30, Claudio Fontana wrote:
> improve error handling during module load, by changing:
> 
> bool module_load_one(const char *prefix, const char *lib_name);
> void module_load_qom_one(const char *type);
> 
> to:
> 
> bool module_load_one(const char *prefix, const char *name, Error **errp);
> bool module_load_qom_one(const char *type, Error **errp);
> 
> module_load_qom_one has been introduced in:
> 
> commit 28457744c345 ("module: qom module support"), which built on top of
> module_load_one, but discarded the bool return value. Restore it.
> 
> Adapt all callers to emit errors, or ignore them, or fail hard,
> as appropriate in each context.
> 
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  audio/audio.c         |   9 ++-
>  block.c               |  15 ++++-
>  block/dmg.c           |  18 +++++-
>  hw/core/qdev.c        |  10 ++-
>  include/qemu/module.h |  38 ++++++++++--
>  qom/object.c          |  18 +++++-
>  softmmu/qtest.c       |   6 +-
>  ui/console.c          |  18 +++++-
>  util/module.c         | 140 ++++++++++++++++++++++++------------------
>  9 files changed, 194 insertions(+), 78 deletions(-)
> 
> diff --git a/audio/audio.c b/audio/audio.c
> index 76b8735b44..cff7464c07 100644
> --- a/audio/audio.c
> +++ b/audio/audio.c
> @@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
>  audio_driver *audio_driver_lookup(const char *name)
>  {
>      struct audio_driver *d;
> +    Error *local_err = NULL;
>  
>      QLIST_FOREACH(d, &audio_drivers, next) {
>          if (strcmp(name, d->name) == 0) {
> @@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
>          }
>      }
>  
> -    audio_module_load_one(name);
> +    if (!audio_module_load_one(name, &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +        }
> +        return NULL;
> +    }
> +
>      QLIST_FOREACH(d, &audio_drivers, next) {
>          if (strcmp(name, d->name) == 0) {
>              return d;
> diff --git a/block.c b/block.c
> index bc85f46eed..8b610c6d95 100644
> --- a/block.c
> +++ b/block.c
> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>      /* The driver isn't registered, maybe we need to load a module */
>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
> -            block_module_load_one(block_driver_modules[i].library_name);
> +            Error *local_err = NULL;
> +            if (!block_module_load_one(block_driver_modules[i].library_name,
> +                                       &local_err)) {
> +                if (local_err) {
> +                    error_report_err(local_err);
> +                }
> +                return NULL;
> +            }
>              break;
>          }
>      }
> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>          if (block_driver_modules[i].protocol_name &&
>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
> -            block_module_load_one(block_driver_modules[i].library_name);
> +            Error *local_err = NULL;
> +            if (!block_module_load_one(block_driver_modules[i].library_name,
> +                                       &local_err) && local_err) {
> +                error_report_err(local_err);
> +            }
>              break;
>          }
>      }
> diff --git a/block/dmg.c b/block/dmg.c
> index 98db18d82a..11d184d39c 100644
> --- a/block/dmg.c
> +++ b/block/dmg.c
> @@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>      uint64_t plist_xml_offset, plist_xml_length;
>      int64_t offset;
>      int ret;
> +    Error *local_err = NULL;
>  
>      ret = bdrv_apply_auto_read_only(bs, NULL, errp);
>      if (ret < 0) {
> @@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>          return -EINVAL;
>      }
>  
> -    block_module_load_one("dmg-bz2");
> -    block_module_load_one("dmg-lzfse");
> +    if (!block_module_load_one("dmg-bz2", &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
> +    }
> +    local_err = NULL;
> +    if (!block_module_load_one("dmg-lzfse", &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
> +    }

Hi Richard, Kevin, does this look ok now?

Thanks,

Claudio

>  
>      s->n_chunks = 0;
>      s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 0806d8fcaa..5902c59c94 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
>  DeviceState *qdev_new(const char *name)
>  {
>      if (!object_class_by_name(name)) {
> -        module_load_qom_one(name);
> +        Error *local_err = NULL;
> +        if (!module_load_qom_one(name, &local_err)) {
> +            if (local_err) {
> +                error_report_err(local_err);
> +            } else {
> +                error_report("could not find a module for type '%s'", name);
> +            }
> +            abort();
> +        }
>      }
>      return DEVICE(object_new(name));
>  }
> diff --git a/include/qemu/module.h b/include/qemu/module.h
> index 8c012bbe03..78d4c4de96 100644
> --- a/include/qemu/module.h
> +++ b/include/qemu/module.h
> @@ -61,16 +61,44 @@ typedef enum {
>  #define fuzz_target_init(function) module_init(function, \
>                                                 MODULE_INIT_FUZZ_TARGET)
>  #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
> -#define block_module_load_one(lib) module_load_one("block-", lib)
> -#define ui_module_load_one(lib) module_load_one("ui-", lib)
> -#define audio_module_load_one(lib) module_load_one("audio-", lib)
> +#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
> +#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
> +#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
>  
>  void register_module_init(void (*fn)(void), module_init_type type);
>  void register_dso_module_init(void (*fn)(void), module_init_type type);
>  
>  void module_call_init(module_init_type type);
> -bool module_load_one(const char *prefix, const char *lib_name);
> -void module_load_qom_one(const char *type);
> +
> +/*
> + * module_load_one: attempt to load a module from a set of directories
> + *
> + * directories searched are:
> + * - getenv("QEMU_MODULE_DIR")
> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
> + * - /var/run/qemu/${version_dir}
> + *
> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
> + * name:           name of the module
> + * errp:           error to set in case the module is found, but load failed.
> + *
> + * Return value:   true on success (found and loaded);
> + *                 if module if found, but load failed, errp will be set.
> + *                 if module is not found, errp will not be set.
> + */
> +bool module_load_one(const char *prefix, const char *name, Error **errp);
> +
> +/*
> + * module_load_qom_one: attempt to load a module to provide a QOM type
> + *
> + * type:           the type to be provided
> + * errp:           error to set.
> + *
> + * Return value:   true on success (found and loaded), false on failure.
> + *                 If a module is simply not found for the type,
> + *                 errp will not be set.
> + */
> +bool module_load_qom_one(const char *type, Error **errp);
>  void module_load_qom_all(void);
>  void module_allow_arch(const char *arch);
>  
> diff --git a/qom/object.c b/qom/object.c
> index d34608558e..493e5ce231 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -526,8 +526,14 @@ void object_initialize(void *data, size_t size, const char *typename)
>  
>  #ifdef CONFIG_MODULES
>      if (!type) {
> -        module_load_qom_one(typename);
> -        type = type_get_by_name(typename);
> +        Error *local_err = NULL;
> +        if (!module_load_qom_one(typename, &local_err)) {
> +            if (local_err) {
> +                error_report_err(local_err);
> +            }
> +        } else {
> +            type = type_get_by_name(typename);
> +        }
>      }
>  #endif
>      if (!type) {
> @@ -1033,7 +1039,13 @@ ObjectClass *module_object_class_by_name(const char *typename)
>      oc = object_class_by_name(typename);
>  #ifdef CONFIG_MODULES
>      if (!oc) {
> -        module_load_qom_one(typename);
> +        Error *local_err = NULL;
> +        if (!module_load_qom_one(typename, &local_err)) {
> +            if (local_err) {
> +                error_report_err(local_err);
> +            }
> +            return NULL;
> +        }
>          oc = object_class_by_name(typename);
>      }
>  #endif
> diff --git a/softmmu/qtest.c b/softmmu/qtest.c
> index 76eb7bac56..bb83c7aae9 100644
> --- a/softmmu/qtest.c
> +++ b/softmmu/qtest.c
> @@ -753,12 +753,16 @@ static void qtest_process_command(CharBackend *chr, gchar **words)
>          qtest_sendf(chr, "OK %"PRIi64"\n",
>                      (int64_t)qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL));
>      } else if (strcmp(words[0], "module_load") == 0) {
> +        Error *local_err = NULL;
>          g_assert(words[1] && words[2]);
>  
>          qtest_send_prefix(chr);
> -        if (module_load_one(words[1], words[2])) {
> +        if (module_load_one(words[1], words[2], &local_err)) {
>              qtest_sendf(chr, "OK\n");
>          } else {
> +            if (local_err) {
> +                error_report_err(local_err);
> +            }
>              qtest_sendf(chr, "FAIL\n");
>          }
>      } else if (qtest_enabled() && strcmp(words[0], "clock_set") == 0) {
> diff --git a/ui/console.c b/ui/console.c
> index 765892f84f..34dd206167 100644
> --- a/ui/console.c
> +++ b/ui/console.c
> @@ -2632,7 +2632,11 @@ bool qemu_display_find_default(DisplayOptions *opts)
>  
>      for (i = 0; i < (int)ARRAY_SIZE(prio); i++) {
>          if (dpys[prio[i]] == NULL) {
> -            ui_module_load_one(DisplayType_str(prio[i]));
> +            Error *local_err = NULL;
> +            if (!ui_module_load_one(DisplayType_str(prio[i]), &local_err)
> +                && local_err) {
> +                error_report_err(local_err);
> +            }
>          }
>          if (dpys[prio[i]] == NULL) {
>              continue;
> @@ -2650,7 +2654,11 @@ void qemu_display_early_init(DisplayOptions *opts)
>          return;
>      }
>      if (dpys[opts->type] == NULL) {
> -        ui_module_load_one(DisplayType_str(opts->type));
> +        Error *local_err = NULL;
> +        if (!ui_module_load_one(DisplayType_str(opts->type), &local_err)
> +            && local_err) {
> +            error_report_err(local_err);
> +        }
>      }
>      if (dpys[opts->type] == NULL) {
>          error_report("Display '%s' is not available.",
> @@ -2680,7 +2688,11 @@ void qemu_display_help(void)
>      printf("none\n");
>      for (idx = DISPLAY_TYPE_NONE; idx < DISPLAY_TYPE__MAX; idx++) {
>          if (!dpys[idx]) {
> -            ui_module_load_one(DisplayType_str(idx));
> +            Error *local_err = NULL;
> +            if (!ui_module_load_one(DisplayType_str(idx), &local_err)
> +                && local_err) {
> +                error_report_err(local_err);
> +            }
>          }
>          if (dpys[idx]) {
>              printf("%s\n",  DisplayType_str(dpys[idx]->type));
> diff --git a/util/module.c b/util/module.c
> index 8563edd626..90ea8adf7d 100644
> --- a/util/module.c
> +++ b/util/module.c
> @@ -21,6 +21,7 @@
>  #include "qemu/module.h"
>  #include "qemu/cutils.h"
>  #include "qemu/config-file.h"
> +#include "qapi/error.h"
>  #ifdef CONFIG_MODULE_UPGRADES
>  #include "qemu-version.h"
>  #endif
> @@ -144,25 +145,22 @@ static bool module_check_arch(const QemuModinfo *modinfo)
>      return true;
>  }
>  
> -static int module_load_file(const char *fname, bool export_symbols)
> +/*
> + * module_load_dso: attempt to load an existing dso file
> + *
> + * fname:          full pathname of the file to load
> + * export_symbols: if true, add the symbols to the global name space
> + * errp:           error to set.
> + *
> + * Return value:   true on success, false on error, and errp will be set.
> + */
> +static bool module_load_dso(const char *fname, bool export_symbols,
> +                            Error **errp)
>  {
>      GModule *g_module;
>      void (*sym)(void);
> -    const char *dsosuf = CONFIG_HOST_DSOSUF;
> -    int len = strlen(fname);
> -    int suf_len = strlen(dsosuf);
>      ModuleEntry *e, *next;
> -    int ret, flags;
> -
> -    if (len <= suf_len || strcmp(&fname[len - suf_len], dsosuf)) {
> -        /* wrong suffix */
> -        ret = -EINVAL;
> -        goto out;
> -    }
> -    if (access(fname, F_OK)) {
> -        ret = -ENOENT;
> -        goto out;
> -    }
> +    int flags;
>  
>      assert(QTAILQ_EMPTY(&dso_init_list));
>  
> @@ -172,46 +170,38 @@ static int module_load_file(const char *fname, bool export_symbols)
>      }
>      g_module = g_module_open(fname, flags);
>      if (!g_module) {
> -        fprintf(stderr, "Failed to open module: %s\n",
> -                g_module_error());
> -        ret = -EINVAL;
> -        goto out;
> +        error_setg(errp, "failed to open module: %s", g_module_error());
> +        return false;
>      }
>      if (!g_module_symbol(g_module, DSO_STAMP_FUN_STR, (gpointer *)&sym)) {
> -        fprintf(stderr, "Failed to initialize module: %s\n",
> -                fname);
> -        /* Print some info if this is a QEMU module (but from different build),
> -         * this will make debugging user problems easier. */
> +        error_setg(errp, "failed to initialize module: %s", fname);
> +        /*
> +         * Print some info if this is a QEMU module (but from different build),
> +         * this will make debugging user problems easier.
> +         */
>          if (g_module_symbol(g_module, "qemu_module_dummy", (gpointer *)&sym)) {
> -            fprintf(stderr,
> -                    "Note: only modules from the same build can be loaded.\n");
> +            error_append_hint(errp,
> +                "Only modules from the same build can be loaded.\n");
>          }
>          g_module_close(g_module);
> -        ret = -EINVAL;
> -    } else {
> -        QTAILQ_FOREACH(e, &dso_init_list, node) {
> -            e->init();
> -            register_module_init(e->init, e->type);
> -        }
> -        ret = 0;
> +        return false;
>      }
>  
> +    QTAILQ_FOREACH(e, &dso_init_list, node) {
> +        e->init();
> +        register_module_init(e->init, e->type);
> +    }
>      trace_module_load_module(fname);
>      QTAILQ_FOREACH_SAFE(e, &dso_init_list, node, next) {
>          QTAILQ_REMOVE(&dso_init_list, e, node);
>          g_free(e);
>      }
> -out:
> -    return ret;
> +    return true;
>  }
> -#endif
>  
> -bool module_load_one(const char *prefix, const char *lib_name)
> +bool module_load_one(const char *prefix, const char *name, Error **errp)
>  {
>      bool success = false;
> -
> -#ifdef CONFIG_MODULES
> -    char *fname = NULL;
>  #ifdef CONFIG_MODULE_UPGRADES
>      char *version_dir;
>  #endif
> @@ -219,14 +209,13 @@ bool module_load_one(const char *prefix, const char *lib_name)
>      char *dirs[5];
>      char *module_name;
>      int i = 0, n_dirs = 0;
> -    int ret;
>      bool export_symbols = false;
>      static GHashTable *loaded_modules;
>      const QemuModinfo *modinfo;
>      const char **sl;
>  
>      if (!g_module_supported()) {
> -        fprintf(stderr, "Module is not supported by system.\n");
> +        error_setg(errp, "%s", "this platform does not support GLib modules");
>          return false;
>      }
>  
> @@ -234,7 +223,7 @@ bool module_load_one(const char *prefix, const char *lib_name)
>          loaded_modules = g_hash_table_new(g_str_hash, g_str_equal);
>      }
>  
> -    module_name = g_strdup_printf("%s%s", prefix, lib_name);
> +    module_name = g_strdup_printf("%s%s", prefix, name);
>  
>      if (g_hash_table_contains(loaded_modules, module_name)) {
>          g_free(module_name);
> @@ -246,6 +235,8 @@ bool module_load_one(const char *prefix, const char *lib_name)
>          if (modinfo->arch) {
>              if (strcmp(modinfo->name, module_name) == 0) {
>                  if (!module_check_arch(modinfo)) {
> +                    error_setg(errp, "module arch does not match: "
> +                        "expected '%s', got '%s'", module_arch, modinfo->arch);
>                      return false;
>                  }
>              }
> @@ -254,7 +245,9 @@ bool module_load_one(const char *prefix, const char *lib_name)
>              if (strcmp(modinfo->name, module_name) == 0) {
>                  /* we depend on other module(s) */
>                  for (sl = modinfo->deps; *sl != NULL; sl++) {
> -                    module_load_one("", *sl);
> +                    if (!(module_load_one("", *sl, errp))) {
> +                        return false;
> +                    }
>                  }
>              } else {
>                  for (sl = modinfo->deps; *sl != NULL; sl++) {
> @@ -283,16 +276,26 @@ bool module_load_one(const char *prefix, const char *lib_name)
>      assert(n_dirs <= ARRAY_SIZE(dirs));
>  
>      for (i = 0; i < n_dirs; i++) {
> -        fname = g_strdup_printf("%s/%s%s",
> -                dirs[i], module_name, CONFIG_HOST_DSOSUF);
> -        ret = module_load_file(fname, export_symbols);
> -        g_free(fname);
> -        fname = NULL;
> -        /* Try loading until loaded a module file */
> -        if (!ret) {
> -            success = true;
> +        char *fname = g_strdup_printf("%s/%s%s",
> +                                      dirs[i], module_name, CONFIG_HOST_DSOSUF);
> +        int ret = access(fname, F_OK);
> +        if (ret != 0) {
> +            if (errno == ENOENT || errno == ENOTDIR) {
> +                /*
> +                 * if we don't find the module in this dir, try the next one.
> +                 * If we don't find it in any dir, that can be fine too: user
> +                 * did not install the module. We will return false in this
> +                 * case, with no error set.
> +                 */
> +                continue;
> +            }
> +            /* most common is EACCES here */
> +            error_setg_errno(errp, errno, "error trying to access %s", fname);
>              break;
>          }
> +        success = module_load_dso(fname, export_symbols, errp);
> +        g_free(fname);
> +        break;
>      }
>  
>      if (!success) {
> @@ -304,21 +307,20 @@ bool module_load_one(const char *prefix, const char *lib_name)
>          g_free(dirs[i]);
>      }
>  
> -#endif
>      return success;
>  }
>  
> -#ifdef CONFIG_MODULES
> -
>  static bool module_loaded_qom_all;
>  
> -void module_load_qom_one(const char *type)
> +bool module_load_qom_one(const char *type, Error **errp)
>  {
> +    bool found = false;
>      const QemuModinfo *modinfo;
>      const char **sl;
>  
>      if (!type) {
> -        return;
> +        error_setg(errp, "%s", "type is NULL");
> +        return false;
>      }
>  
>      trace_module_lookup_object_type(type);
> @@ -331,15 +333,26 @@ void module_load_qom_one(const char *type)
>          }
>          for (sl = modinfo->objs; *sl != NULL; sl++) {
>              if (strcmp(type, *sl) == 0) {
> -                module_load_one("", modinfo->name);
> +                if (found) {
> +                    error_setg(errp, "multiple modules providing '%s'", type);
> +                    found = false;
> +                    break;
> +                }
> +                found = module_load_one("", modinfo->name, errp);
> +                if (!found) {
> +                    /* errp optionally set in module_load_one */
> +                    break;
> +                }
>              }
>          }
>      }
> +    return found;
>  }
>  
>  void module_load_qom_all(void)
>  {
>      const QemuModinfo *modinfo;
> +    Error *local_err = NULL;
>  
>      if (module_loaded_qom_all) {
>          return;
> @@ -352,7 +365,9 @@ void module_load_qom_all(void)
>          if (!module_check_arch(modinfo)) {
>              continue;
>          }
> -        module_load_one("", modinfo->name);
> +        if (!module_load_one("", modinfo->name, &local_err) && local_err) {
> +            error_report_err(local_err);
> +        }
>      }
>      module_loaded_qom_all = true;
>  }
> @@ -368,7 +383,11 @@ void qemu_load_module_for_opts(const char *group)
>          }
>          for (sl = modinfo->opts; *sl != NULL; sl++) {
>              if (strcmp(group, *sl) == 0) {
> -                module_load_one("", modinfo->name);
> +                Error *local_err = NULL;
> +                if (!module_load_one("", modinfo->name, &local_err)
> +                    && local_err) {
> +                    error_report_err(local_err);
> +                }
>              }
>          }
>      }
> @@ -378,7 +397,8 @@ void qemu_load_module_for_opts(const char *group)
>  
>  void module_allow_arch(const char *arch) {}
>  void qemu_load_module_for_opts(const char *group) {}
> -void module_load_qom_one(const char *type) {}
> +bool module_load_one(const char *prefix, const char *name, Error **errp) { return true; }
> +bool module_load_qom_one(const char *type, Error **errp) { return true; }
>  void module_load_qom_all(void) {}
>  
>  #endif



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
  2022-09-15  8:43   ` Claudio Fontana
@ 2022-09-16  8:13   ` Richard Henderson
  2022-09-16  8:16     ` Claudio Fontana
  2022-09-16  9:27   ` Markus Armbruster
  2 siblings, 1 reply; 50+ messages in thread
From: Richard Henderson @ 2022-09-16  8:13 UTC (permalink / raw)
  To: Claudio Fontana, Paolo Bonzini, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé

On 9/8/22 20:30, Claudio Fontana wrote:
> improve error handling during module load, by changing:
> 
> bool module_load_one(const char *prefix, const char *lib_name);
> void module_load_qom_one(const char *type);
> 
> to:
> 
> bool module_load_one(const char *prefix, const char *name, Error **errp);
> bool module_load_qom_one(const char *type, Error **errp);
> 
> module_load_qom_one has been introduced in:
> 
> commit 28457744c345 ("module: qom module support"), which built on top of
> module_load_one, but discarded the bool return value. Restore it.
> 
> Adapt all callers to emit errors, or ignore them, or fail hard,
> as appropriate in each context.
> 
> Signed-off-by: Claudio Fontana<cfontana@suse.de>
> ---
>   audio/audio.c         |   9 ++-
>   block.c               |  15 ++++-
>   block/dmg.c           |  18 +++++-
>   hw/core/qdev.c        |  10 ++-
>   include/qemu/module.h |  38 ++++++++++--
>   qom/object.c          |  18 +++++-
>   softmmu/qtest.c       |   6 +-
>   ui/console.c          |  18 +++++-
>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>   9 files changed, 194 insertions(+), 78 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16  8:13   ` Richard Henderson
@ 2022-09-16  8:16     ` Claudio Fontana
  0 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-16  8:16 UTC (permalink / raw)
  To: Richard Henderson, Paolo Bonzini, Markus Armbruster, Kevin Wolf
  Cc: qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé

On 9/16/22 10:13, Richard Henderson wrote:
> On 9/8/22 20:30, Claudio Fontana wrote:
>> improve error handling during module load, by changing:
>>
>> bool module_load_one(const char *prefix, const char *lib_name);
>> void module_load_qom_one(const char *type);
>>
>> to:
>>
>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>> bool module_load_qom_one(const char *type, Error **errp);
>>
>> module_load_qom_one has been introduced in:
>>
>> commit 28457744c345 ("module: qom module support"), which built on top of
>> module_load_one, but discarded the bool return value. Restore it.
>>
>> Adapt all callers to emit errors, or ignore them, or fail hard,
>> as appropriate in each context.
>>
>> Signed-off-by: Claudio Fontana<cfontana@suse.de>
>> ---
>>   audio/audio.c         |   9 ++-
>>   block.c               |  15 ++++-
>>   block/dmg.c           |  18 +++++-
>>   hw/core/qdev.c        |  10 ++-
>>   include/qemu/module.h |  38 ++++++++++--
>>   qom/object.c          |  18 +++++-
>>   softmmu/qtest.c       |   6 +-
>>   ui/console.c          |  18 +++++-
>>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>>   9 files changed, 194 insertions(+), 78 deletions(-)
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> r~

Thanks for your review Richard.

Whose queue can I expect this series to get into?

Thanks again,

Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
  2022-09-15  8:43   ` Claudio Fontana
  2022-09-16  8:13   ` Richard Henderson
@ 2022-09-16  9:27   ` Markus Armbruster
  2022-09-16 10:48     ` Claudio Fontana
  2022-09-19 10:18     ` Philippe Mathieu-Daudé via
  2 siblings, 2 replies; 50+ messages in thread
From: Markus Armbruster @ 2022-09-16  9:27 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P . Berrangé, Philippe Mathieu-Daudé

Claudio Fontana <cfontana@suse.de> writes:

> improve error handling during module load, by changing:
>
> bool module_load_one(const char *prefix, const char *lib_name);
> void module_load_qom_one(const char *type);
>
> to:
>
> bool module_load_one(const char *prefix, const char *name, Error **errp);
> bool module_load_qom_one(const char *type, Error **errp);
>
> module_load_qom_one has been introduced in:
>
> commit 28457744c345 ("module: qom module support"), which built on top of
> module_load_one, but discarded the bool return value. Restore it.
>
> Adapt all callers to emit errors, or ignore them, or fail hard,
> as appropriate in each context.

How exactly does behavior change?  The commit message is mum on the
behavior before the patch, and vague on the behavior afterwards.

> Signed-off-by: Claudio Fontana <cfontana@suse.de>
> ---
>  audio/audio.c         |   9 ++-
>  block.c               |  15 ++++-
>  block/dmg.c           |  18 +++++-
>  hw/core/qdev.c        |  10 ++-
>  include/qemu/module.h |  38 ++++++++++--
>  qom/object.c          |  18 +++++-
>  softmmu/qtest.c       |   6 +-
>  ui/console.c          |  18 +++++-
>  util/module.c         | 140 ++++++++++++++++++++++++------------------
>  9 files changed, 194 insertions(+), 78 deletions(-)
>
> diff --git a/audio/audio.c b/audio/audio.c
> index 76b8735b44..cff7464c07 100644
> --- a/audio/audio.c
> +++ b/audio/audio.c
> @@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
>  audio_driver *audio_driver_lookup(const char *name)
>  {
>      struct audio_driver *d;
> +    Error *local_err = NULL;
>  
>      QLIST_FOREACH(d, &audio_drivers, next) {
>          if (strcmp(name, d->name) == 0) {
> @@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
>          }
>      }
>  
> -    audio_module_load_one(name);
> +    if (!audio_module_load_one(name, &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +        }
> +        return NULL;
> +    }
> +
>      QLIST_FOREACH(d, &audio_drivers, next) {
>          if (strcmp(name, d->name) == 0) {
>              return d;
> diff --git a/block.c b/block.c
> index bc85f46eed..8b610c6d95 100644
> --- a/block.c
> +++ b/block.c
> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>      /* The driver isn't registered, maybe we need to load a module */
>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
> -            block_module_load_one(block_driver_modules[i].library_name);
> +            Error *local_err = NULL;
> +            if (!block_module_load_one(block_driver_modules[i].library_name,
> +                                       &local_err)) {
> +                if (local_err) {
> +                    error_report_err(local_err);
> +                }
> +                return NULL;
> +            }
>              break;
>          }
>      }

Before the patch, bdrv_find_format() fails silently[*].

Afterwards, it reports an error on some failures, but not on others.
Sure this is what we want?

> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>          if (block_driver_modules[i].protocol_name &&
>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
> -            block_module_load_one(block_driver_modules[i].library_name);
> +            Error *local_err = NULL;
> +            if (!block_module_load_one(block_driver_modules[i].library_name,
> +                                       &local_err) && local_err) {

Break the line before && local_err, please, like you do elsewhere.

> +                error_report_err(local_err);
> +            }
>              break;
>          }
>      }

Uh-oh: error_report() or equivalent in a function with an Error **
parameter.  This is almost always wrong.  Shouldn't you pass the error
to the caller?

Please check all uses of your FOO_load_one() for this issue.

> diff --git a/block/dmg.c b/block/dmg.c
> index 98db18d82a..11d184d39c 100644
> --- a/block/dmg.c
> +++ b/block/dmg.c
> @@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>      uint64_t plist_xml_offset, plist_xml_length;
>      int64_t offset;
>      int ret;
> +    Error *local_err = NULL;
>  
>      ret = bdrv_apply_auto_read_only(bs, NULL, errp);
>      if (ret < 0) {
> @@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>          return -EINVAL;
>      }
>  
> -    block_module_load_one("dmg-bz2");
> -    block_module_load_one("dmg-lzfse");
> +    if (!block_module_load_one("dmg-bz2", &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
> +    }
> +    local_err = NULL;
> +    if (!block_module_load_one("dmg-lzfse", &local_err)) {
> +        if (local_err) {
> +            error_report_err(local_err);
> +            return -EINVAL;
> +        }
> +        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
> +    }
>  
>      s->n_chunks = 0;
>      s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 0806d8fcaa..5902c59c94 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
>  DeviceState *qdev_new(const char *name)
>  {
>      if (!object_class_by_name(name)) {
> -        module_load_qom_one(name);
> +        Error *local_err = NULL;
> +        if (!module_load_qom_one(name, &local_err)) {
> +            if (local_err) {
> +                error_report_err(local_err);
> +            } else {
> +                error_report("could not find a module for type '%s'", name);
> +            }
> +            abort();

Why is this a programming error?

> +        }
>      }
>      return DEVICE(object_new(name));
>  }
> diff --git a/include/qemu/module.h b/include/qemu/module.h
> index 8c012bbe03..78d4c4de96 100644
> --- a/include/qemu/module.h
> +++ b/include/qemu/module.h
> @@ -61,16 +61,44 @@ typedef enum {
>  #define fuzz_target_init(function) module_init(function, \
>                                                 MODULE_INIT_FUZZ_TARGET)
>  #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
> -#define block_module_load_one(lib) module_load_one("block-", lib)
> -#define ui_module_load_one(lib) module_load_one("ui-", lib)
> -#define audio_module_load_one(lib) module_load_one("audio-", lib)
> +#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
> +#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
> +#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
>  
>  void register_module_init(void (*fn)(void), module_init_type type);
>  void register_dso_module_init(void (*fn)(void), module_init_type type);
>  
>  void module_call_init(module_init_type type);
> -bool module_load_one(const char *prefix, const char *lib_name);
> -void module_load_qom_one(const char *type);
> +
> +/*
> + * module_load_one: attempt to load a module from a set of directories
> + *
> + * directories searched are:
> + * - getenv("QEMU_MODULE_DIR")
> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
> + * - /var/run/qemu/${version_dir}
> + *
> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
> + * name:           name of the module
> + * errp:           error to set in case the module is found, but load failed.
> + *
> + * Return value:   true on success (found and loaded);
> + *                 if module if found, but load failed, errp will be set.
> + *                 if module is not found, errp will not be set.

I understand you need to distingush two failure modes "found, but load
failed" and "not found".

Functions that set an error on some failures only tend to be awkward: in
addition to checking the return value for failure, you have to check
@errp for special failures.  This is particularly cumbersome when it
requires a @local_err and an error_propagate() just for that.  I
generally prefer to return an error code and always set an error.

That said, the patch doesn't look bad.  Perhaps it will be once the
issues I pointed out above have been addressed.  Let's not worry about
it right now.

> + */
> +bool module_load_one(const char *prefix, const char *name, Error **errp);
> +
> +/*
> + * module_load_qom_one: attempt to load a module to provide a QOM type
> + *
> + * type:           the type to be provided
> + * errp:           error to set.
> + *
> + * Return value:   true on success (found and loaded), false on failure.
> + *                 If a module is simply not found for the type,
> + *                 errp will not be set.
> + */
> +bool module_load_qom_one(const char *type, Error **errp);
>  void module_load_qom_all(void);
>  void module_allow_arch(const char *arch);
>  

[...]


[*] Except it prints "Module is not supported by system\n" to stderr
when !g_module_supported(), which doesn't look right to me.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16  9:27   ` Markus Armbruster
@ 2022-09-16 10:48     ` Claudio Fontana
  2022-09-16 14:26       ` Markus Armbruster
  2022-09-19 10:18     ` Philippe Mathieu-Daudé via
  1 sibling, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-16 10:48 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

On 9/16/22 11:27, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> improve error handling during module load, by changing:
>>
>> bool module_load_one(const char *prefix, const char *lib_name);
>> void module_load_qom_one(const char *type);
>>
>> to:
>>
>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>> bool module_load_qom_one(const char *type, Error **errp);
>>
>> module_load_qom_one has been introduced in:
>>
>> commit 28457744c345 ("module: qom module support"), which built on top of
>> module_load_one, but discarded the bool return value. Restore it.
>>
>> Adapt all callers to emit errors, or ignore them, or fail hard,
>> as appropriate in each context.
> 
> How exactly does behavior change?  The commit message is mum on the
> behavior before the patch, and vague on the behavior afterwards.

The current behavior is difficult to describe as it is basically devoid of any error handling for QOM modules.
What error handling facilities were potentially available before were rendered moot by commit 28457744c345.

the behavior changes in this way:

audio: when attempting to load an audio module, now actually report module load errors.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

block: when attempting to load a block module, now actually report module load errors.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

block/dmg: specifically for the dmg module, which contains submodules dmg-bz2 and dmg-lzfse, now also warn that if the submodules are not present, the corresponding decompression will not be available.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

qdev: when creating a new qdev Device object (DeviceState), actually report module load errors. If a module cannot be loaded to create that device, abort execution as intended.
QEMU users of qdev assume that qdev_new() returns non-NULL. There is a separate qdev_try_new() function whose only difference from qdev_new() is that it can return NULL rather than asserting.
(see include/hw/qdev-core.h :: qdev_try_new)

Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution (likely, just segfault).

qom/object.c : when initializing a QOM object, actually report module load errors.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

qom/object.c : when looking up an ObjectClass by name (module_object_class_by_name), actually report module load errors.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

qtest: when processing the "module_load" qtest command, if there is an actual error in the load of the module, report it in addition to sending a "FAIL" response.

console: when attempting to load a display module, now actually report module load errors.
Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.

util/module: provide a boolean return value and an Error parameter to module_load_qom_one and provide an Error parameter to module_load_one what was missing it.
Report an error using the Error facilities when module is present, but fails to be loaded due to an error occurring.
Also report an error if multiple modules are loadable and try to provide the same type.

Previous behavior:
difficult to describe, a really confused implementation.

The module_load_file part of the code seemed to assume being called separately, and was checking the caller arguments as though it would be a self-contained API,
rechecking the suffix of the string for a valid ".so" (CONFIG_HOST_DSOSUF), when it is called that way explicitly and only by module_load_one as a static function.

It was printing an error with fprintf specifically only in the case where g_module_open failed, and never for any other error condition.
It only printed this error if the boolean "mayfail" was set to false, but the argument was always passed as false, in all execution paths.
It was not reporting any error if multiple modules are present all providing the same type.

Let me know if you have further questions, what I described is a summary of all the changes contained. I can add to the commit message if necessary.

> 
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>> ---
>>  audio/audio.c         |   9 ++-
>>  block.c               |  15 ++++-
>>  block/dmg.c           |  18 +++++-
>>  hw/core/qdev.c        |  10 ++-
>>  include/qemu/module.h |  38 ++++++++++--
>>  qom/object.c          |  18 +++++-
>>  softmmu/qtest.c       |   6 +-
>>  ui/console.c          |  18 +++++-
>>  util/module.c         | 140 ++++++++++++++++++++++++------------------
>>  9 files changed, 194 insertions(+), 78 deletions(-)
>>
>> diff --git a/audio/audio.c b/audio/audio.c
>> index 76b8735b44..cff7464c07 100644
>> --- a/audio/audio.c
>> +++ b/audio/audio.c
>> @@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
>>  audio_driver *audio_driver_lookup(const char *name)
>>  {
>>      struct audio_driver *d;
>> +    Error *local_err = NULL;
>>  
>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>          if (strcmp(name, d->name) == 0) {
>> @@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
>>          }
>>      }
>>  
>> -    audio_module_load_one(name);
>> +    if (!audio_module_load_one(name, &local_err)) {
>> +        if (local_err) {
>> +            error_report_err(local_err);
>> +        }
>> +        return NULL;
>> +    }
>> +
>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>          if (strcmp(name, d->name) == 0) {
>>              return d;
>> diff --git a/block.c b/block.c
>> index bc85f46eed..8b610c6d95 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>      /* The driver isn't registered, maybe we need to load a module */
>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>> -            block_module_load_one(block_driver_modules[i].library_name);
>> +            Error *local_err = NULL;
>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>> +                                       &local_err)) {
>> +                if (local_err) {
>> +                    error_report_err(local_err);
>> +                }
>> +                return NULL;
>> +            }
>>              break;
>>          }
>>      }
> 
> Before the patch, bdrv_find_format() fails silently[*].
> 
> Afterwards, it reports an error on some failures, but not on others.
> Sure this is what we want?

Yes, I am sure. It only reports an error if an error exists.
When a module is not present, no error should be printed in this context.

> 
>> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>          if (block_driver_modules[i].protocol_name &&
>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>> -            block_module_load_one(block_driver_modules[i].library_name);
>> +            Error *local_err = NULL;
>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>> +                                       &local_err) && local_err) {
> 
> Break the line before && local_err, please, like you do elsewhere.

This is to avoid hitting the 80 chars on one line, as warned by checkpatch. Should I disregard checkpatch, is it not used anymore?

> 
>> +                error_report_err(local_err);
>> +            }
>>              break;
>>          }
>>      }
> 
> Uh-oh: error_report() or equivalent in a function with an Error **
> parameter.  This is almost always wrong.  Shouldn't you pass the error
> to the caller?

I guess this is the "almost" case. No I should not pass anything back.

The Error **errp parameter of bdrv_find_protocol, and the Error local_err parameter inside it are different errors.

The first is about whether a protocol has been found or not.
The second is about whether there was an error during the load of a module.

No, local_err should not be passed back to the caller in my view. The error passed back to the caller is errp and is already set correctly.

> 
> Please check all uses of your FOO_load_one() for this issue.

I have checked all instances multiple times. If there are issues please present them so I can address them.

> 
>> diff --git a/block/dmg.c b/block/dmg.c
>> index 98db18d82a..11d184d39c 100644
>> --- a/block/dmg.c
>> +++ b/block/dmg.c
>> @@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>      uint64_t plist_xml_offset, plist_xml_length;
>>      int64_t offset;
>>      int ret;
>> +    Error *local_err = NULL;
>>  
>>      ret = bdrv_apply_auto_read_only(bs, NULL, errp);
>>      if (ret < 0) {
>> @@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>          return -EINVAL;
>>      }
>>  
>> -    block_module_load_one("dmg-bz2");
>> -    block_module_load_one("dmg-lzfse");
>> +    if (!block_module_load_one("dmg-bz2", &local_err)) {
>> +        if (local_err) {
>> +            error_report_err(local_err);
>> +            return -EINVAL;
>> +        }
>> +        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
>> +    }
>> +    local_err = NULL;
>> +    if (!block_module_load_one("dmg-lzfse", &local_err)) {
>> +        if (local_err) {
>> +            error_report_err(local_err);
>> +            return -EINVAL;
>> +        }
>> +        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
>> +    }
>>  
>>      s->n_chunks = 0;
>>      s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> index 0806d8fcaa..5902c59c94 100644
>> --- a/hw/core/qdev.c
>> +++ b/hw/core/qdev.c
>> @@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
>>  DeviceState *qdev_new(const char *name)
>>  {
>>      if (!object_class_by_name(name)) {
>> -        module_load_qom_one(name);
>> +        Error *local_err = NULL;
>> +        if (!module_load_qom_one(name, &local_err)) {
>> +            if (local_err) {
>> +                error_report_err(local_err);
>> +            } else {
>> +                error_report("could not find a module for type '%s'", name);
>> +            }
>> +            abort();
> 
> Why is this a programming error?

As described before in your request for previous/after, this is a programming error because qdev_new is not expected to fail.
All QEMU users of qdev_new expect the API to return non-NULL and immediately use and dereference the pointer returned.
There is a separate API, qdev_try_new, for creating a device and allowing a NULL return value.

> 
>> +        }
>>      }
>>      return DEVICE(object_new(name));
>>  }
>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>> index 8c012bbe03..78d4c4de96 100644
>> --- a/include/qemu/module.h
>> +++ b/include/qemu/module.h
>> @@ -61,16 +61,44 @@ typedef enum {
>>  #define fuzz_target_init(function) module_init(function, \
>>                                                 MODULE_INIT_FUZZ_TARGET)
>>  #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
>> -#define block_module_load_one(lib) module_load_one("block-", lib)
>> -#define ui_module_load_one(lib) module_load_one("ui-", lib)
>> -#define audio_module_load_one(lib) module_load_one("audio-", lib)
>> +#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
>> +#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
>> +#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
>>  
>>  void register_module_init(void (*fn)(void), module_init_type type);
>>  void register_dso_module_init(void (*fn)(void), module_init_type type);
>>  
>>  void module_call_init(module_init_type type);
>> -bool module_load_one(const char *prefix, const char *lib_name);
>> -void module_load_qom_one(const char *type);
>> +
>> +/*
>> + * module_load_one: attempt to load a module from a set of directories
>> + *
>> + * directories searched are:
>> + * - getenv("QEMU_MODULE_DIR")
>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>> + * - /var/run/qemu/${version_dir}
>> + *
>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>> + * name:           name of the module
>> + * errp:           error to set in case the module is found, but load failed.
>> + *
>> + * Return value:   true on success (found and loaded);
>> + *                 if module if found, but load failed, errp will be set.
>> + *                 if module is not found, errp will not be set.
> 
> I understand you need to distingush two failure modes "found, but load
> failed" and "not found".

That is correct.

> 
> Functions that set an error on some failures only tend to be awkward: in
> addition to checking the return value for failure, you have to check
> @errp for special failures.  This is particularly cumbersome when it
> requires a @local_err and an error_propagate() just for that.  I
> generally prefer to return an error code and always set an error.
> 
> That said, the patch doesn't look bad.  Perhaps it will be once the
> issues I pointed out above have been addressed.  Let's not worry about
> it right now.
> 
>> + */
>> +bool module_load_one(const char *prefix, const char *name, Error **errp);
>> +
>> +/*
>> + * module_load_qom_one: attempt to load a module to provide a QOM type
>> + *
>> + * type:           the type to be provided
>> + * errp:           error to set.
>> + *
>> + * Return value:   true on success (found and loaded), false on failure.
>> + *                 If a module is simply not found for the type,
>> + *                 errp will not be set.
>> + */
>> +bool module_load_qom_one(const char *type, Error **errp);
>>  void module_load_qom_all(void);
>>  void module_allow_arch(const char *arch);
>>  
> 
> [...]
> 
> 
> [*] Except it prints "Module is not supported by system\n" to stderr
> when !g_module_supported(), which doesn't look right to me.
> 
I don't know what this comment relates to.

Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16 10:48     ` Claudio Fontana
@ 2022-09-16 14:26       ` Markus Armbruster
  2022-09-16 15:06         ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-16 14:26 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Markus Armbruster, Paolo Bonzini, Richard Henderson, Kevin Wolf,
	qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

Claudio Fontana <cfontana@suse.de> writes:

> On 9/16/22 11:27, Markus Armbruster wrote:
>> Claudio Fontana <cfontana@suse.de> writes:
>> 
>>> improve error handling during module load, by changing:
>>>
>>> bool module_load_one(const char *prefix, const char *lib_name);
>>> void module_load_qom_one(const char *type);
>>>
>>> to:
>>>
>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>> bool module_load_qom_one(const char *type, Error **errp);
>>>
>>> module_load_qom_one has been introduced in:
>>>
>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>> module_load_one, but discarded the bool return value. Restore it.
>>>
>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>> as appropriate in each context.
>> 
>> How exactly does behavior change?  The commit message is mum on the
>> behavior before the patch, and vague on the behavior afterwards.
>
> The current behavior is difficult to describe as it is basically devoid of any error handling for QOM modules.
> What error handling facilities were potentially available before were rendered moot by commit 28457744c345.

*Sigh*

> the behavior changes in this way:
>
> audio: when attempting to load an audio module, now actually report module load errors.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> block: when attempting to load a block module, now actually report module load errors.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> block/dmg: specifically for the dmg module, which contains submodules dmg-bz2 and dmg-lzfse, now also warn that if the submodules are not present, the corresponding decompression will not be available.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> qdev: when creating a new qdev Device object (DeviceState), actually report module load errors. If a module cannot be loaded to create that device, abort execution as intended.
> QEMU users of qdev assume that qdev_new() returns non-NULL. There is a separate qdev_try_new() function whose only difference from qdev_new() is that it can return NULL rather than asserting.
> (see include/hw/qdev-core.h :: qdev_try_new)
>
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution (likely, just segfault).
>
> qom/object.c : when initializing a QOM object, actually report module load errors.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> qom/object.c : when looking up an ObjectClass by name (module_object_class_by_name), actually report module load errors.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> qtest: when processing the "module_load" qtest command, if there is an actual error in the load of the module, report it in addition to sending a "FAIL" response.
>
> console: when attempting to load a display module, now actually report module load errors.
> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>
> util/module: provide a boolean return value and an Error parameter to module_load_qom_one and provide an Error parameter to module_load_one what was missing it.
> Report an error using the Error facilities when module is present, but fails to be loaded due to an error occurring.
> Also report an error if multiple modules are loadable and try to provide the same type.
>
> Previous behavior:
> difficult to describe, a really confused implementation.
>
> The module_load_file part of the code seemed to assume being called separately, and was checking the caller arguments as though it would be a self-contained API,
> rechecking the suffix of the string for a valid ".so" (CONFIG_HOST_DSOSUF), when it is called that way explicitly and only by module_load_one as a static function.
>
> It was printing an error with fprintf specifically only in the case where g_module_open failed, and never for any other error condition.
> It only printed this error if the boolean "mayfail" was set to false, but the argument was always passed as false, in all execution paths.
> It was not reporting any error if multiple modules are present all providing the same type.

What a mess.

> Let me know if you have further questions, what I described is a summary of all the changes contained. I can add to the commit message if necessary.

Yes, please.

>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>> ---
>>>  audio/audio.c         |   9 ++-
>>>  block.c               |  15 ++++-
>>>  block/dmg.c           |  18 +++++-
>>>  hw/core/qdev.c        |  10 ++-
>>>  include/qemu/module.h |  38 ++++++++++--
>>>  qom/object.c          |  18 +++++-
>>>  softmmu/qtest.c       |   6 +-
>>>  ui/console.c          |  18 +++++-
>>>  util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>  9 files changed, 194 insertions(+), 78 deletions(-)
>>>
>>> diff --git a/audio/audio.c b/audio/audio.c
>>> index 76b8735b44..cff7464c07 100644
>>> --- a/audio/audio.c
>>> +++ b/audio/audio.c
>>> @@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
>>>  audio_driver *audio_driver_lookup(const char *name)
>>>  {
>>>      struct audio_driver *d;
>>> +    Error *local_err = NULL;
>>>  
>>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>>          if (strcmp(name, d->name) == 0) {
>>> @@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
>>>          }
>>>      }
>>>  
>>> -    audio_module_load_one(name);
>>> +    if (!audio_module_load_one(name, &local_err)) {
>>> +        if (local_err) {
>>> +            error_report_err(local_err);
>>> +        }
>>> +        return NULL;
>>> +    }
>>> +
>>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>>          if (strcmp(name, d->name) == 0) {
>>>              return d;
>>> diff --git a/block.c b/block.c
>>> index bc85f46eed..8b610c6d95 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>>      /* The driver isn't registered, maybe we need to load a module */
>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>> +            Error *local_err = NULL;
>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>> +                                       &local_err)) {
>>> +                if (local_err) {
>>> +                    error_report_err(local_err);
>>> +                }
>>> +                return NULL;
>>> +            }
>>>              break;
>>>          }
>>>      }
>> 
>> Before the patch, bdrv_find_format() fails silently[*].
>> 
>> Afterwards, it reports an error on some failures, but not on others.
>> Sure this is what we want?
>
> Yes, I am sure. It only reports an error if an error exists.
> When a module is not present, no error should be printed in this context.

Callers commonly look like this (this one is in bdrv_fill_options()):

        drv = bdrv_find_format(drvname);
        if (!drv) {
            error_setg(errp, "Unknown driver '%s'", drvname);
            return -ENOENT;
        }

where @errp is a parameter.

The patch changes bdrv_find_format() to report some failures with
error_report_err().  This is effectively the same anti-pattern I pointed
out below, starting with "Uh-oh": use of error_report() or equivalent
within a function with an Error ** parameter.

If the "Unknown driver" Error object is ultimately passed to
error_report_err(), we report two errors if module loading fails, else
one.

Emitting two error messages for the same problem, a useful one and a
useless one, is bad UI, but it's arguably still better than just a
useless one.

But what if a caller wants to handle errors without reporting them?
This is a legitimate use of the Error API.  No go, we spam stderr
regardless.

What if a caller wants to report the error in some other way, say write
it to a log file?  No go, we still spam stderr.

The patch breaks on of the Error API's design maxims.  error.h's big
comment:

 * - Separation of concerns: the function is responsible for detecting
 *   errors and failing cleanly; handling the error is its caller's
 *   job.  [...]

The patch changes bdrv_find_options() to no longer satisfy this rule: it
doesn't leave handling entirely to the caller anymore.

You might argue that no existing caller does such things.  Fine.  Then
I'll ask you to document that bdrv_fill_options() may only be used in
certain ways, unlike other functions using the Error API.  Same for all
the other affected functions.  And then Kevin will probably NAK the
whole thing :)

>>> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>          if (block_driver_modules[i].protocol_name &&
>>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>> +            Error *local_err = NULL;
>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>> +                                       &local_err) && local_err) {
>> 
>> Break the line before && local_err, please, like you do elsewhere.
>
> This is to avoid hitting the 80 chars on one line, as warned by checkpatch. Should I disregard checkpatch, is it not used anymore?

Elsewhere, you format like this:

                 if (!block_module_load_one(block_driver_modules[i].library_name,
                                            &local_err)
                     && local_err) {

Better, isn't it?

>
>> 
>>> +                error_report_err(local_err);
>>> +            }
>>>              break;
>>>          }
>>>      }
>> 
>> Uh-oh: error_report() or equivalent in a function with an Error **
>> parameter.  This is almost always wrong.  Shouldn't you pass the error
>> to the caller?
>
> I guess this is the "almost" case. No I should not pass anything back.
>
> The Error **errp parameter of bdrv_find_protocol, and the Error local_err parameter inside it are different errors.
>
> The first is about whether a protocol has been found or not.
> The second is about whether there was an error during the load of a module.
>
> No, local_err should not be passed back to the caller in my view. The error passed back to the caller is errp and is already set correctly.

Part of bdrv_find_protocol()'s (unstated) contract is "either succeed
and return non-null, or fail, return null, and set an error."

This is pretty much ubiquitous among Error-using functions.  error.h's
big comment:

 * - On success, the function should not touch *errp.  On failure, it
 *   should set a new error, e.g. with error_setg(errp, ...), or
 *   propagate an existing one, e.g. with error_propagate(errp, ...).

Note there is no third case "except on some failures, don't set an
error".

Of course, bdrv_find_protocol() might be an exception, i.e. I still owe
you proof of my claim about its contract.  So let's examine callers:

* print_block_option_help() in qemu-img.c:

            proto_drv = bdrv_find_protocol(filename, true, &local_err);
            if (!proto_drv) {
                error_report_err(local_err);
                qemu_opts_free(create_opts);
                return 1;
            }

  If bdrv_find_protocol() returns null without setting @local_err,
  error_report_err() will crash.

  Two more instances elsewhere in qemu-img.c

* bdrv_create_file() in block.c:

        drv = bdrv_find_protocol(filename, true, errp);
        if (drv == NULL) {
            return -ENOENT;
        }

  If bdrv_find_protocol() returns null without setting @local_err, then
  bdrv_create_file() returns -ENOENT without setting an error.

  bdrv_create_file() is called by a bunch of BlockDriver
  .bdrv_co_create_opts() methods, e.g. raw_co_create_opts():

        return bdrv_create_file(filename, opts, errp);

  The method is called by bdrv_create_co_entry():

        ret = cco->drv->bdrv_co_create_opts(cco->drv,
                                            cco->filename, cco->opts, &local_err);
        error_propagate(&cco->err, local_err);
        cco->ret = ret;

       If bdrv_find_protocol() returns null without setting an error, then
       we set cco->ret = -ENOENT and cco->err = NULL.

       One of its users is bdrv_create():

        if (qemu_in_coroutine()) {
            /* Fast-path if already in coroutine context */
            bdrv_create_co_entry(&cco);
        } else {
            co = qemu_coroutine_create(bdrv_create_co_entry, &cco);
            qemu_coroutine_enter(co);
            while (cco.ret == NOT_DONE) {
                aio_poll(qemu_get_aio_context(), true);
            }
        }

        ret = cco.ret;
        if (ret < 0) {
            if (cco.err) {
                error_propagate(errp, cco.err);
            } else {
                error_setg_errno(errp, -ret, "Could not create image");
            }
        }

    out:
        g_free(cco.filename);
        return ret;

  If bdrv_find_protocol() returns null without setting an error, then
  bdrv_create() returns -ENOENT without setting an error.

  One of its callers is bdrv_append_temp_snapshot():

    ret = bdrv_create(&bdrv_qcow2, tmp_filename, opts, errp);
    qemu_opts_del(opts);
    if (ret < 0) {
        error_prepend(errp, "Could not create temporary overlay '%s': ",
                      tmp_filename);
        goto out;
    }

  Crash.

Adding failure modes that use @errp differently than existing ones is
playing with fire :)

>> Please check all uses of your FOO_load_one() for this issue.
>
> I have checked all instances multiple times. If there are issues please present them so I can address them.

I humbly suggest that instances in functions that set an error on
failures before this patch need re-checking.

>>> diff --git a/block/dmg.c b/block/dmg.c
>>> index 98db18d82a..11d184d39c 100644
>>> --- a/block/dmg.c
>>> +++ b/block/dmg.c
>>> @@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>>      uint64_t plist_xml_offset, plist_xml_length;
>>>      int64_t offset;
>>>      int ret;
>>> +    Error *local_err = NULL;
>>>  
>>>      ret = bdrv_apply_auto_read_only(bs, NULL, errp);
>>>      if (ret < 0) {
>>> @@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>>          return -EINVAL;
>>>      }
>>>  
>>> -    block_module_load_one("dmg-bz2");
>>> -    block_module_load_one("dmg-lzfse");
>>> +    if (!block_module_load_one("dmg-bz2", &local_err)) {
>>> +        if (local_err) {
>>> +            error_report_err(local_err);
>>> +            return -EINVAL;
>>> +        }
>>> +        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
>>> +    }
>>> +    local_err = NULL;
>>> +    if (!block_module_load_one("dmg-lzfse", &local_err)) {
>>> +        if (local_err) {
>>> +            error_report_err(local_err);
>>> +            return -EINVAL;
>>> +        }
>>> +        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
>>> +    }
>>>  
>>>      s->n_chunks = 0;
>>>      s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>> index 0806d8fcaa..5902c59c94 100644
>>> --- a/hw/core/qdev.c
>>> +++ b/hw/core/qdev.c
>>> @@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
>>>  DeviceState *qdev_new(const char *name)
>>>  {
>>>      if (!object_class_by_name(name)) {
>>> -        module_load_qom_one(name);
>>> +        Error *local_err = NULL;
>>> +        if (!module_load_qom_one(name, &local_err)) {
>>> +            if (local_err) {
>>> +                error_report_err(local_err);
>>> +            } else {
>>> +                error_report("could not find a module for type '%s'", name);
>>> +            }
>>> +            abort();
>> 
>> Why is this a programming error?
>
> As described before in your request for previous/after, this is a programming error because qdev_new is not expected to fail.
> All QEMU users of qdev_new expect the API to return non-NULL and immediately use and dereference the pointer returned.
> There is a separate API, qdev_try_new, for creating a device and allowing a NULL return value.

Well, then the programming error is "adding loadable device modules
broke qdev_new()".

This is qdev_new() before commit 7ab6e7fcce "qdev: device module
support":

    DeviceState *qdev_new(const char *name)
    {
        return DEVICE(object_new(name));
    }

Precondition: @name is a valid device type name.  If it isn't, we fail
an assertion.

If you got a @name that may not be valid, you must use qdev_try_new()
instead:

    DeviceState *qdev_try_new(const char *name)
    {
        if (!object_class_by_name(name)) {
            return NULL;
        }

        return DEVICE(object_new(name));
    }

The guard ensures we call object_new() only when the precondition holds.

Except it doesn't: when @name is a valid type name, but not a *device*
type name, we still crash.  Bug.

Commit 7ab6e7fcce changed qdev_new() to

    DeviceState *qdev_new(const char *name)
    {
        if (!object_class_by_name(name)) {
            module_load_qom_one(name);
        }
        return DEVICE(object_new(name));
    }

Now the (enforced!) precondition is @name is a valid device type name
and the device is either compiled in or it will succeed to load.

"Will succeed to load" is an inappropriate precondition.

Devices compiled as modules must be instantiated with qdev_try_new().
Loadable module support should be dropped from qdev_new().

Not your patch's fault, of course.  By the way, welcome to the swamp!
Do not mind the crocodiles, they just want to play.

>>> +        }
>>>      }
>>>      return DEVICE(object_new(name));
>>>  }
>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>> index 8c012bbe03..78d4c4de96 100644
>>> --- a/include/qemu/module.h
>>> +++ b/include/qemu/module.h
>>> @@ -61,16 +61,44 @@ typedef enum {
>>>  #define fuzz_target_init(function) module_init(function, \
>>>                                                 MODULE_INIT_FUZZ_TARGET)
>>>  #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
>>> -#define block_module_load_one(lib) module_load_one("block-", lib)
>>> -#define ui_module_load_one(lib) module_load_one("ui-", lib)
>>> -#define audio_module_load_one(lib) module_load_one("audio-", lib)
>>> +#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
>>> +#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
>>> +#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
>>>  
>>>  void register_module_init(void (*fn)(void), module_init_type type);
>>>  void register_dso_module_init(void (*fn)(void), module_init_type type);
>>>  
>>>  void module_call_init(module_init_type type);
>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>> -void module_load_qom_one(const char *type);
>>> +
>>> +/*
>>> + * module_load_one: attempt to load a module from a set of directories
>>> + *
>>> + * directories searched are:
>>> + * - getenv("QEMU_MODULE_DIR")
>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>> + * - /var/run/qemu/${version_dir}
>>> + *
>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>> + * name:           name of the module
>>> + * errp:           error to set in case the module is found, but load failed.
>>> + *
>>> + * Return value:   true on success (found and loaded);
>>> + *                 if module if found, but load failed, errp will be set.
>>> + *                 if module is not found, errp will not be set.
>> 
>> I understand you need to distingush two failure modes "found, but load
>> failed" and "not found".
>
> That is correct.
>
>> 
>> Functions that set an error on some failures only tend to be awkward: in
>> addition to checking the return value for failure, you have to check
>> @errp for special failures.  This is particularly cumbersome when it
>> requires a @local_err and an error_propagate() just for that.  I
>> generally prefer to return an error code and always set an error.
>> 
>> That said, the patch doesn't look bad.  Perhaps it will be once the
>> issues I pointed out above have been addressed.  Let's not worry about
>> it right now.
>> 
>>> + */
>>> +bool module_load_one(const char *prefix, const char *name, Error **errp);
>>> +
>>> +/*
>>> + * module_load_qom_one: attempt to load a module to provide a QOM type
>>> + *
>>> + * type:           the type to be provided
>>> + * errp:           error to set.
>>> + *
>>> + * Return value:   true on success (found and loaded), false on failure.
>>> + *                 If a module is simply not found for the type,
>>> + *                 errp will not be set.
>>> + */
>>> +bool module_load_qom_one(const char *type, Error **errp);
>>>  void module_load_qom_all(void);
>>>  void module_allow_arch(const char *arch);
>>>  
>> 
>> [...]
>> 
>> 
>> [*] Except it prints "Module is not supported by system\n" to stderr
>> when !g_module_supported(), which doesn't look right to me.
>> 
> I don't know what this comment relates to.

It's a footnote attached to "Before the patch, bdrv_find_format() fails
silently[*]" above.


I think the root of the problems with this patch is that you convert the
FOO_module_load_one() functions to the Error API without propagating the
error far enough.  It really needs to go all the way up.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16 14:26       ` Markus Armbruster
@ 2022-09-16 15:06         ` Claudio Fontana
  2022-09-19  8:17           ` Markus Armbruster
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-16 15:06 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

On 9/16/22 16:26, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> On 9/16/22 11:27, Markus Armbruster wrote:
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>>> improve error handling during module load, by changing:
>>>>
>>>> bool module_load_one(const char *prefix, const char *lib_name);
>>>> void module_load_qom_one(const char *type);
>>>>
>>>> to:
>>>>
>>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>>> bool module_load_qom_one(const char *type, Error **errp);
>>>>
>>>> module_load_qom_one has been introduced in:
>>>>
>>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>>> module_load_one, but discarded the bool return value. Restore it.
>>>>
>>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>>> as appropriate in each context.
>>>
>>> How exactly does behavior change?  The commit message is mum on the
>>> behavior before the patch, and vague on the behavior afterwards.
>>
>> The current behavior is difficult to describe as it is basically devoid of any error handling for QOM modules.
>> What error handling facilities were potentially available before were rendered moot by commit 28457744c345.
> 
> *Sigh*
> 
>> the behavior changes in this way:
>>
>> audio: when attempting to load an audio module, now actually report module load errors.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> block: when attempting to load a block module, now actually report module load errors.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> block/dmg: specifically for the dmg module, which contains submodules dmg-bz2 and dmg-lzfse, now also warn that if the submodules are not present, the corresponding decompression will not be available.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> qdev: when creating a new qdev Device object (DeviceState), actually report module load errors. If a module cannot be loaded to create that device, abort execution as intended.
>> QEMU users of qdev assume that qdev_new() returns non-NULL. There is a separate qdev_try_new() function whose only difference from qdev_new() is that it can return NULL rather than asserting.
>> (see include/hw/qdev-core.h :: qdev_try_new)
>>
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution (likely, just segfault).
>>
>> qom/object.c : when initializing a QOM object, actually report module load errors.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> qom/object.c : when looking up an ObjectClass by name (module_object_class_by_name), actually report module load errors.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> qtest: when processing the "module_load" qtest command, if there is an actual error in the load of the module, report it in addition to sending a "FAIL" response.
>>
>> console: when attempting to load a display module, now actually report module load errors.
>> Previous behavior: say nothing, and leave it to the user to figure out what is wrong by failures later on during execution.
>>
>> util/module: provide a boolean return value and an Error parameter to module_load_qom_one and provide an Error parameter to module_load_one what was missing it.
>> Report an error using the Error facilities when module is present, but fails to be loaded due to an error occurring.
>> Also report an error if multiple modules are loadable and try to provide the same type.
>>
>> Previous behavior:
>> difficult to describe, a really confused implementation.
>>
>> The module_load_file part of the code seemed to assume being called separately, and was checking the caller arguments as though it would be a self-contained API,
>> rechecking the suffix of the string for a valid ".so" (CONFIG_HOST_DSOSUF), when it is called that way explicitly and only by module_load_one as a static function.
>>
>> It was printing an error with fprintf specifically only in the case where g_module_open failed, and never for any other error condition.
>> It only printed this error if the boolean "mayfail" was set to false, but the argument was always passed as false, in all execution paths.
>> It was not reporting any error if multiple modules are present all providing the same type.
> 
> What a mess.
> 
>> Let me know if you have further questions, what I described is a summary of all the changes contained. I can add to the commit message if necessary.
> 
> Yes, please.

Will do.

> 
>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>> ---
>>>>  audio/audio.c         |   9 ++-
>>>>  block.c               |  15 ++++-
>>>>  block/dmg.c           |  18 +++++-
>>>>  hw/core/qdev.c        |  10 ++-
>>>>  include/qemu/module.h |  38 ++++++++++--
>>>>  qom/object.c          |  18 +++++-
>>>>  softmmu/qtest.c       |   6 +-
>>>>  ui/console.c          |  18 +++++-
>>>>  util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>>  9 files changed, 194 insertions(+), 78 deletions(-)
>>>>
>>>> diff --git a/audio/audio.c b/audio/audio.c
>>>> index 76b8735b44..cff7464c07 100644
>>>> --- a/audio/audio.c
>>>> +++ b/audio/audio.c
>>>> @@ -72,6 +72,7 @@ void audio_driver_register(audio_driver *drv)
>>>>  audio_driver *audio_driver_lookup(const char *name)
>>>>  {
>>>>      struct audio_driver *d;
>>>> +    Error *local_err = NULL;
>>>>  
>>>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>>>          if (strcmp(name, d->name) == 0) {
>>>> @@ -79,7 +80,13 @@ audio_driver *audio_driver_lookup(const char *name)
>>>>          }
>>>>      }
>>>>  
>>>> -    audio_module_load_one(name);
>>>> +    if (!audio_module_load_one(name, &local_err)) {
>>>> +        if (local_err) {
>>>> +            error_report_err(local_err);
>>>> +        }
>>>> +        return NULL;
>>>> +    }
>>>> +
>>>>      QLIST_FOREACH(d, &audio_drivers, next) {
>>>>          if (strcmp(name, d->name) == 0) {
>>>>              return d;
>>>> diff --git a/block.c b/block.c
>>>> index bc85f46eed..8b610c6d95 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>>>      /* The driver isn't registered, maybe we need to load a module */
>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>> +            Error *local_err = NULL;
>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>> +                                       &local_err)) {
>>>> +                if (local_err) {
>>>> +                    error_report_err(local_err);
>>>> +                }
>>>> +                return NULL;
>>>> +            }
>>>>              break;
>>>>          }
>>>>      }
>>>
>>> Before the patch, bdrv_find_format() fails silently[*].
>>>
>>> Afterwards, it reports an error on some failures, but not on others.
>>> Sure this is what we want?
>>
>> Yes, I am sure. It only reports an error if an error exists.
>> When a module is not present, no error should be printed in this context.
> 
> Callers commonly look like this (this one is in bdrv_fill_options()):
> 
>         drv = bdrv_find_format(drvname);
>         if (!drv) {
>             error_setg(errp, "Unknown driver '%s'", drvname);
>             return -ENOENT;
>         }
> 
> where @errp is a parameter.
> 
> The patch changes bdrv_find_format() to report some failures with
> error_report_err().  This is effectively the same anti-pattern I pointed
> out below, starting with "Uh-oh": use of error_report() or equivalent
> within a function with an Error ** parameter.
> 
> If the "Unknown driver" Error object is ultimately passed to
> error_report_err(), we report two errors if module loading fails, else
> one.
> 
> Emitting two error messages for the same problem, a useful one and a
> useless one, is bad UI, but it's arguably still better than just a
> useless one.

I'll see if it makes sense to merge them.
I wanted to avoid removing the existing Unknown driver / or Unknown Protocol errors
which people might be relying on.

> 
> But what if a caller wants to handle errors without reporting them?
> This is a legitimate use of the Error API.  No go, we spam stderr
> regardless.

I suppose the Error API can be directed to something else than stderr?

> 
> What if a caller wants to report the error in some other way, say write
> it to a log file?  No go, we still spam stderr.
> 
> The patch breaks on of the Error API's design maxims.  error.h's big
> comment:
> 
>  * - Separation of concerns: the function is responsible for detecting
>  *   errors and failing cleanly; handling the error is its caller's
>  *   job.  [...]
> 
> The patch changes bdrv_find_options() to no longer satisfy this rule: it
> doesn't leave handling entirely to the caller anymore.
> 
> You might argue that no existing caller does such things.  Fine.  Then
> I'll ask you to document that bdrv_fill_options() may only be used in
> certain ways, unlike other functions using the Error API.  Same for all
> the other affected functions.  And then Kevin will probably NAK the
> whole thing :)
> 
>>>> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>          if (block_driver_modules[i].protocol_name &&
>>>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>> +            Error *local_err = NULL;
>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>> +                                       &local_err) && local_err) {
>>>
>>> Break the line before && local_err, please, like you do elsewhere.
>>
>> This is to avoid hitting the 80 chars on one line, as warned by checkpatch. Should I disregard checkpatch, is it not used anymore?
> 
> Elsewhere, you format like this:
> 
>                  if (!block_module_load_one(block_driver_modules[i].library_name,
>                                             &local_err)
>                      && local_err) {
> 
> Better, isn't it?

I would not say so, I have no preference, but I can change it.

> 
>>
>>>
>>>> +                error_report_err(local_err);
>>>> +            }
>>>>              break;
>>>>          }
>>>>      }
>>>
>>> Uh-oh: error_report() or equivalent in a function with an Error **
>>> parameter.  This is almost always wrong.  Shouldn't you pass the error
>>> to the caller?
>>
>> I guess this is the "almost" case. No I should not pass anything back.
>>
>> The Error **errp parameter of bdrv_find_protocol, and the Error local_err parameter inside it are different errors.
>>
>> The first is about whether a protocol has been found or not.
>> The second is about whether there was an error during the load of a module.
>>
>> No, local_err should not be passed back to the caller in my view. The error passed back to the caller is errp and is already set correctly.
> 
> Part of bdrv_find_protocol()'s (unstated) contract is "either succeed
> and return non-null, or fail, return null, and set an error."
> 
> This is pretty much ubiquitous among Error-using functions.  error.h's
> big comment:
> 
>  * - On success, the function should not touch *errp.  On failure, it
>  *   should set a new error, e.g. with error_setg(errp, ...), or
>  *   propagate an existing one, e.g. with error_propagate(errp, ...).
> 
> Note there is no third case "except on some failures, don't set an
> error".
> 
> Of course, bdrv_find_protocol() might be an exception, i.e. I still owe
> you proof of my claim about its contract.  So let's examine callers:
> 
> * print_block_option_help() in qemu-img.c:
> 
>             proto_drv = bdrv_find_protocol(filename, true, &local_err);
>             if (!proto_drv) {
>                 error_report_err(local_err);
>                 qemu_opts_free(create_opts);
>                 return 1;
>             }
> 
>   If bdrv_find_protocol() returns null without setting @local_err,
>   error_report_err() will crash.

I don't see how this patch changes anything here, or why it relates to this code at all.

No, this patch does not change bdrv_find_protocol() unwritten claim about the Error parameter.

I think you should re-read the code honestly.

> 
>   Two more instances elsewhere in qemu-img.c
> 
> * bdrv_create_file() in block.c:
> 
>         drv = bdrv_find_protocol(filename, true, errp);
>         if (drv == NULL) {
>             return -ENOENT;
>         }
> 
>   If bdrv_find_protocol() returns null without setting @local_err, then
>   bdrv_create_file() returns -ENOENT without setting an error.

Again, I do not see how this is an argument pertinent to this patch.

> 
>   bdrv_create_file() is called by a bunch of BlockDriver
>   .bdrv_co_create_opts() methods, e.g. raw_co_create_opts():
> 
>         return bdrv_create_file(filename, opts, errp);
> 
>   The method is called by bdrv_create_co_entry():
> 
>         ret = cco->drv->bdrv_co_create_opts(cco->drv,
>                                             cco->filename, cco->opts, &local_err);
>         error_propagate(&cco->err, local_err);
>         cco->ret = ret;
> 
>        If bdrv_find_protocol() returns null without setting an error, then
>        we set cco->ret = -ENOENT and cco->err = NULL.
> 
>        One of its users is bdrv_create():
> 
>         if (qemu_in_coroutine()) {
>             /* Fast-path if already in coroutine context */
>             bdrv_create_co_entry(&cco);
>         } else {
>             co = qemu_coroutine_create(bdrv_create_co_entry, &cco);
>             qemu_coroutine_enter(co);
>             while (cco.ret == NOT_DONE) {
>                 aio_poll(qemu_get_aio_context(), true);
>             }
>         }
> 
>         ret = cco.ret;
>         if (ret < 0) {
>             if (cco.err) {
>                 error_propagate(errp, cco.err);
>             } else {
>                 error_setg_errno(errp, -ret, "Could not create image");
>             }
>         }
> 
>     out:
>         g_free(cco.filename);
>         return ret;
> 
>   If bdrv_find_protocol() returns null without setting an error, then
>   bdrv_create() returns -ENOENT without setting an error.
> 
>   One of its callers is bdrv_append_temp_snapshot():
> 
>     ret = bdrv_create(&bdrv_qcow2, tmp_filename, opts, errp);
>     qemu_opts_del(opts);
>     if (ret < 0) {
>         error_prepend(errp, "Could not create temporary overlay '%s': ",
>                       tmp_filename);
>         goto out;
>     }
> 
>   Crash.
> 
> Adding failure modes that use @errp differently than existing ones is
> playing with fire :)


Except this patch has nothing to do with the material presented here.

> 
>>> Please check all uses of your FOO_load_one() for this issue.

Will do again, but to me it looks unrelated.

>>
>> I have checked all instances multiple times. If there are issues please present them so I can address them.
> 
> I humbly suggest that instances in functions that set an error on
> failures before this patch need re-checking.

Will go over it again.

> 
>>>> diff --git a/block/dmg.c b/block/dmg.c
>>>> index 98db18d82a..11d184d39c 100644
>>>> --- a/block/dmg.c
>>>> +++ b/block/dmg.c
>>>> @@ -434,6 +434,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>>>      uint64_t plist_xml_offset, plist_xml_length;
>>>>      int64_t offset;
>>>>      int ret;
>>>> +    Error *local_err = NULL;
>>>>  
>>>>      ret = bdrv_apply_auto_read_only(bs, NULL, errp);
>>>>      if (ret < 0) {
>>>> @@ -446,8 +447,21 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
>>>>          return -EINVAL;
>>>>      }
>>>>  
>>>> -    block_module_load_one("dmg-bz2");
>>>> -    block_module_load_one("dmg-lzfse");
>>>> +    if (!block_module_load_one("dmg-bz2", &local_err)) {
>>>> +        if (local_err) {
>>>> +            error_report_err(local_err);
>>>> +            return -EINVAL;
>>>> +        }
>>>> +        warn_report("dmg-bz2 module not present, bz2 decomp unavailable");
>>>> +    }
>>>> +    local_err = NULL;
>>>> +    if (!block_module_load_one("dmg-lzfse", &local_err)) {
>>>> +        if (local_err) {
>>>> +            error_report_err(local_err);
>>>> +            return -EINVAL;
>>>> +        }
>>>> +        warn_report("dmg-lzfse module not present, lzfse decomp unavailable");
>>>> +    }
>>>>  
>>>>      s->n_chunks = 0;
>>>>      s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
>>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>>> index 0806d8fcaa..5902c59c94 100644
>>>> --- a/hw/core/qdev.c
>>>> +++ b/hw/core/qdev.c
>>>> @@ -148,7 +148,15 @@ bool qdev_set_parent_bus(DeviceState *dev, BusState *bus, Error **errp)
>>>>  DeviceState *qdev_new(const char *name)
>>>>  {
>>>>      if (!object_class_by_name(name)) {
>>>> -        module_load_qom_one(name);
>>>> +        Error *local_err = NULL;
>>>> +        if (!module_load_qom_one(name, &local_err)) {
>>>> +            if (local_err) {
>>>> +                error_report_err(local_err);
>>>> +            } else {
>>>> +                error_report("could not find a module for type '%s'", name);
>>>> +            }
>>>> +            abort();
>>>
>>> Why is this a programming error?
>>
>> As described before in your request for previous/after, this is a programming error because qdev_new is not expected to fail.
>> All QEMU users of qdev_new expect the API to return non-NULL and immediately use and dereference the pointer returned.
>> There is a separate API, qdev_try_new, for creating a device and allowing a NULL return value.
> 
> Well, then the programming error is "adding loadable device modules
> broke qdev_new()".
> 
> This is qdev_new() before commit 7ab6e7fcce "qdev: device module
> support":
> 
>     DeviceState *qdev_new(const char *name)
>     {
>         return DEVICE(object_new(name));
>     }
> 
> Precondition: @name is a valid device type name.  If it isn't, we fail
> an assertion.
> 
> If you got a @name that may not be valid, you must use qdev_try_new()
> instead:
> 
>     DeviceState *qdev_try_new(const char *name)
>     {
>         if (!object_class_by_name(name)) {
>             return NULL;
>         }
> 
>         return DEVICE(object_new(name));
>     }
> 
> The guard ensures we call object_new() only when the precondition holds.
> 
> Except it doesn't: when @name is a valid type name, but not a *device*
> type name, we still crash.  Bug.
> 
> Commit 7ab6e7fcce changed qdev_new() to
> 
>     DeviceState *qdev_new(const char *name)
>     {
>         if (!object_class_by_name(name)) {
>             module_load_qom_one(name);
>         }
>         return DEVICE(object_new(name));
>     }
> 
> Now the (enforced!) precondition is @name is a valid device type name
> and the device is either compiled in or it will succeed to load.
> 
> "Will succeed to load" is an inappropriate precondition.
> 
> Devices compiled as modules must be instantiated with qdev_try_new().
> Loadable module support should be dropped from qdev_new().

This should be a separate series / patch in my view, with bigger impact on the whole code.

> 
> Not your patch's fault, of course.  By the way, welcome to the swamp!
> Do not mind the crocodiles, they just want to play.
> 
>>>> +        }
>>>>      }
>>>>      return DEVICE(object_new(name));
>>>>  }
>>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>>> index 8c012bbe03..78d4c4de96 100644
>>>> --- a/include/qemu/module.h
>>>> +++ b/include/qemu/module.h
>>>> @@ -61,16 +61,44 @@ typedef enum {
>>>>  #define fuzz_target_init(function) module_init(function, \
>>>>                                                 MODULE_INIT_FUZZ_TARGET)
>>>>  #define migration_init(function) module_init(function, MODULE_INIT_MIGRATION)
>>>> -#define block_module_load_one(lib) module_load_one("block-", lib)
>>>> -#define ui_module_load_one(lib) module_load_one("ui-", lib)
>>>> -#define audio_module_load_one(lib) module_load_one("audio-", lib)
>>>> +#define block_module_load_one(lib, errp) module_load_one("block-", lib, errp)
>>>> +#define ui_module_load_one(lib, errp) module_load_one("ui-", lib, errp)
>>>> +#define audio_module_load_one(lib, errp) module_load_one("audio-", lib, errp)
>>>>  
>>>>  void register_module_init(void (*fn)(void), module_init_type type);
>>>>  void register_dso_module_init(void (*fn)(void), module_init_type type);
>>>>  
>>>>  void module_call_init(module_init_type type);
>>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>>> -void module_load_qom_one(const char *type);
>>>> +
>>>> +/*
>>>> + * module_load_one: attempt to load a module from a set of directories
>>>> + *
>>>> + * directories searched are:
>>>> + * - getenv("QEMU_MODULE_DIR")
>>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>>> + * - /var/run/qemu/${version_dir}
>>>> + *
>>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>>> + * name:           name of the module
>>>> + * errp:           error to set in case the module is found, but load failed.
>>>> + *
>>>> + * Return value:   true on success (found and loaded);
>>>> + *                 if module if found, but load failed, errp will be set.
>>>> + *                 if module is not found, errp will not be set.
>>>
>>> I understand you need to distingush two failure modes "found, but load
>>> failed" and "not found".
>>
>> That is correct.
>>
>>>
>>> Functions that set an error on some failures only tend to be awkward: in
>>> addition to checking the return value for failure, you have to check
>>> @errp for special failures.  This is particularly cumbersome when it
>>> requires a @local_err and an error_propagate() just for that.  I
>>> generally prefer to return an error code and always set an error.
>>>
>>> That said, the patch doesn't look bad.  Perhaps it will be once the
>>> issues I pointed out above have been addressed.  Let's not worry about
>>> it right now.
>>>
>>>> + */
>>>> +bool module_load_one(const char *prefix, const char *name, Error **errp);
>>>> +
>>>> +/*
>>>> + * module_load_qom_one: attempt to load a module to provide a QOM type
>>>> + *
>>>> + * type:           the type to be provided
>>>> + * errp:           error to set.
>>>> + *
>>>> + * Return value:   true on success (found and loaded), false on failure.
>>>> + *                 If a module is simply not found for the type,
>>>> + *                 errp will not be set.
>>>> + */
>>>> +bool module_load_qom_one(const char *type, Error **errp);
>>>>  void module_load_qom_all(void);
>>>>  void module_allow_arch(const char *arch);
>>>>  
>>>
>>> [...]
>>>
>>>
>>> [*] Except it prints "Module is not supported by system\n" to stderr
>>> when !g_module_supported(), which doesn't look right to me.
>>>
>> I don't know what this comment relates to.
> 
> It's a footnote attached to "Before the patch, bdrv_find_format() fails
> silently[*]" above.
> 
> 
> I think the root of the problems with this patch is that you convert the
> FOO_module_load_one() functions to the Error API without propagating the
> error far enough.  It really needs to go all the way up.
> 

I respecfully disagree. Sometimes I think we need to be practical; there is for sure further improvement possible,
a redesign of the Error API to be more consistent with return values,
rethinking about use of qdev_new vs try_qdev_new() etc,

but in my view the series is complete and self-consistent
(and btw fixes an actual bug in your CentOS docker / kubevirt).

I'll review all the changes and see if merging some errors make sense in some cases, but generally I would recommend against it.

Further improvements can and should be done but in my view should not be part of this series,
I cannot honestly take at the moment that kind of work item.

Thanks,

Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16 15:06         ` Claudio Fontana
@ 2022-09-19  8:17           ` Markus Armbruster
  2022-09-19  8:45             ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-19  8:17 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

Claudio Fontana <cfontana@suse.de> writes:

> On 9/16/22 16:26, Markus Armbruster wrote:
>> Claudio Fontana <cfontana@suse.de> writes:
>> 
>>> On 9/16/22 11:27, Markus Armbruster wrote:
>>>> Claudio Fontana <cfontana@suse.de> writes:

[...]

>>>>> diff --git a/block.c b/block.c
>>>>> index bc85f46eed..8b610c6d95 100644
>>>>> --- a/block.c
>>>>> +++ b/block.c
>>>>> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>>>>      /* The driver isn't registered, maybe we need to load a module */
>>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>>> +            Error *local_err = NULL;
>>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>>> +                                       &local_err)) {
>>>>> +                if (local_err) {
>>>>> +                    error_report_err(local_err);
>>>>> +                }
>>>>> +                return NULL;
>>>>> +            }
>>>>>              break;
>>>>>          }
>>>>>      }
>>>>
>>>> Before the patch, bdrv_find_format() fails silently[*].
>>>>
>>>> Afterwards, it reports an error on some failures, but not on others.
>>>> Sure this is what we want?
>>>
>>> Yes, I am sure. It only reports an error if an error exists.
>>> When a module is not present, no error should be printed in this context.
>> 
>> Callers commonly look like this (this one is in bdrv_fill_options()):
>> 
>>         drv = bdrv_find_format(drvname);
>>         if (!drv) {
>>             error_setg(errp, "Unknown driver '%s'", drvname);
>>             return -ENOENT;
>>         }
>> 
>> where @errp is a parameter.
>> 
>> The patch changes bdrv_find_format() to report some failures with
>> error_report_err().  This is effectively the same anti-pattern I pointed
>> out below, starting with "Uh-oh": use of error_report() or equivalent
>> within a function with an Error ** parameter.
>> 
>> If the "Unknown driver" Error object is ultimately passed to
>> error_report_err(), we report two errors if module loading fails, else
>> one.
>> 
>> Emitting two error messages for the same problem, a useful one and a
>> useless one, is bad UI, but it's arguably still better than just a
>> useless one.
>
> I'll see if it makes sense to merge them.
> I wanted to avoid removing the existing Unknown driver / or Unknown Protocol errors
> which people might be relying on.

Error messages are not a stable interface, because if we made it one,
we'd be stuck with sub-optimal error messages forever.

Sometimes, people rely on them anyway, and if we're aware of it, we may
choose to provide a grace period before we make their latently broken
code manifestly broken.

>> But what if a caller wants to handle errors without reporting them?
>> This is a legitimate use of the Error API.  No go, we spam stderr
>> regardless.
>
> I suppose the Error API can be directed to something else than stderr?

Absolutely.

Callers can handle the error as they see fit.  This is a core design
feature, see "Separation of concerns" below.

For instance,

(1) When executing an HMP command, and the error makes the command fail,
the monitor core handles the error by reporting it to the command's
monitor.

(2) When executing a QMP command, and the error makes the command fail,
the monitor core handles the error by sending it to the QMP client.

(3) When the error is recoverable, say by trying something else instead,
we don't report the error at all (except perhaps via some tracepoint).

If you use error_report() or equivalent within such code, then:

(1) Works as intended, both messages go the command's monitor.

(2) The caller wants errors to go to the QMP client, but the
error_report() goes to stderr instead.

(3) The caller wants to silence errors, but the error_report() spams
stderr anyway.

>> What if a caller wants to report the error in some other way, say write
>> it to a log file?  No go, we still spam stderr.
>> 
>> The patch breaks on of the Error API's design maxims.  error.h's big
>> comment:
>> 
>>  * - Separation of concerns: the function is responsible for detecting
>>  *   errors and failing cleanly; handling the error is its caller's
>>  *   job.  [...]
>> 
>> The patch changes bdrv_find_options() to no longer satisfy this rule: it
>> doesn't leave handling entirely to the caller anymore.
>> 
>> You might argue that no existing caller does such things.

Actually, I'm pretty sure this is used by QMP commands.

>>                                                            Fine.  Then
>> I'll ask you to document that bdrv_fill_options() may only be used in
>> certain ways, unlike other functions using the Error API.  Same for all
>> the other affected functions.  And then Kevin will probably NAK the
>> whole thing :)
>>
>>>>> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>>          if (block_driver_modules[i].protocol_name &&
>>>>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>>> +            Error *local_err = NULL;
>>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>>> +                                       &local_err) && local_err) {
>>>>
>>>> Break the line before && local_err, please, like you do elsewhere.
>>>
>>> This is to avoid hitting the 80 chars on one line, as warned by checkpatch. Should I disregard checkpatch, is it not used anymore?
>> 
>> Elsewhere, you format like this:
>> 
>>                  if (!block_module_load_one(block_driver_modules[i].library_name,
>>                                             &local_err)
>>                      && local_err) {
>> 
>> Better, isn't it?
>
> I would not say so, I have no preference, but I can change it.
>
>> 
>>>
>>>>
>>>>> +                error_report_err(local_err);
>>>>> +            }
>>>>>              break;
>>>>>          }
>>>>>      }
>>>>
>>>> Uh-oh: error_report() or equivalent in a function with an Error **
>>>> parameter.  This is almost always wrong.  Shouldn't you pass the error
>>>> to the caller?
>>>
>>> I guess this is the "almost" case. No I should not pass anything back.
>>>
>>> The Error **errp parameter of bdrv_find_protocol, and the Error local_err parameter inside it are different errors.
>>>
>>> The first is about whether a protocol has been found or not.
>>> The second is about whether there was an error during the load of a module.
>>>
>>> No, local_err should not be passed back to the caller in my view. The error passed back to the caller is errp and is already set correctly.
>> 
>> Part of bdrv_find_protocol()'s (unstated) contract is "either succeed
>> and return non-null, or fail, return null, and set an error."
>> 
>> This is pretty much ubiquitous among Error-using functions.  error.h's
>> big comment:
>> 
>>  * - On success, the function should not touch *errp.  On failure, it
>>  *   should set a new error, e.g. with error_setg(errp, ...), or
>>  *   propagate an existing one, e.g. with error_propagate(errp, ...).
>> 
>> Note there is no third case "except on some failures, don't set an
>> error".
>> 
>> Of course, bdrv_find_protocol() might be an exception, i.e. I still owe
>> you proof of my claim about its contract.  So let's examine callers:
>> 
>> * print_block_option_help() in qemu-img.c:
>> 
>>             proto_drv = bdrv_find_protocol(filename, true, &local_err);
>>             if (!proto_drv) {
>>                 error_report_err(local_err);
>>                 qemu_opts_free(create_opts);
>>                 return 1;
>>             }
>> 
>>   If bdrv_find_protocol() returns null without setting @local_err,
>>   error_report_err() will crash.
>
> I don't see how this patch changes anything here, or why it relates to this code at all.
>
> No, this patch does not change bdrv_find_protocol() unwritten claim about the Error parameter.
>
> I think you should re-read the code honestly.

First, you're right, I got confused, and my analysis is wrong.  I missed
the && local_err after !block_module_load_one().  I believe I spotted it
only later, and then asked for it to be put on its own line, but missed
its significance for what I had already written.

Second, your reply was of the sort that can depress your review
priority.  Permanently when it becomes a pattern.  Hear me out.

It is obvious that you put significant time into this patch.  It is also
obvious that you made it with care.  Mistakes remain a possibility;
writing good patches is hard.  That's why we review.

Reviewing patches, even good ones is hard.  Review should be a
conversation between the patch submitter and the reviewers, where the
participants try to help each other along.

I put significant time into a careful review of this patch.  I misread
it in at least one spot, and saw a problem that isn't.  I needed you to
tell me that I'm wrong and help me understand.

Simply telling me to re-read the code is not helpful, and borders on
disrespect (not intended I presume).  What would have been helpful is
shooting a hole in my analysis, say by demonstrating that
bdrv_find_protocol() cannot fail without setting an error.

[...]

>> I think the root of the problems with this patch is that you convert the
>> FOO_module_load_one() functions to the Error API without propagating the
>> error far enough.  It really needs to go all the way up.
>> 
>
> I respecfully disagree. Sometimes I think we need to be practical; there is for sure further improvement possible,
> a redesign of the Error API to be more consistent with return values,
> rethinking about use of qdev_new vs try_qdev_new() etc,
>
> but in my view the series is complete and self-consistent
> (and btw fixes an actual bug in your CentOS docker / kubevirt).
>
> I'll review all the changes and see if merging some errors make sense in some cases, but generally I would recommend against it.
>
> Further improvements can and should be done but in my view should not be part of this series,
> I cannot honestly take at the moment that kind of work item.

Some issues predate your patch.  They way loadable modules work is
clearly not good.

A patch that improves error reporting in some cases without making
things worse elsewhere could be acceptable even if it doesn't address
the existing problems comprehensively.  As you wrote, we need to be
practical.

That said, error_report() & friends from within a QMP command is a
fairly big no-no.  Such issues might push this patch into "makes things
worse elsewhere" territory.  Reserving judgement.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-19  8:17           ` Markus Armbruster
@ 2022-09-19  8:45             ` Claudio Fontana
  2022-09-21 12:47               ` Markus Armbruster
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-19  8:45 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

On 9/19/22 10:17, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> On 9/16/22 16:26, Markus Armbruster wrote:
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>>> On 9/16/22 11:27, Markus Armbruster wrote:
>>>>> Claudio Fontana <cfontana@suse.de> writes:
> 
> [...]
> 
>>>>>> diff --git a/block.c b/block.c
>>>>>> index bc85f46eed..8b610c6d95 100644
>>>>>> --- a/block.c
>>>>>> +++ b/block.c
>>>>>> @@ -464,7 +464,14 @@ BlockDriver *bdrv_find_format(const char *format_name)
>>>>>>      /* The driver isn't registered, maybe we need to load a module */
>>>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>>>          if (!strcmp(block_driver_modules[i].format_name, format_name)) {
>>>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>>>> +            Error *local_err = NULL;
>>>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>>>> +                                       &local_err)) {
>>>>>> +                if (local_err) {
>>>>>> +                    error_report_err(local_err);
>>>>>> +                }
>>>>>> +                return NULL;
>>>>>> +            }
>>>>>>              break;
>>>>>>          }
>>>>>>      }
>>>>>
>>>>> Before the patch, bdrv_find_format() fails silently[*].
>>>>>
>>>>> Afterwards, it reports an error on some failures, but not on others.
>>>>> Sure this is what we want?
>>>>
>>>> Yes, I am sure. It only reports an error if an error exists.
>>>> When a module is not present, no error should be printed in this context.
>>>
>>> Callers commonly look like this (this one is in bdrv_fill_options()):
>>>
>>>         drv = bdrv_find_format(drvname);
>>>         if (!drv) {
>>>             error_setg(errp, "Unknown driver '%s'", drvname);
>>>             return -ENOENT;
>>>         }
>>>
>>> where @errp is a parameter.
>>>
>>> The patch changes bdrv_find_format() to report some failures with
>>> error_report_err().  This is effectively the same anti-pattern I pointed
>>> out below, starting with "Uh-oh": use of error_report() or equivalent
>>> within a function with an Error ** parameter.
>>>
>>> If the "Unknown driver" Error object is ultimately passed to
>>> error_report_err(), we report two errors if module loading fails, else
>>> one.
>>>
>>> Emitting two error messages for the same problem, a useful one and a
>>> useless one, is bad UI, but it's arguably still better than just a
>>> useless one.
>>
>> I'll see if it makes sense to merge them.
>> I wanted to avoid removing the existing Unknown driver / or Unknown Protocol errors
>> which people might be relying on.
> 
> Error messages are not a stable interface, because if we made it one,
> we'd be stuck with sub-optimal error messages forever.
> 
> Sometimes, people rely on them anyway, and if we're aware of it, we may
> choose to provide a grace period before we make their latently broken
> code manifestly broken.
> 
>>> But what if a caller wants to handle errors without reporting them?
>>> This is a legitimate use of the Error API.  No go, we spam stderr
>>> regardless.
>>
>> I suppose the Error API can be directed to something else than stderr?
> 
> Absolutely.
> 
> Callers can handle the error as they see fit.  This is a core design
> feature, see "Separation of concerns" below.
> 
> For instance,
> 
> (1) When executing an HMP command, and the error makes the command fail,
> the monitor core handles the error by reporting it to the command's
> monitor.
> 
> (2) When executing a QMP command, and the error makes the command fail,
> the monitor core handles the error by sending it to the QMP client.
> 
> (3) When the error is recoverable, say by trying something else instead,
> we don't report the error at all (except perhaps via some tracepoint).
> 
> If you use error_report() or equivalent within such code, then:
> 
> (1) Works as intended, both messages go the command's monitor.
> 
> (2) The caller wants errors to go to the QMP client, but the
> error_report() goes to stderr instead.
> 
> (3) The caller wants to silence errors, but the error_report() spams
> stderr anyway.
> 
>>> What if a caller wants to report the error in some other way, say write
>>> it to a log file?  No go, we still spam stderr.
>>>
>>> The patch breaks on of the Error API's design maxims.  error.h's big
>>> comment:
>>>
>>>  * - Separation of concerns: the function is responsible for detecting
>>>  *   errors and failing cleanly; handling the error is its caller's
>>>  *   job.  [...]
>>>
>>> The patch changes bdrv_find_options() to no longer satisfy this rule: it
>>> doesn't leave handling entirely to the caller anymore.
>>>
>>> You might argue that no existing caller does such things.
> 
> Actually, I'm pretty sure this is used by QMP commands.
> 
>>>                                                            Fine.  Then
>>> I'll ask you to document that bdrv_fill_options() may only be used in
>>> certain ways, unlike other functions using the Error API.  Same for all
>>> the other affected functions.  And then Kevin will probably NAK the
>>> whole thing :)
>>>
>>>>>> @@ -976,7 +983,11 @@ BlockDriver *bdrv_find_protocol(const char *filename,
>>>>>>      for (i = 0; i < (int)ARRAY_SIZE(block_driver_modules); ++i) {
>>>>>>          if (block_driver_modules[i].protocol_name &&
>>>>>>              !strcmp(block_driver_modules[i].protocol_name, protocol)) {
>>>>>> -            block_module_load_one(block_driver_modules[i].library_name);
>>>>>> +            Error *local_err = NULL;
>>>>>> +            if (!block_module_load_one(block_driver_modules[i].library_name,
>>>>>> +                                       &local_err) && local_err) {
>>>>>
>>>>> Break the line before && local_err, please, like you do elsewhere.
>>>>
>>>> This is to avoid hitting the 80 chars on one line, as warned by checkpatch. Should I disregard checkpatch, is it not used anymore?
>>>
>>> Elsewhere, you format like this:
>>>
>>>                  if (!block_module_load_one(block_driver_modules[i].library_name,
>>>                                             &local_err)
>>>                      && local_err) {
>>>
>>> Better, isn't it?
>>
>> I would not say so, I have no preference, but I can change it.
>>
>>>
>>>>
>>>>>
>>>>>> +                error_report_err(local_err);
>>>>>> +            }
>>>>>>              break;
>>>>>>          }
>>>>>>      }
>>>>>
>>>>> Uh-oh: error_report() or equivalent in a function with an Error **
>>>>> parameter.  This is almost always wrong.  Shouldn't you pass the error
>>>>> to the caller?
>>>>
>>>> I guess this is the "almost" case. No I should not pass anything back.
>>>>
>>>> The Error **errp parameter of bdrv_find_protocol, and the Error local_err parameter inside it are different errors.
>>>>
>>>> The first is about whether a protocol has been found or not.
>>>> The second is about whether there was an error during the load of a module.
>>>>
>>>> No, local_err should not be passed back to the caller in my view. The error passed back to the caller is errp and is already set correctly.
>>>
>>> Part of bdrv_find_protocol()'s (unstated) contract is "either succeed
>>> and return non-null, or fail, return null, and set an error."
>>>
>>> This is pretty much ubiquitous among Error-using functions.  error.h's
>>> big comment:
>>>
>>>  * - On success, the function should not touch *errp.  On failure, it
>>>  *   should set a new error, e.g. with error_setg(errp, ...), or
>>>  *   propagate an existing one, e.g. with error_propagate(errp, ...).
>>>
>>> Note there is no third case "except on some failures, don't set an
>>> error".
>>>
>>> Of course, bdrv_find_protocol() might be an exception, i.e. I still owe
>>> you proof of my claim about its contract.  So let's examine callers:
>>>
>>> * print_block_option_help() in qemu-img.c:
>>>
>>>             proto_drv = bdrv_find_protocol(filename, true, &local_err);
>>>             if (!proto_drv) {
>>>                 error_report_err(local_err);
>>>                 qemu_opts_free(create_opts);
>>>                 return 1;
>>>             }
>>>
>>>   If bdrv_find_protocol() returns null without setting @local_err,
>>>   error_report_err() will crash.
>>
>> I don't see how this patch changes anything here, or why it relates to this code at all.
>>
>> No, this patch does not change bdrv_find_protocol() unwritten claim about the Error parameter.
>>
>> I think you should re-read the code honestly.
> 
> First, you're right, I got confused, and my analysis is wrong.  I missed
> the && local_err after !block_module_load_one().  I believe I spotted it
> only later, and then asked for it to be put on its own line, but missed
> its significance for what I had already written.
> 
> Second, your reply was of the sort that can depress your review
> priority.  Permanently when it becomes a pattern.  Hear me out.
> 
> It is obvious that you put significant time into this patch.  It is also
> obvious that you made it with care.  Mistakes remain a possibility;
> writing good patches is hard.  That's why we review.
> 
> Reviewing patches, even good ones is hard.  Review should be a
> conversation between the patch submitter and the reviewers, where the
> participants try to help each other along.
> 
> I put significant time into a careful review of this patch.  I misread
> it in at least one spot, and saw a problem that isn't.  I needed you to
> tell me that I'm wrong and help me understand.
> 
> Simply telling me to re-read the code is not helpful, and borders on
> disrespect (not intended I presume).  What would have been helpful is
> shooting a hole in my analysis, say by demonstrating that
> bdrv_find_protocol() cannot fail without setting an error.
> 
> [...]
> 
>>> I think the root of the problems with this patch is that you convert the
>>> FOO_module_load_one() functions to the Error API without propagating the
>>> error far enough.  It really needs to go all the way up.
>>>
>>
>> I respecfully disagree. Sometimes I think we need to be practical; there is for sure further improvement possible,
>> a redesign of the Error API to be more consistent with return values,
>> rethinking about use of qdev_new vs try_qdev_new() etc,
>>
>> but in my view the series is complete and self-consistent
>> (and btw fixes an actual bug in your CentOS docker / kubevirt).
>>
>> I'll review all the changes and see if merging some errors make sense in some cases, but generally I would recommend against it.
>>
>> Further improvements can and should be done but in my view should not be part of this series,
>> I cannot honestly take at the moment that kind of work item.
> 
> Some issues predate your patch.  They way loadable modules work is
> clearly not good.
> 
> A patch that improves error reporting in some cases without making
> things worse elsewhere could be acceptable even if it doesn't address
> the existing problems comprehensively.  As you wrote, we need to be
> practical.
> 
> That said, error_report() & friends from within a QMP command is a
> fairly big no-no.  Such issues might push this patch into "makes things
> worse elsewhere" territory.  Reserving judgement.
> 

Hi Markus, sorry for the harsh response last week, it comes from a position of lack of time,
and the expectation that Richard's review would be enough.

I don't have (and I suspect no one that is not 100% working on qemu "upstream" and cannot hang around IRC channels or such)
good visibility of what happens to patches after they are reviewed like in this case,
so this makes the process in my view overall quite unpredictable and seems to require a lot of time investment for even the smallest change.

That is why I would generally prefer smaller chunks of work to be committed incrementally, each series self-contained,
rather than hundred-patch series that I expect to lose momentum or get lost.

There are things we can fine tune here, but I see some of the changes you propose as going beyond what a fix for the acute problem really requires,
so if this series requires a lot of work in your view for even the minimal fix to be done, I think we will need someone else to step in and expand on the work,

or we will have to keep the fix as downstream-only for now, and I'll try to find more time to invest early next year.

Thanks,

Claudio



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-16  9:27   ` Markus Armbruster
  2022-09-16 10:48     ` Claudio Fontana
@ 2022-09-19 10:18     ` Philippe Mathieu-Daudé via
  2022-09-21 12:16       ` Markus Armbruster
  1 sibling, 1 reply; 50+ messages in thread
From: Philippe Mathieu-Daudé via @ 2022-09-19 10:18 UTC (permalink / raw)
  To: Markus Armbruster, Claudio Fontana
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé

On 16/9/22 11:27, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> improve error handling during module load, by changing:
>>
>> bool module_load_one(const char *prefix, const char *lib_name);
>> void module_load_qom_one(const char *type);
>>
>> to:
>>
>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>> bool module_load_qom_one(const char *type, Error **errp);
>>
>> module_load_qom_one has been introduced in:
>>
>> commit 28457744c345 ("module: qom module support"), which built on top of
>> module_load_one, but discarded the bool return value. Restore it.
>>
>> Adapt all callers to emit errors, or ignore them, or fail hard,
>> as appropriate in each context.
> 
> How exactly does behavior change?  The commit message is mum on the
> behavior before the patch, and vague on the behavior afterwards.
> 
>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>> ---
>>   audio/audio.c         |   9 ++-
>>   block.c               |  15 ++++-
>>   block/dmg.c           |  18 +++++-
>>   hw/core/qdev.c        |  10 ++-
>>   include/qemu/module.h |  38 ++++++++++--
>>   qom/object.c          |  18 +++++-
>>   softmmu/qtest.c       |   6 +-
>>   ui/console.c          |  18 +++++-
>>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>>   9 files changed, 194 insertions(+), 78 deletions(-)

>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>> index 8c012bbe03..78d4c4de96 100644
>> --- a/include/qemu/module.h
>> +++ b/include/qemu/module.h
>> @@ -61,16 +61,44 @@ typedef enum {

>>   
>>   void module_call_init(module_init_type type);
>> -bool module_load_one(const char *prefix, const char *lib_name);
>> -void module_load_qom_one(const char *type);
>> +
>> +/*
>> + * module_load_one: attempt to load a module from a set of directories
>> + *
>> + * directories searched are:
>> + * - getenv("QEMU_MODULE_DIR")
>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>> + * - /var/run/qemu/${version_dir}
>> + *
>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>> + * name:           name of the module
>> + * errp:           error to set in case the module is found, but load failed.
>> + *
>> + * Return value:   true on success (found and loaded);
>> + *                 if module if found, but load failed, errp will be set.
>> + *                 if module is not found, errp will not be set.
> 
> I understand you need to distingush two failure modes "found, but load
> failed" and "not found".
> 
> Functions that set an error on some failures only tend to be awkward: in
> addition to checking the return value for failure, you have to check
> @errp for special failures.  This is particularly cumbersome when it
> requires a @local_err and an error_propagate() just for that.  I
> generally prefer to return an error code and always set an error.

I notice the same issue, therefore would suggest this alternative
prototype:

   bool module_load_one(const char *prefix, const char *name, 
             bool ignore_if_missing, Error **errp);
which always sets *errp when returning false:

  * Return value:   if ignore_if_missing is true:
  *                   true on success (found or missing), false on
  *                   load failure.
  *                 if ignore_if_missing is false:
  *                   true on success (found and loaded); false if
  *                   not found or load failed.
  *                 errp will be set if the returned value is false.
  */


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-19 10:18     ` Philippe Mathieu-Daudé via
@ 2022-09-21 12:16       ` Markus Armbruster
  2022-09-21 16:03         ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-21 12:16 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Claudio Fontana, Paolo Bonzini, Richard Henderson, Kevin Wolf,
	qemu-devel, dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé

Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

> On 16/9/22 11:27, Markus Armbruster wrote:
>> Claudio Fontana <cfontana@suse.de> writes:
>> 
>>> improve error handling during module load, by changing:
>>>
>>> bool module_load_one(const char *prefix, const char *lib_name);
>>> void module_load_qom_one(const char *type);
>>>
>>> to:
>>>
>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>> bool module_load_qom_one(const char *type, Error **errp);
>>>
>>> module_load_qom_one has been introduced in:
>>>
>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>> module_load_one, but discarded the bool return value. Restore it.
>>>
>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>> as appropriate in each context.
>> 
>> How exactly does behavior change?  The commit message is mum on the
>> behavior before the patch, and vague on the behavior afterwards.
>> 
>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>> ---
>>>   audio/audio.c         |   9 ++-
>>>   block.c               |  15 ++++-
>>>   block/dmg.c           |  18 +++++-
>>>   hw/core/qdev.c        |  10 ++-
>>>   include/qemu/module.h |  38 ++++++++++--
>>>   qom/object.c          |  18 +++++-
>>>   softmmu/qtest.c       |   6 +-
>>>   ui/console.c          |  18 +++++-
>>>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>   9 files changed, 194 insertions(+), 78 deletions(-)
>
>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>> index 8c012bbe03..78d4c4de96 100644
>>> --- a/include/qemu/module.h
>>> +++ b/include/qemu/module.h
>>> @@ -61,16 +61,44 @@ typedef enum {
>
>>>   
>>>   void module_call_init(module_init_type type);
>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>> -void module_load_qom_one(const char *type);
>>> +
>>> +/*
>>> + * module_load_one: attempt to load a module from a set of directories
>>> + *
>>> + * directories searched are:
>>> + * - getenv("QEMU_MODULE_DIR")
>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>> + * - /var/run/qemu/${version_dir}
>>> + *
>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>> + * name:           name of the module
>>> + * errp:           error to set in case the module is found, but load failed.
>>> + *
>>> + * Return value:   true on success (found and loaded);
>>> + *                 if module if found, but load failed, errp will be set.
>>> + *                 if module is not found, errp will not be set.
>> 
>> I understand you need to distingush two failure modes "found, but load
>> failed" and "not found".
>> 
>> Functions that set an error on some failures only tend to be awkward: in
>> addition to checking the return value for failure, you have to check
>> @errp for special failures.  This is particularly cumbersome when it
>> requires a @local_err and an error_propagate() just for that.  I
>> generally prefer to return an error code and always set an error.
>
> I notice the same issue, therefore would suggest this alternative
> prototype:
>
>    bool module_load_one(const char *prefix, const char *name, 
>              bool ignore_if_missing, Error **errp);
> which always sets *errp when returning false:
>
>   * Return value:   if ignore_if_missing is true:
>   *                   true on success (found or missing), false on
>   *                   load failure.
>   *                 if ignore_if_missing is false:
>   *                   true on success (found and loaded); false if
>   *                   not found or load failed.
>   *                 errp will be set if the returned value is false.
>   */

I think this interface is less surprising.

If having to pass a flag turns out to to be a legibility issue, we can
have wrapper functions.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-19  8:45             ` Claudio Fontana
@ 2022-09-21 12:47               ` Markus Armbruster
  0 siblings, 0 replies; 50+ messages in thread
From: Markus Armbruster @ 2022-09-21 12:47 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé, Philippe Mathieu-Daudé

Claudio Fontana <cfontana@suse.de> writes:

> Hi Markus, sorry for the harsh response last week, it comes from a position of lack of time,
> and the expectation that Richard's review would be enough.

I gladly accept your apology.

We had the good fortune to meet in person (at KVM Forums before the
plague).  Makes it so much easier to react "good guy is having a bad
day, try to help" instead of "bad guy, avoid".

> I don't have (and I suspect no one that is not 100% working on qemu "upstream" and cannot hang around IRC channels or such)
> good visibility of what happens to patches after they are reviewed like in this case,
> so this makes the process in my view overall quite unpredictable and seems to require a lot of time investment for even the smallest change.

I understand.  It's a common complaint.

> That is why I would generally prefer smaller chunks of work to be committed incrementally, each series self-contained,
> rather than hundred-patch series that I expect to lose momentum or get lost.
>
> There are things we can fine tune here, but I see some of the changes you propose as going beyond what a fix for the acute problem really requires,
> so if this series requires a lot of work in your view for even the minimal fix to be done, I think we will need someone else to step in and expand on the work,
>
> or we will have to keep the fix as downstream-only for now, and I'll try to find more time to invest early next year.

That's fair.

We can accept patches that don't solve the entire problem.  I still like
to understand the entire problem to a useful degree, and have a rough
idea of how a solution of the entire problem could look like.

We may then conclude that such a solution isn't in the cards right now,
but a partial solution is, so we better take it.

Sometimes, the proposed patch turns out to be not even a partial
solution (but we recognize that only now we understand the entire
problem).  Then we better reject it.

Trouble is that developing such an understanding in patch review can
easily come over as non-negotiable demands.  This can be frustrating and
demoralizing for patch submitters.  I try to avoid this trap, but I
don't always succeed.  When understanding the patch and the problem
consumes 95% of my poor brain's capacity, I'm left with 5% for
communicating...  and when 5% aren't enough, I need to apologize for not
expressing myself clearly.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-21 12:16       ` Markus Armbruster
@ 2022-09-21 16:03         ` Claudio Fontana
  2022-09-22  6:07           ` Markus Armbruster
  2022-09-25 10:35           ` Richard Henderson
  0 siblings, 2 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-21 16:03 UTC (permalink / raw)
  To: Markus Armbruster, Philippe Mathieu-Daudé
  Cc: Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé

On 9/21/22 14:16, Markus Armbruster wrote:
> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
> 
>> On 16/9/22 11:27, Markus Armbruster wrote:
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>>> improve error handling during module load, by changing:
>>>>
>>>> bool module_load_one(const char *prefix, const char *lib_name);
>>>> void module_load_qom_one(const char *type);
>>>>
>>>> to:
>>>>
>>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>>> bool module_load_qom_one(const char *type, Error **errp);
>>>>
>>>> module_load_qom_one has been introduced in:
>>>>
>>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>>> module_load_one, but discarded the bool return value. Restore it.
>>>>
>>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>>> as appropriate in each context.
>>>
>>> How exactly does behavior change?  The commit message is mum on the
>>> behavior before the patch, and vague on the behavior afterwards.
>>>
>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>> ---
>>>>   audio/audio.c         |   9 ++-
>>>>   block.c               |  15 ++++-
>>>>   block/dmg.c           |  18 +++++-
>>>>   hw/core/qdev.c        |  10 ++-
>>>>   include/qemu/module.h |  38 ++++++++++--
>>>>   qom/object.c          |  18 +++++-
>>>>   softmmu/qtest.c       |   6 +-
>>>>   ui/console.c          |  18 +++++-
>>>>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>>   9 files changed, 194 insertions(+), 78 deletions(-)
>>
>>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>>> index 8c012bbe03..78d4c4de96 100644
>>>> --- a/include/qemu/module.h
>>>> +++ b/include/qemu/module.h
>>>> @@ -61,16 +61,44 @@ typedef enum {
>>
>>>>   
>>>>   void module_call_init(module_init_type type);
>>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>>> -void module_load_qom_one(const char *type);
>>>> +
>>>> +/*
>>>> + * module_load_one: attempt to load a module from a set of directories
>>>> + *
>>>> + * directories searched are:
>>>> + * - getenv("QEMU_MODULE_DIR")
>>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>>> + * - /var/run/qemu/${version_dir}
>>>> + *
>>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>>> + * name:           name of the module
>>>> + * errp:           error to set in case the module is found, but load failed.
>>>> + *
>>>> + * Return value:   true on success (found and loaded);
>>>> + *                 if module if found, but load failed, errp will be set.
>>>> + *                 if module is not found, errp will not be set.
>>>
>>> I understand you need to distingush two failure modes "found, but load
>>> failed" and "not found".
>>>
>>> Functions that set an error on some failures only tend to be awkward: in
>>> addition to checking the return value for failure, you have to check
>>> @errp for special failures.  This is particularly cumbersome when it
>>> requires a @local_err and an error_propagate() just for that.  I
>>> generally prefer to return an error code and always set an error.
>>
>> I notice the same issue, therefore would suggest this alternative
>> prototype:
>>
>>    bool module_load_one(const char *prefix, const char *name, 
>>              bool ignore_if_missing, Error **errp);
>> which always sets *errp when returning false:
>>
>>   * Return value:   if ignore_if_missing is true:
>>   *                   true on success (found or missing), false on
>>   *                   load failure.
>>   *                 if ignore_if_missing is false:
>>   *                   true on success (found and loaded); false if
>>   *                   not found or load failed.
>>   *                 errp will be set if the returned value is false.
>>   */
> 
> I think this interface is less surprising.
> 
> If having to pass a flag turns out to to be a legibility issue, we can
> have wrapper functions.
> 

To me it seems even more confusing tbh. Do we have more ideas? Richard?

bool module_load_one(const char *prefix, const char *name, Error **errp);

In my mind we should really say,

RETURN VALUE: a bool with the meaning: "was a module loaded? true/false"

ERROR: The Error parameter tells us: "was there an error loading the module?"

Makes sense to me, but maybe someone has a better one.

Btw, as an aside, for the general Error API pattern, if the bool return value and Error != NULL should be always related 1:1,
It would have been better to design it with only one of those informing about the error (Error, as it carries the additional Error information, besides the information that Error != NULL),
and remove the extra degree of freedom for a return value that at this point (in the current code) carries ZERO additional useful information.

We could have designed the API to use the return value as... an actual return value for solving the domain problem at hand,
and use only the Error parameter for the error path.

Ie the API standard pattern could have been, instead of bool function(Error **),

return_value_type_t function_that_can_fail(function_arguments, ..., Error **errp);

and then leave both return_value_type_t and the function_arguments for the normal function operation.

rv = function_that_can_fail(errp);
if (errp != NULL) {
    /* handle the error */
}
if (rv > 7) {
    /* handle the return value */
}

Would it not be better to handle the Error path and the normal return value separately?

With this pattern in mind, this specific case would then not be surprising to anyone,
and we would not have to start cooking up booleans to start passing to each function to say how errors should be treated,
as nobody would expect the bool returned to mean anything related with the Error path.

But this again would be rethinking the whole Error API.

We should in any case do the right thing in this specific case, and this specific case (handling module load errors vs modules not installed),
is not served well by the current code, whether it went through this attentive abstract principles scrutiny or not (seems not to me).

Thanks,

Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-21 16:03         ` Claudio Fontana
@ 2022-09-22  6:07           ` Markus Armbruster
  2022-09-22  8:28             ` Daniel P. Berrangé
  2022-09-25 10:35           ` Richard Henderson
  1 sibling, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22  6:07 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau,
	Daniel P.Berrangé

Claudio Fontana <cfontana@suse.de> writes:

> On 9/21/22 14:16, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
>> 
>>> On 16/9/22 11:27, Markus Armbruster wrote:
>>>> Claudio Fontana <cfontana@suse.de> writes:
>>>>
>>>>> improve error handling during module load, by changing:
>>>>>
>>>>> bool module_load_one(const char *prefix, const char *lib_name);
>>>>> void module_load_qom_one(const char *type);
>>>>>
>>>>> to:
>>>>>
>>>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>>>> bool module_load_qom_one(const char *type, Error **errp);
>>>>>
>>>>> module_load_qom_one has been introduced in:
>>>>>
>>>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>>>> module_load_one, but discarded the bool return value. Restore it.
>>>>>
>>>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>>>> as appropriate in each context.
>>>>
>>>> How exactly does behavior change?  The commit message is mum on the
>>>> behavior before the patch, and vague on the behavior afterwards.
>>>>
>>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>>> ---
>>>>>   audio/audio.c         |   9 ++-
>>>>>   block.c               |  15 ++++-
>>>>>   block/dmg.c           |  18 +++++-
>>>>>   hw/core/qdev.c        |  10 ++-
>>>>>   include/qemu/module.h |  38 ++++++++++--
>>>>>   qom/object.c          |  18 +++++-
>>>>>   softmmu/qtest.c       |   6 +-
>>>>>   ui/console.c          |  18 +++++-
>>>>>   util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>>>   9 files changed, 194 insertions(+), 78 deletions(-)
>>>
>>>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>>>> index 8c012bbe03..78d4c4de96 100644
>>>>> --- a/include/qemu/module.h
>>>>> +++ b/include/qemu/module.h
>>>>> @@ -61,16 +61,44 @@ typedef enum {
>>>
>>>>>   
>>>>>   void module_call_init(module_init_type type);
>>>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>>>> -void module_load_qom_one(const char *type);
>>>>> +
>>>>> +/*
>>>>> + * module_load_one: attempt to load a module from a set of directories
>>>>> + *
>>>>> + * directories searched are:
>>>>> + * - getenv("QEMU_MODULE_DIR")
>>>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>>>> + * - /var/run/qemu/${version_dir}
>>>>> + *
>>>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>>>> + * name:           name of the module
>>>>> + * errp:           error to set in case the module is found, but load failed.
>>>>> + *
>>>>> + * Return value:   true on success (found and loaded);
>>>>> + *                 if module if found, but load failed, errp will be set.
>>>>> + *                 if module is not found, errp will not be set.
>>>>
>>>> I understand you need to distingush two failure modes "found, but load
>>>> failed" and "not found".
>>>>
>>>> Functions that set an error on some failures only tend to be awkward: in
>>>> addition to checking the return value for failure, you have to check
>>>> @errp for special failures.  This is particularly cumbersome when it
>>>> requires a @local_err and an error_propagate() just for that.  I
>>>> generally prefer to return an error code and always set an error.
>>>
>>> I notice the same issue, therefore would suggest this alternative
>>> prototype:
>>>
>>>    bool module_load_one(const char *prefix, const char *name, 
>>>              bool ignore_if_missing, Error **errp);
>>> which always sets *errp when returning false:
>>>
>>>   * Return value:   if ignore_if_missing is true:
>>>   *                   true on success (found or missing), false on
>>>   *                   load failure.
>>>   *                 if ignore_if_missing is false:
>>>   *                   true on success (found and loaded); false if
>>>   *                   not found or load failed.
>>>   *                 errp will be set if the returned value is false.
>>>   */
>> 
>> I think this interface is less surprising.
>> 
>> If having to pass a flag turns out to to be a legibility issue, we can
>> have wrapper functions.
>> 
>
> To me it seems even more confusing tbh. Do we have more ideas? Richard?
>
> bool module_load_one(const char *prefix, const char *name, Error **errp);
>
> In my mind we should really say,
>
> RETURN VALUE: a bool with the meaning: "was a module loaded? true/false"
>
> ERROR: The Error parameter tells us: "was there an error loading the module?"
>
> Makes sense to me, but maybe someone has a better one.
>
> Btw, as an aside, for the general Error API pattern, if the bool return value and Error != NULL should be always related 1:1,
> It would have been better to design it with only one of those informing about the error (Error, as it carries the additional Error information, besides the information that Error != NULL),
> and remove the extra degree of freedom for a return value that at this point (in the current code) carries ZERO additional useful information.
>
> We could have designed the API to use the return value as... an actual return value for solving the domain problem at hand,
> and use only the Error parameter for the error path.

We did in the beginning, although only tacitly.  A deviation from
Error's inspiration GError (we rolled our own because we didn't use GLib
back then).

This deviation turned out to make Error cumbersome to use.  Which then
bred inconsistent usage, which led to confusion about how it should be
used, which complicated review, which finally made me bite the bullet
and correct the mistake:

 * - Whenever practical, also return a value that indicates success /
 *   failure.  This can make the error checking more concise, and can
 *   avoid useless error object creation and destruction.  Note that
 *   we still have many functions returning void.  We recommend
 *   • bool-valued functions return true on success / false on failure,
 *   • pointer-valued functions return non-null / null pointer, and
 *   • integer-valued functions return non-negative / negative.

This part of error.h's big comment goes back to

commit e3fe3988d7851cac30abffae06d2f555ff7bee62
Author: Markus Armbruster <armbru@redhat.com>
Date:   Tue Jul 7 18:05:31 2020 +0200

    error: Document Error API usage rules
    
    This merely codifies existing practice, with one exception: the rule
    advising against returning void, where existing practice is mixed.
    
    When the Error API was created, we adopted the (unwritten) rule to
    return void when the function returns no useful value on success,
    unlike GError, which recommends to return true on success and false on
    error then.
    
    When a function returns a distinct error value, say false, a checked
    call that passes the error up looks like
    
        if (!frobnicate(..., errp)) {
            handle the error...
        }
    
    When it returns void, we need
    
        Error *err = NULL;
    
        frobnicate(..., &err);
        if (err) {
            handle the error...
            error_propagate(errp, err);
        }
    
    Not only is this more verbose, it also creates an Error object even
    when @errp is null, &error_abort or &error_fatal.

Note: same issue when it returns something without a recognizable error
value.
    
    People got tired of the additional boilerplate, and started to ignore
    the unwritten rule.  The result is confusion among developers about
    the preferred usage.
    
    Make the rule advising against returning void official by putting it
    in writing.  This will hopefully reduce confusion.
    
    Update the examples accordingly.
    
    The remainder of this series will update a substantial amount of code
    to honor the rule.
    
    Signed-off-by: Markus Armbruster <armbru@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
    Reviewed-by: Greg Kurz <groug@kaod.org>
    Message-Id: <20200707160613.848843-4-armbru@redhat.com>
    [Tweak prose as per advice from Eric]

Please note "whenever practical".  Deviating from the rule is okay when
sticking to the rule is not practical.  In other words, judgement call.

When something is used a lot, consistency matters, because it reduces
cognitive load.  The assumption "this works just like it does all over
the place" is fair, so it better be correct.

Ease of use matters, too.  When sticking to the rule leads to awkward
code, we should stop and think.  Should we deviate from the rule?  Or
can we avoid that by tweaking the interface?

Philippe's proposed interface sticks to the rule.

Another interface that does: return -1 for error, 0 for module not found
(no error), and 1 for loaded.

I think I'd prefer Philippe's.

> Ie the API standard pattern could have been, instead of bool function(Error **),
>
> return_value_type_t function_that_can_fail(function_arguments, ..., Error **errp);
>
> and then leave both return_value_type_t and the function_arguments for the normal function operation.
>
> rv = function_that_can_fail(errp);
> if (errp != NULL) {
>     /* handle the error */

This is wrong.  You need to test *errp.

Which is also wrong.  You need ERRP_GUARD().  error.h's big comment
again:

 * Call a function, receive an error from it, and pass it to the caller
 * - when the function returns a value that indicates failure, say
 *   false:
 *     if (!foo(arg, errp)) {
 *         handle the error...
 *     }
 * - when it does not, say because it is a void function:
 *     ERRP_GUARD();
 *     foo(arg, errp);
 *     if (*errp) {
 *         handle the error...
 *     }
 * More on ERRP_GUARD() below.
 *
 * Code predating ERRP_GUARD() still exists, and looks like this:
 *     Error *err = NULL;
 *     foo(arg, &err);
 *     if (err) {
 *         handle the error...
 *         error_propagate(errp, err); // deprecated
 *     }
 * Avoid in new code.  Do *not* "optimize" it to
 *     foo(arg, errp);
 *     if (*errp) { // WRONG!
 *         handle the error...
 *     }
 * because errp may be NULL without the ERRP_GUARD() guard.
 *
 * But when all you do with the error is pass it on, please use
 *     foo(arg, errp);
 * for readability.
[...]
 * = Why, when and how to use ERRP_GUARD() =
 *
 * Without ERRP_GUARD(), use of the @errp parameter is restricted:
 * - It must not be dereferenced, because it may be null.
 * - It should not be passed to error_prepend() or
 *   error_append_hint(), because that doesn't work with &error_fatal.
 * ERRP_GUARD() lifts these restrictions.
 *
 * To use ERRP_GUARD(), add it right at the beginning of the function.
 * @errp can then be used without worrying about the argument being
 * NULL or &error_fatal.
 *
 * Using it when it's not needed is safe, but please avoid cluttering
 * the source with useless code.

> }
> if (rv > 7) {
>     /* handle the return value */
> }
>
> Would it not be better to handle the Error path and the normal return value separately?
>
> With this pattern in mind, this specific case would then not be surprising to anyone,
> and we would not have to start cooking up booleans to start passing to each function to say how errors should be treated,
> as nobody would expect the bool returned to mean anything related with the Error path.
>
> But this again would be rethinking the whole Error API.

Tall order.

> We should in any case do the right thing in this specific case, and this specific case (handling module load errors vs modules not installed),
> is not served well by the current code, whether it went through this attentive abstract principles scrutiny or not (seems not to me).

Loadable modules look grafted on, and the error handling seems quite
bad.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  6:07           ` Markus Armbruster
@ 2022-09-22  8:28             ` Daniel P. Berrangé
  2022-09-22  9:20               ` Claudio Fontana
  2022-09-22  9:38               ` Markus Armbruster
  0 siblings, 2 replies; 50+ messages in thread
From: Daniel P. Berrangé @ 2022-09-22  8:28 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Claudio Fontana, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
> Ease of use matters, too.  When sticking to the rule leads to awkward
> code, we should stop and think.  Should we deviate from the rule?  Or
> can we avoid that by tweaking the interface?
> 
> Philippe's proposed interface sticks to the rule.

The cost is that when you see a  function   dosomething(true|false) as
a reader you often have no idea what the effect of true vs false is
on the behaviour of that function. You resort to looking at the
API docs and/or code.  This is where C would really benefit from
having named parameters like as  dosomething(ignore_errors=true|false)
is totally obvious. Anyway, I digress.

> Another interface that does: return -1 for error, 0 for module not found
> (no error), and 1 for loaded.

IMHO this pattern is generally easier to understand when looking at
the callers, as the fatal error scenario is always clear.

That said I would suggest neither approach as the public facing
API. Rather stop trying to overload 3 states onto an error reporting
pattern that inherantly wants to be 2 states. Instead just have
distinct methods

  bool module_load_one(const char *prefix, const char *name, Error *errp)
  bool module_try_load_one(const char *prefix, const char *name, Error *errp)

other names are available for the second, eg module_load_one_optional()

Internally, both would call into a common helper following either
Philippe's idea, or the -1/0/1 int return value. Either is fine,
as they won't be exposed to any caller.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  8:28             ` Daniel P. Berrangé
@ 2022-09-22  9:20               ` Claudio Fontana
  2022-09-22  9:21                 ` Claudio Fontana
  2022-09-22  9:31                 ` Daniel P. Berrangé
  2022-09-22  9:38               ` Markus Armbruster
  1 sibling, 2 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22  9:20 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 10:28, Daniel P. Berrangé wrote:
> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>> Ease of use matters, too.  When sticking to the rule leads to awkward
>> code, we should stop and think.  Should we deviate from the rule?  Or
>> can we avoid that by tweaking the interface?
>>
>> Philippe's proposed interface sticks to the rule.
> 
> The cost is that when you see a  function   dosomething(true|false) as
> a reader you often have no idea what the effect of true vs false is
> on the behaviour of that function. You resort to looking at the
> API docs and/or code.  This is where C would really benefit from
> having named parameters like as  dosomething(ignore_errors=true|false)
> is totally obvious. Anyway, I digress.

The confusion here I think stems from the fact that not finding a module is _NORMAL BEHAVIOR_.

We can configure the qemu package once including configuration for all modules,
and then have the packager (or user) install the modules needed.

We should break away from the easy-to-lean-to mindset that

not finding a module => error path

Because this is not the case. This is what is being confused in this discussion.

Distinguishing the normal execution path from the error path (exception, in C++ parlance),

we are just hindered by the fact that C can only have one return value.


> 
>> Another interface that does: return -1 for error, 0 for module not found
>> (no error), and 1 for loaded.
> 
> IMHO this pattern is generally easier to understand when looking at
> the callers, as the fatal error scenario is always clear.
> 
> That said I would suggest neither approach as the public facing
> API. Rather stop trying to overload 3 states onto an error reporting
> pattern that inherantly wants to be 2 states. Instead just have
> distinct methods
> 
>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)


Here we are murking again the normal behavior and the error path.

What is the meaning of try? It's not as though we would error out inside the function module_load_one,
it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).

If this makes it clearer, lets keep the existing Error API pattern of using both the return value and the Error parameter for the error (exception),
and put the NORMAL BEHAVIOR error value in an argument using a pointer.

We do not pass a "bool ignore_errors" , because that is again confusing the fact that it is not module_load_one that handles the errors,
module_load_one should neither handle nor ignore the errors,
it should generate an error in the error case, and a return value in the normal case.

What about:

/*                                                                                                                                          
 * module_load_one: attempt to load a module from a set of directories                                                                      
 *                                                                                                                                          
 * directories searched are:                                                                                                                
 * - getenv("QEMU_MODULE_DIR")                                                                                                              
 * - get_relocated_path(CONFIG_QEMU_MODDIR);                                                                                                
 * - /var/run/qemu/${version_dir}                                                                                                           
 *                                                                                                                                          
 * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")                                                              
 * name:           name of the module                                                                                                       
 * errp:           (ERROR CONDITION): errp will be set on module load error.
 * found:          (output): set to true if a module with this name has been found, false if no such module is present.
 *                                                                                                                                          
 * Return value:   true if no error encountered (module loaded or not present).
 *                 false if an error has been generated, and errp will be set with the Error.
 */

Thanks,

C


> 
> other names are available for the second, eg module_load_one_optional()
> 
> Internally, both would call into a common helper following either
> Philippe's idea, or the -1/0/1 int return value. Either is fine,
> as they won't be exposed to any caller.
> 
> With regards,
> Daniel



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:20               ` Claudio Fontana
@ 2022-09-22  9:21                 ` Claudio Fontana
  2022-09-22  9:27                   ` Claudio Fontana
  2022-09-22  9:31                 ` Daniel P. Berrangé
  1 sibling, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22  9:21 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 11:20, Claudio Fontana wrote:
> On 9/22/22 10:28, Daniel P. Berrangé wrote:
>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>> can we avoid that by tweaking the interface?
>>>
>>> Philippe's proposed interface sticks to the rule.
>>
>> The cost is that when you see a  function   dosomething(true|false) as
>> a reader you often have no idea what the effect of true vs false is
>> on the behaviour of that function. You resort to looking at the
>> API docs and/or code.  This is where C would really benefit from
>> having named parameters like as  dosomething(ignore_errors=true|false)
>> is totally obvious. Anyway, I digress.
> 
> The confusion here I think stems from the fact that not finding a module is _NORMAL BEHAVIOR_.
> 
> We can configure the qemu package once including configuration for all modules,
> and then have the packager (or user) install the modules needed.
> 
> We should break away from the easy-to-lean-to mindset that
> 
> not finding a module => error path
> 
> Because this is not the case. This is what is being confused in this discussion.
> 
> Distinguishing the normal execution path from the error path (exception, in C++ parlance),
> 
> we are just hindered by the fact that C can only have one return value.
> 
> 
>>
>>> Another interface that does: return -1 for error, 0 for module not found
>>> (no error), and 1 for loaded.
>>
>> IMHO this pattern is generally easier to understand when looking at
>> the callers, as the fatal error scenario is always clear.
>>
>> That said I would suggest neither approach as the public facing
>> API. Rather stop trying to overload 3 states onto an error reporting
>> pattern that inherantly wants to be 2 states. Instead just have
>> distinct methods
>>
>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
> 
> 
> Here we are murking again the normal behavior and the error path.
> 
> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
> 
> If this makes it clearer, lets keep the existing Error API pattern of using both the return value and the Error parameter for the error (exception),
> and put the NORMAL BEHAVIOR error value in an argument using a pointer.
> 
> We do not pass a "bool ignore_errors" , because that is again confusing the fact that it is not module_load_one that handles the errors,
> module_load_one should neither handle nor ignore the errors,
> it should generate an error in the error case, and a return value in the normal case.
> 
> What about:
> 
> /*                                                                                                                                          
>  * module_load_one: attempt to load a module from a set of directories                                                                      
>  *                                                                                                                                          
>  * directories searched are:                                                                                                                
>  * - getenv("QEMU_MODULE_DIR")                                                                                                              
>  * - get_relocated_path(CONFIG_QEMU_MODDIR);                                                                                                
>  * - /var/run/qemu/${version_dir}                                                                                                           
>  *                                                                                                                                          
>  * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")                                                              
>  * name:           name of the module                                                                                                       
>  * errp:           (ERROR CONDITION): errp will be set on module load error.
>  * found:          (output): set to true if a module with this name has been found, false if no such module is present.
>  *                                                                                                                                          
>  * Return value:   true if no error encountered (module loaded or not present).
>  *                 false if an error has been generated, and errp will be set with the Error.
>  */
> 

Now with the missing prototype:

bool module_load_one(const char *prefix, const char *name, Error *errp, bool *found);

> Thanks,
> 
> C
> 
> 
>>
>> other names are available for the second, eg module_load_one_optional()
>>
>> Internally, both would call into a common helper following either
>> Philippe's idea, or the -1/0/1 int return value. Either is fine,
>> as they won't be exposed to any caller.
>>
>> With regards,
>> Daniel
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:21                 ` Claudio Fontana
@ 2022-09-22  9:27                   ` Claudio Fontana
  0 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22  9:27 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 11:21, Claudio Fontana wrote:
> On 9/22/22 11:20, Claudio Fontana wrote:
>> On 9/22/22 10:28, Daniel P. Berrangé wrote:
>>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>>> can we avoid that by tweaking the interface?
>>>>
>>>> Philippe's proposed interface sticks to the rule.
>>>
>>> The cost is that when you see a  function   dosomething(true|false) as
>>> a reader you often have no idea what the effect of true vs false is
>>> on the behaviour of that function. You resort to looking at the
>>> API docs and/or code.  This is where C would really benefit from
>>> having named parameters like as  dosomething(ignore_errors=true|false)
>>> is totally obvious. Anyway, I digress.
>>
>> The confusion here I think stems from the fact that not finding a module is _NORMAL BEHAVIOR_.
>>
>> We can configure the qemu package once including configuration for all modules,
>> and then have the packager (or user) install the modules needed.
>>
>> We should break away from the easy-to-lean-to mindset that
>>
>> not finding a module => error path
>>
>> Because this is not the case. This is what is being confused in this discussion.
>>
>> Distinguishing the normal execution path from the error path (exception, in C++ parlance),
>>
>> we are just hindered by the fact that C can only have one return value.
>>
>>
>>>
>>>> Another interface that does: return -1 for error, 0 for module not found
>>>> (no error), and 1 for loaded.
>>>
>>> IMHO this pattern is generally easier to understand when looking at
>>> the callers, as the fatal error scenario is always clear.
>>>
>>> That said I would suggest neither approach as the public facing
>>> API. Rather stop trying to overload 3 states onto an error reporting
>>> pattern that inherantly wants to be 2 states. Instead just have
>>> distinct methods
>>>
>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>
>>
>> Here we are murking again the normal behavior and the error path.
>>
>> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
>> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
>>
>> If this makes it clearer, lets keep the existing Error API pattern of using both the return value and the Error parameter for the error (exception),
>> and put the NORMAL BEHAVIOR error value in an argument using a pointer.
>>
>> We do not pass a "bool ignore_errors" , because that is again confusing the fact that it is not module_load_one that handles the errors,
>> module_load_one should neither handle nor ignore the errors,
>> it should generate an error in the error case, and a return value in the normal case.
>>
>> What about:
>>
>> /*                                                                                                                                          
>>  * module_load_one: attempt to load a module from a set of directories                                                                      
>>  *                                                                                                                                          
>>  * directories searched are:                                                                                                                
>>  * - getenv("QEMU_MODULE_DIR")                                                                                                              
>>  * - get_relocated_path(CONFIG_QEMU_MODDIR);                                                                                                
>>  * - /var/run/qemu/${version_dir}                                                                                                           
>>  *                                                                                                                                          
>>  * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")                                                              
>>  * name:           name of the module                                                                                                       
>>  * errp:           (ERROR CONDITION): errp will be set on module load error.
>>  * found:          (output): set to true if a module with this name has been found, false if no such module is present.
>>  *                                                                                                                                          
>>  * Return value:   true if no error encountered (module loaded or not present).
>>  *                 false if an error has been generated, and errp will be set with the Error.
>>  */
>>
> 
> Now with the missing prototype:
> 
> bool module_load_one(const char *prefix, const char *name, Error *errp, bool *found);

bool module_load_one(const char *prefix, const char *name, Error **errp, bool *found);



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:20               ` Claudio Fontana
  2022-09-22  9:21                 ` Claudio Fontana
@ 2022-09-22  9:31                 ` Daniel P. Berrangé
  2022-09-22  9:34                   ` Claudio Fontana
  1 sibling, 1 reply; 50+ messages in thread
From: Daniel P. Berrangé @ 2022-09-22  9:31 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
> On 9/22/22 10:28, Daniel P. Berrangé wrote:
> > 
> >> Another interface that does: return -1 for error, 0 for module not found
> >> (no error), and 1 for loaded.
> > 
> > IMHO this pattern is generally easier to understand when looking at
> > the callers, as the fatal error scenario is always clear.
> > 
> > That said I would suggest neither approach as the public facing
> > API. Rather stop trying to overload 3 states onto an error reporting
> > pattern that inherantly wants to be 2 states. Instead just have
> > distinct methods
> > 
> >   bool module_load_one(const char *prefix, const char *name, Error *errp)
> >   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
> 
> 
> Here we are murking again the normal behavior and the error path.
> 
> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).

I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
where the latter ignores the OOM error condition.

So in this case 'try' means try to load the module, but don't fail if
the module is missing on disk.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:31                 ` Daniel P. Berrangé
@ 2022-09-22  9:34                   ` Claudio Fontana
  2022-09-22 10:37                     ` Daniel P. Berrangé
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22  9:34 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 11:31, Daniel P. Berrangé wrote:
> On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
>> On 9/22/22 10:28, Daniel P. Berrangé wrote:
>>>
>>>> Another interface that does: return -1 for error, 0 for module not found
>>>> (no error), and 1 for loaded.
>>>
>>> IMHO this pattern is generally easier to understand when looking at
>>> the callers, as the fatal error scenario is always clear.
>>>
>>> That said I would suggest neither approach as the public facing
>>> API. Rather stop trying to overload 3 states onto an error reporting
>>> pattern that inherantly wants to be 2 states. Instead just have
>>> distinct methods
>>>
>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>
>>
>> Here we are murking again the normal behavior and the error path.
>>
>> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
>> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
> 
> I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
> where the latter ignores the OOM error condition.
> 
> So in this case 'try' means try to load the module, but don't fail if
> the module is missing on disk.

I understand what you mean, but this is wrong in this case.

We _do not fail_ in module_load_one, whether an error is present or not, whether a module is found or not.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  8:28             ` Daniel P. Berrangé
  2022-09-22  9:20               ` Claudio Fontana
@ 2022-09-22  9:38               ` Markus Armbruster
  2022-09-22  9:43                 ` Claudio Fontana
  1 sibling, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22  9:38 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Claudio Fontana, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>> Ease of use matters, too.  When sticking to the rule leads to awkward
>> code, we should stop and think.  Should we deviate from the rule?  Or
>> can we avoid that by tweaking the interface?
>> 
>> Philippe's proposed interface sticks to the rule.
>
> The cost is that when you see a  function   dosomething(true|false) as
> a reader you often have no idea what the effect of true vs false is
> on the behaviour of that function. You resort to looking at the
> API docs and/or code.  This is where C would really benefit from
> having named parameters like as  dosomething(ignore_errors=true|false)
> is totally obvious. Anyway, I digress.

Right.  Quoting myself: "If having to pass a flag turns out to to be a
legibility issue, we can have wrapper functions."  :)

>> Another interface that does: return -1 for error, 0 for module not found
>> (no error), and 1 for loaded.
>
> IMHO this pattern is generally easier to understand when looking at
> the callers, as the fatal error scenario is always clear.
>
> That said I would suggest neither approach as the public facing
> API. Rather stop trying to overload 3 states onto an error reporting
> pattern that inherantly wants to be 2 states. Instead just have
> distinct methods

Like these:

>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>
> other names are available for the second, eg module_load_one_optional()

module_load_one_if_there()?

By the way, the "one" in "module_load_one" & friends feels redundant.
When I see "module_load", I assume it loads one module.

> Internally, both would call into a common helper following either
> Philippe's idea, or the -1/0/1 int return value. Either is fine,
> as they won't be exposed to any caller.

Yup.



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:38               ` Markus Armbruster
@ 2022-09-22  9:43                 ` Claudio Fontana
  2022-09-22 12:42                   ` Markus Armbruster
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22  9:43 UTC (permalink / raw)
  To: Markus Armbruster, Daniel P. Berrangé
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 11:38, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>> can we avoid that by tweaking the interface?
>>>
>>> Philippe's proposed interface sticks to the rule.
>>
>> The cost is that when you see a  function   dosomething(true|false) as
>> a reader you often have no idea what the effect of true vs false is
>> on the behaviour of that function. You resort to looking at the
>> API docs and/or code.  This is where C would really benefit from
>> having named parameters like as  dosomething(ignore_errors=true|false)
>> is totally obvious. Anyway, I digress.
> 
> Right.  Quoting myself: "If having to pass a flag turns out to to be a
> legibility issue, we can have wrapper functions."  :)

There is something more fundamental that seems to be missed by most in this conversation,
ie the distinction between the normal execution path and the error path.


> 
>>> Another interface that does: return -1 for error, 0 for module not found
>>> (no error), and 1 for loaded.
>>
>> IMHO this pattern is generally easier to understand when looking at
>> the callers, as the fatal error scenario is always clear.
>>
>> That said I would suggest neither approach as the public facing
>> API. Rather stop trying to overload 3 states onto an error reporting
>> pattern that inherantly wants to be 2 states. Instead just have
>> distinct methods
> 
> Like these:
> 
>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>
>> other names are available for the second, eg module_load_one_optional()
> 
> module_load_one_if_there()?

And what do you do with the caller that needs to _know_ whether the module was "there" or not?

This is losing this information along the way, and the callers NEED it.

I really invite, with no offense intended, to read the hunks of my patch and the callers,
there are occasions where we need to _know_ if the module was there or not, and act depending on the context.

The information about "bool is_there" needs to be passed to the caller.

> 
> By the way, the "one" in "module_load_one" & friends feels redundant.
> When I see "module_load", I assume it loads one module.

there is a module_load_all.

> 
>> Internally, both would call into a common helper following either
>> Philippe's idea, or the -1/0/1 int return value. Either is fine,
>> as they won't be exposed to any caller.
> 
> Yup.
> 

C


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:34                   ` Claudio Fontana
@ 2022-09-22 10:37                     ` Daniel P. Berrangé
  2022-09-22 12:30                       ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel P. Berrangé @ 2022-09-22 10:37 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On Thu, Sep 22, 2022 at 11:34:22AM +0200, Claudio Fontana wrote:
> On 9/22/22 11:31, Daniel P. Berrangé wrote:
> > On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
> >> On 9/22/22 10:28, Daniel P. Berrangé wrote:
> >>>
> >>>> Another interface that does: return -1 for error, 0 for module not found
> >>>> (no error), and 1 for loaded.
> >>>
> >>> IMHO this pattern is generally easier to understand when looking at
> >>> the callers, as the fatal error scenario is always clear.
> >>>
> >>> That said I would suggest neither approach as the public facing
> >>> API. Rather stop trying to overload 3 states onto an error reporting
> >>> pattern that inherantly wants to be 2 states. Instead just have
> >>> distinct methods
> >>>
> >>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
> >>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
> >>
> >>
> >> Here we are murking again the normal behavior and the error path.
> >>
> >> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
> >> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
> > 
> > I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
> > where the latter ignores the OOM error condition.
> > 
> > So in this case 'try' means try to load the module, but don't fail if
> > the module is missing on disk.
> 
> I understand what you mean, but this is wrong in this case.
> 
> We _do not fail_ in module_load_one, whether an error is present
> or not, whether a module is found or not.

Looking at the callers though, AFAIK there are only two patterns
that we need. All callers should report a fatal error if the module
exists and loading it failed eg module is from mis-matched build.

Some callers also want a failure if the module doesn't exist on
disk (module_load_one can be made todo this), but most callers
are happy to carry on if the module doesn't exist (module_try_load_one
can simply return success status if it doesn't exist).

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 10:37                     ` Daniel P. Berrangé
@ 2022-09-22 12:30                       ` Claudio Fontana
  2022-09-22 12:33                         ` Daniel P. Berrangé
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 12:30 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 12:37, Daniel P. Berrangé wrote:
> On Thu, Sep 22, 2022 at 11:34:22AM +0200, Claudio Fontana wrote:
>> On 9/22/22 11:31, Daniel P. Berrangé wrote:
>>> On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
>>>> On 9/22/22 10:28, Daniel P. Berrangé wrote:
>>>>>
>>>>>> Another interface that does: return -1 for error, 0 for module not found
>>>>>> (no error), and 1 for loaded.
>>>>>
>>>>> IMHO this pattern is generally easier to understand when looking at
>>>>> the callers, as the fatal error scenario is always clear.
>>>>>
>>>>> That said I would suggest neither approach as the public facing
>>>>> API. Rather stop trying to overload 3 states onto an error reporting
>>>>> pattern that inherantly wants to be 2 states. Instead just have
>>>>> distinct methods
>>>>>
>>>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>>>
>>>>
>>>> Here we are murking again the normal behavior and the error path.
>>>>
>>>> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
>>>> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
>>>
>>> I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
>>> where the latter ignores the OOM error condition.
>>>
>>> So in this case 'try' means try to load the module, but don't fail if
>>> the module is missing on disk.
>>
>> I understand what you mean, but this is wrong in this case.
>>
>> We _do not fail_ in module_load_one, whether an error is present
>> or not, whether a module is found or not.
> 
> Looking at the callers though, AFAIK there are only two patterns
> that we need. All callers should report a fatal error if the module
> exists and loading it failed eg module is from mis-matched build.
> 
> Some callers also want a failure if the module doesn't exist on

Some callers also want to behave differently if the module is not installed.
It is not a "failure" in the same sense that what an Error returns.

For example, the block dmg module tries to also load other submodules for decompression.

If dmg does not find any such submodules, it will be able to support basic dmg just fine,
but won't be able to open compressed dmg.

I really think we should see "Error" in this context as an exception like in C++, with the caller stack deciding where to catch it and what to do with it.

The "is_found" or "is_loaded", or whatever we want to model as the real return value is the result of the normal module_load_one execution.


> disk (module_load_one can be made todo this), but most callers
> are happy to carry on if the module doesn't exist (module_try_load_one
> can simply return success status if it doesn't exist).

Yes, that is I think also a legitimate way to model the return value of the function, agreed.


> 
> With regards,
> Daniel


Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 12:30                       ` Claudio Fontana
@ 2022-09-22 12:33                         ` Daniel P. Berrangé
  2022-09-22 12:35                           ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Daniel P. Berrangé @ 2022-09-22 12:33 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On Thu, Sep 22, 2022 at 02:30:35PM +0200, Claudio Fontana wrote:
> On 9/22/22 12:37, Daniel P. Berrangé wrote:
> > On Thu, Sep 22, 2022 at 11:34:22AM +0200, Claudio Fontana wrote:
> >> On 9/22/22 11:31, Daniel P. Berrangé wrote:
> >>> On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
> >>>> On 9/22/22 10:28, Daniel P. Berrangé wrote:
> >>>>>
> >>>>>> Another interface that does: return -1 for error, 0 for module not found
> >>>>>> (no error), and 1 for loaded.
> >>>>>
> >>>>> IMHO this pattern is generally easier to understand when looking at
> >>>>> the callers, as the fatal error scenario is always clear.
> >>>>>
> >>>>> That said I would suggest neither approach as the public facing
> >>>>> API. Rather stop trying to overload 3 states onto an error reporting
> >>>>> pattern that inherantly wants to be 2 states. Instead just have
> >>>>> distinct methods
> >>>>>
> >>>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
> >>>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
> >>>>
> >>>>
> >>>> Here we are murking again the normal behavior and the error path.
> >>>>
> >>>> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
> >>>> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
> >>>
> >>> I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
> >>> where the latter ignores the OOM error condition.
> >>>
> >>> So in this case 'try' means try to load the module, but don't fail if
> >>> the module is missing on disk.
> >>
> >> I understand what you mean, but this is wrong in this case.
> >>
> >> We _do not fail_ in module_load_one, whether an error is present
> >> or not, whether a module is found or not.
> > 
> > Looking at the callers though, AFAIK there are only two patterns
> > that we need. All callers should report a fatal error if the module
> > exists and loading it failed eg module is from mis-matched build.
> > 
> > Some callers also want a failure if the module doesn't exist on
> 
> Some callers also want to behave differently if the module is not installed.
> It is not a "failure" in the same sense that what an Error returns.
> 
> For example, the block dmg module tries to also load other submodules for decompression.
> 
> If dmg does not find any such submodules, it will be able to support basic dmg just fine,
> but won't be able to open compressed dmg.

The dmg case looks like it works fine with module_try_load_one(). I
know your patch does a "warn" when the module is missing currently,
but IMHO that's the wrong place todo this. Any problems with use
of compressed dmg should be reported at the time the feature is
actually used, rather than when trying to load the module.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 12:33                         ` Daniel P. Berrangé
@ 2022-09-22 12:35                           ` Claudio Fontana
  0 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 12:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 14:33, Daniel P. Berrangé wrote:
> On Thu, Sep 22, 2022 at 02:30:35PM +0200, Claudio Fontana wrote:
>> On 9/22/22 12:37, Daniel P. Berrangé wrote:
>>> On Thu, Sep 22, 2022 at 11:34:22AM +0200, Claudio Fontana wrote:
>>>> On 9/22/22 11:31, Daniel P. Berrangé wrote:
>>>>> On Thu, Sep 22, 2022 at 11:20:07AM +0200, Claudio Fontana wrote:
>>>>>> On 9/22/22 10:28, Daniel P. Berrangé wrote:
>>>>>>>
>>>>>>>> Another interface that does: return -1 for error, 0 for module not found
>>>>>>>> (no error), and 1 for loaded.
>>>>>>>
>>>>>>> IMHO this pattern is generally easier to understand when looking at
>>>>>>> the callers, as the fatal error scenario is always clear.
>>>>>>>
>>>>>>> That said I would suggest neither approach as the public facing
>>>>>>> API. Rather stop trying to overload 3 states onto an error reporting
>>>>>>> pattern that inherantly wants to be 2 states. Instead just have
>>>>>>> distinct methods
>>>>>>>
>>>>>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>>>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>>>>>
>>>>>>
>>>>>> Here we are murking again the normal behavior and the error path.
>>>>>>
>>>>>> What is the meaning of try? It's not as though we would error out inside the function module_load_one,
>>>>>> it's the _caller_ that needs to decide how to treat a return value of found/not found, and the exception (Error).
>>>>>
>>>>> I suggested "try" as in the  g_malloc vs g_try_malloc API naming pattern,
>>>>> where the latter ignores the OOM error condition.
>>>>>
>>>>> So in this case 'try' means try to load the module, but don't fail if
>>>>> the module is missing on disk.
>>>>
>>>> I understand what you mean, but this is wrong in this case.
>>>>
>>>> We _do not fail_ in module_load_one, whether an error is present
>>>> or not, whether a module is found or not.
>>>
>>> Looking at the callers though, AFAIK there are only two patterns
>>> that we need. All callers should report a fatal error if the module
>>> exists and loading it failed eg module is from mis-matched build.
>>>
>>> Some callers also want a failure if the module doesn't exist on
>>
>> Some callers also want to behave differently if the module is not installed.
>> It is not a "failure" in the same sense that what an Error returns.
>>
>> For example, the block dmg module tries to also load other submodules for decompression.
>>
>> If dmg does not find any such submodules, it will be able to support basic dmg just fine,
>> but won't be able to open compressed dmg.
> 
> The dmg case looks like it works fine with module_try_load_one(). I
> know your patch does a "warn" when the module is missing currently,
> but IMHO that's the wrong place todo this. Any problems with use
> of compressed dmg should be reported at the time the feature is
> actually used, rather than when trying to load the module.

Right, this was the input from Kevin as well, I planned to change that.

We should warn only when we actually encounter/decode a compressed dmg.

Thanks,

Claudio


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22  9:43                 ` Claudio Fontana
@ 2022-09-22 12:42                   ` Markus Armbruster
  2022-09-22 12:45                     ` Claudio Fontana
  2022-09-22 14:54                     ` Kevin Wolf
  0 siblings, 2 replies; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22 12:42 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

Claudio Fontana <cfontana@suse.de> writes:

> On 9/22/22 11:38, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>>> can we avoid that by tweaking the interface?
>>>>
>>>> Philippe's proposed interface sticks to the rule.
>>>
>>> The cost is that when you see a  function   dosomething(true|false) as
>>> a reader you often have no idea what the effect of true vs false is
>>> on the behaviour of that function. You resort to looking at the
>>> API docs and/or code.  This is where C would really benefit from
>>> having named parameters like as  dosomething(ignore_errors=true|false)
>>> is totally obvious. Anyway, I digress.
>> 
>> Right.  Quoting myself: "If having to pass a flag turns out to to be a
>> legibility issue, we can have wrapper functions."  :)
>
> There is something more fundamental that seems to be missed by most in this conversation,
> ie the distinction between the normal execution path and the error path.
>
>
>> 
>>>> Another interface that does: return -1 for error, 0 for module not found
>>>> (no error), and 1 for loaded.
>>>
>>> IMHO this pattern is generally easier to understand when looking at
>>> the callers, as the fatal error scenario is always clear.
>>>
>>> That said I would suggest neither approach as the public facing
>>> API. Rather stop trying to overload 3 states onto an error reporting
>>> pattern that inherantly wants to be 2 states. Instead just have
>>> distinct methods
>> 
>> Like these:
>> 
>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>>
>>> other names are available for the second, eg module_load_one_optional()
>> 
>> module_load_one_if_there()?
>
> And what do you do with the caller that needs to _know_ whether the module was "there" or not?
>
> This is losing this information along the way, and the callers NEED it.
>
> I really invite, with no offense intended,

None taken!

>                                            to read the hunks of my patch and the callers,
> there are occasions where we need to _know_ if the module was there or not, and act depending on the context.
>
> The information about "bool is_there" needs to be passed to the caller.

If you have callers that need to distinguish between not found, found
but bad, found and good, then return three distinct values.

I proposed to return -1 for found but bad (error), 0 for not found (no
error), and 1 for loaded (no error).

>> By the way, the "one" in "module_load_one" & friends feels redundant.
>> When I see "module_load", I assume it loads one module.
>
> there is a module_load_all.

Libc has fcloseall() and fclose().  Clear enough, isn't it?

[...]



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 12:42                   ` Markus Armbruster
@ 2022-09-22 12:45                     ` Claudio Fontana
  2022-09-22 13:20                       ` Markus Armbruster
  2022-09-22 14:54                     ` Kevin Wolf
  1 sibling, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 12:45 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 14:42, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> On 9/22/22 11:38, Markus Armbruster wrote:
>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>
>>>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>>>> can we avoid that by tweaking the interface?
>>>>>
>>>>> Philippe's proposed interface sticks to the rule.
>>>>
>>>> The cost is that when you see a  function   dosomething(true|false) as
>>>> a reader you often have no idea what the effect of true vs false is
>>>> on the behaviour of that function. You resort to looking at the
>>>> API docs and/or code.  This is where C would really benefit from
>>>> having named parameters like as  dosomething(ignore_errors=true|false)
>>>> is totally obvious. Anyway, I digress.
>>>
>>> Right.  Quoting myself: "If having to pass a flag turns out to to be a
>>> legibility issue, we can have wrapper functions."  :)
>>
>> There is something more fundamental that seems to be missed by most in this conversation,
>> ie the distinction between the normal execution path and the error path.
>>
>>
>>>
>>>>> Another interface that does: return -1 for error, 0 for module not found
>>>>> (no error), and 1 for loaded.
>>>>
>>>> IMHO this pattern is generally easier to understand when looking at
>>>> the callers, as the fatal error scenario is always clear.
>>>>
>>>> That said I would suggest neither approach as the public facing
>>>> API. Rather stop trying to overload 3 states onto an error reporting
>>>> pattern that inherantly wants to be 2 states. Instead just have
>>>> distinct methods
>>>
>>> Like these:
>>>
>>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>>>
>>>> other names are available for the second, eg module_load_one_optional()
>>>
>>> module_load_one_if_there()?
>>
>> And what do you do with the caller that needs to _know_ whether the module was "there" or not?
>>
>> This is losing this information along the way, and the callers NEED it.
>>
>> I really invite, with no offense intended,
> 
> None taken!
> 
>>                                            to read the hunks of my patch and the callers,
>> there are occasions where we need to _know_ if the module was there or not, and act depending on the context.
>>
>> The information about "bool is_there" needs to be passed to the caller.
> 
> If you have callers that need to distinguish between not found, found
> but bad, found and good, then return three distinct values.
> 
> I proposed to return -1 for found but bad (error), 0 for not found (no
> error), and 1 for loaded (no error).

That is fine too.

I think it would be better to completely make the return value separate from the Error,
and really treat Error as an exception and not mix it up with the regular execution,

but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.

> 
>>> By the way, the "one" in "module_load_one" & friends feels redundant.
>>> When I see "module_load", I assume it loads one module.
>>
>> there is a module_load_all.
> 
> Libc has fcloseall() and fclose().  Clear enough, isn't it?
> 
> [...]
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 12:45                     ` Claudio Fontana
@ 2022-09-22 13:20                       ` Markus Armbruster
  2022-09-22 13:33                         ` Claudio Fontana
  2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
  0 siblings, 2 replies; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22 13:20 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

Claudio Fontana <cfontana@suse.de> writes:

[...]

> I think it would be better to completely make the return value separate from the Error,
> and really treat Error as an exception and not mix it up with the regular execution,
>
> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.

It's a matter of language pragmatics.  In Java, you throw an exception
on error.  In C, you return an error value.

Trying to emulate exceptions in C might be even more unadvisable than
trying to avoid them in Java.  Best to work with the language, not
against it.

Trouble is the error values we can conveniently return in C can't convey
enough information.  So we use Error for that.  Just like GLib uses
GError.

More modern languages do "return error value" much better than C can.  C
is what it is.

We could certainly argue how to do better than we do now in QEMU's C
code.  However, the Error API is used all over the place, which makes
changing it expensive.  "Rethinking the whole Error API" (your words)
would have to generate benefits worth this expense.  Which seems
unlikely.

[...]



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:20                       ` Markus Armbruster
@ 2022-09-22 13:33                         ` Claudio Fontana
  2022-09-22 14:36                           ` Markus Armbruster
  2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
  1 sibling, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 13:33 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 15:20, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
> [...]
> 
>> I think it would be better to completely make the return value separate from the Error,
>> and really treat Error as an exception and not mix it up with the regular execution,
>>
>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
> 
> It's a matter of language pragmatics.  In Java, you throw an exception
> on error.  In C, you return an error value.
> 
> Trying to emulate exceptions in C might be even more unadvisable than
> trying to avoid them in Java.  Best to work with the language, not
> against it.
> 
> Trouble is the error values we can conveniently return in C can't convey
> enough information.  So we use Error for that.  Just like GLib uses

Right, we use Error for that and that's fine, but we should use it _only Error_ for that.

Ie map the Exceptions directly to Error.

So:

try {

  rv = function_call(...);

  use_return_value(rv);

} catch (Exception e) {

  /* handle exceptional case */

}

becomes

rv = function_call(..., Error **errp);

if (errp) {
  /* handle exceptional case */
}

use_return_value(rv);


Instead we mix up the Exception code path and the regular code path, by having rv carry a mix of the Exception and regular return value,
and this creates problems and confusion.

If we put a hard line between the two, I think more clarity emerges.


> GError.
> 
> More modern languages do "return error value" much better than C can.  C
> is what it is.
> 
> We could certainly argue how to do better than we do now in QEMU's C
> code.  However, the Error API is used all over the place, which makes
> changing it expensive.  "Rethinking the whole Error API" (your words)
> would have to generate benefits worth this expense.  Which seems
> unlikely.
> 
> [...]
> 

This is all fine, the problem is how we remodel this in C.

This is how I see the next steps to clarify my position:

short term:

- keep the existing API with the existing assumptions, use a separate argument to carry the pointer to the actual return value (where the function return value as provided by the language is used to return if an exception was generated or not, as we assume today).

- long term (maybe): fix the existing API by detaching completely the return value from the exception.


Wdyt?

C



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:20                       ` Markus Armbruster
  2022-09-22 13:33                         ` Claudio Fontana
@ 2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
  2022-09-22 13:42                           ` Claudio Fontana
  2022-09-22 13:44                           ` Daniel P. Berrangé
  1 sibling, 2 replies; 50+ messages in thread
From: Philippe Mathieu-Daudé via @ 2022-09-22 13:34 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Claudio Fontana, Daniel P. Berrangé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf,
	qemu-devel@nongnu.org Developers, dinechin, Gerd Hoffmann,
	Marc-André Lureau

On Thu, Sep 22, 2022 at 3:20 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Claudio Fontana <cfontana@suse.de> writes:
>
> [...]
>
> > I think it would be better to completely make the return value separate from the Error,
> > and really treat Error as an exception and not mix it up with the regular execution,
> >
> > but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>
> It's a matter of language pragmatics.  In Java, you throw an exception
> on error.  In C, you return an error value.
>
> Trying to emulate exceptions in C might be even more unadvisable than
> trying to avoid them in Java.  Best to work with the language, not
> against it.
>
> Trouble is the error values we can conveniently return in C can't convey
> enough information.  So we use Error for that.  Just like GLib uses
> GError.
>
> More modern languages do "return error value" much better than C can.  C
> is what it is.
>
> We could certainly argue how to do better than we do now in QEMU's C
> code.  However, the Error API is used all over the place, which makes
> changing it expensive.  "Rethinking the whole Error API" (your words)
> would have to generate benefits worth this expense.  Which seems
> unlikely.

QEMU Error* and GLib GError are designed to report recoverable runtime *errors*.

There is or is no error. A boolean return value seems appropriate.

We are bikeshedding about the API because we are abusing it in a non-error case.

If we want to try to load an optional module, the Error* argument is
not the proper way to return the information regarding why we couldn't
load.

In both cases we want to know if the module was loaded. If this is an
optional module, we don't care why it couldn't be loaded.

So trying to make everybody happy:

  // Return true if the module could be loaded, otherwise return false
and errp contains the error.
 bool module_load_one(const char *prefix, const char *name, Error *errp);

  // Return true if the module could be loaded, false otherwise.
  bool module_try_load_one(const char *prefix, const char *name);


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
@ 2022-09-22 13:42                           ` Claudio Fontana
  2022-09-22 13:44                           ` Daniel P. Berrangé
  1 sibling, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 13:42 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Markus Armbruster
  Cc: Daniel P. Berrangé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf,
	qemu-devel@nongnu.org Developers, dinechin, Gerd Hoffmann,
	Marc-André Lureau

On 9/22/22 15:34, Philippe Mathieu-Daudé wrote:
> On Thu, Sep 22, 2022 at 3:20 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Claudio Fontana <cfontana@suse.de> writes:
>>
>> [...]
>>
>>> I think it would be better to completely make the return value separate from the Error,
>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>
>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>>
>> It's a matter of language pragmatics.  In Java, you throw an exception
>> on error.  In C, you return an error value.
>>
>> Trying to emulate exceptions in C might be even more unadvisable than
>> trying to avoid them in Java.  Best to work with the language, not
>> against it.
>>
>> Trouble is the error values we can conveniently return in C can't convey
>> enough information.  So we use Error for that.  Just like GLib uses
>> GError.
>>
>> More modern languages do "return error value" much better than C can.  C
>> is what it is.
>>
>> We could certainly argue how to do better than we do now in QEMU's C
>> code.  However, the Error API is used all over the place, which makes
>> changing it expensive.  "Rethinking the whole Error API" (your words)
>> would have to generate benefits worth this expense.  Which seems
>> unlikely.
> 
> QEMU Error* and GLib GError are designed to report recoverable runtime *errors*.
> 
> There is or is no error. A boolean return value seems appropriate.
> 
> We are bikeshedding about the API because we are abusing it in a non-error case.

Agreed.

> 
> If we want to try to load an optional module, the Error* argument is
> not the proper way to return the information regarding why we couldn't
> load.

Mostly agree. If an _actual_ terrible Error happens when loading an optional module (I/O error, multiple module exist for the same QOM type, module architecture incompatible etc, module directory permissions are wrong, etc) I'd argue we have to provide this information too,

ie in the case of a catastrophic error when loading the optional module, we should at least provide the information to the caller and let it decide what to do with it,
maybe even just warn and continue, maybe do nothing, caller choice.

> 
> In both cases we want to know if the module was loaded. If this is an
> optional module, we don't care why it couldn't be loaded.

We don't care if the module is not present, but the caller _may_ care to at least report if the module is present (and thus the user expects the functionality to be available),
but it could not be loaded because of a catastrophic I/O error. I am deliberately exaggerating for the benefit of clarity.

> 
> So trying to make everybody happy:
> 
>   // Return true if the module could be loaded, otherwise return false
> and errp contains the error.
>  bool module_load_one(const char *prefix, const char *name, Error *errp);
> 
>   // Return true if the module could be loaded, false otherwise.
>   bool module_try_load_one(const char *prefix, const char *name);



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
  2022-09-22 13:42                           ` Claudio Fontana
@ 2022-09-22 13:44                           ` Daniel P. Berrangé
  2022-09-22 14:01                             ` Claudio Fontana
  1 sibling, 1 reply; 50+ messages in thread
From: Daniel P. Berrangé @ 2022-09-22 13:44 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Markus Armbruster, Claudio Fontana, Paolo Bonzini,
	Richard Henderson, Kevin Wolf, qemu-devel@nongnu.org Developers,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On Thu, Sep 22, 2022 at 03:34:42PM +0200, Philippe Mathieu-Daudé wrote:
> On Thu, Sep 22, 2022 at 3:20 PM Markus Armbruster <armbru@redhat.com> wrote:
> >
> > Claudio Fontana <cfontana@suse.de> writes:
> >
> > [...]
> >
> > > I think it would be better to completely make the return value separate from the Error,
> > > and really treat Error as an exception and not mix it up with the regular execution,
> > >
> > > but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
> >
> > It's a matter of language pragmatics.  In Java, you throw an exception
> > on error.  In C, you return an error value.
> >
> > Trying to emulate exceptions in C might be even more unadvisable than
> > trying to avoid them in Java.  Best to work with the language, not
> > against it.
> >
> > Trouble is the error values we can conveniently return in C can't convey
> > enough information.  So we use Error for that.  Just like GLib uses
> > GError.
> >
> > More modern languages do "return error value" much better than C can.  C
> > is what it is.
> >
> > We could certainly argue how to do better than we do now in QEMU's C
> > code.  However, the Error API is used all over the place, which makes
> > changing it expensive.  "Rethinking the whole Error API" (your words)
> > would have to generate benefits worth this expense.  Which seems
> > unlikely.
> 
> QEMU Error* and GLib GError are designed to report recoverable runtime *errors*.
> 
> There is or is no error. A boolean return value seems appropriate.
> 
> We are bikeshedding about the API because we are abusing it in a non-error case.
> 
> If we want to try to load an optional module, the Error* argument is
> not the proper way to return the information regarding why we couldn't
> load.
> 
> In both cases we want to know if the module was loaded. If this is an
> optional module, we don't care why it couldn't be loaded.

No, that's wrong. If the module exists on disk but is incompatible
with the current QEMU, then we need to be reporting that as an
error to the caller, so they can propagate this problem back up
the stack to the QMP command or CLI arg that started the code path.

We don't need to be using the return status to tell the caller if
the module was loaded or not. We only should be telling thue caller
is there was a reportable error or not.

Consider, there is a call to load block drivers. We don't need
to know whether each block driver was loaded or not. eg if the
'curl' code is a module and we fail to load it, then when code
tries to create a curl based block device the missing curl
support will be reported at that time.  The callers that load
modules should only need to express whether their load attempt
is mandatory or optional, in terms of the module existing on
disk.  If the modules exists on disk, any further errors
encountered when loading it should be propagated.



> So trying to make everybody happy:
> 
>   // Return true if the module could be loaded, otherwise return false
> and errp contains the error.
>  bool module_load_one(const char *prefix, const char *name, Error *errp);
> 
>   // Return true if the module could be loaded, false otherwise.
>   bool module_try_load_one(const char *prefix, const char *name);

Nope, this latter doesn't work as it throws away important errors
when loading an incompatible/broken module.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:44                           ` Daniel P. Berrangé
@ 2022-09-22 14:01                             ` Claudio Fontana
  0 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 14:01 UTC (permalink / raw)
  To: Daniel P. Berrangé, Philippe Mathieu-Daudé
  Cc: Markus Armbruster, Paolo Bonzini, Richard Henderson, Kevin Wolf,
	qemu-devel@nongnu.org Developers, dinechin, Gerd Hoffmann,
	Marc-André Lureau

On 9/22/22 15:44, Daniel P. Berrangé wrote:
> On Thu, Sep 22, 2022 at 03:34:42PM +0200, Philippe Mathieu-Daudé wrote:
>> On Thu, Sep 22, 2022 at 3:20 PM Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>> [...]
>>>
>>>> I think it would be better to completely make the return value separate from the Error,
>>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>>
>>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>>>
>>> It's a matter of language pragmatics.  In Java, you throw an exception
>>> on error.  In C, you return an error value.
>>>
>>> Trying to emulate exceptions in C might be even more unadvisable than
>>> trying to avoid them in Java.  Best to work with the language, not
>>> against it.
>>>
>>> Trouble is the error values we can conveniently return in C can't convey
>>> enough information.  So we use Error for that.  Just like GLib uses
>>> GError.
>>>
>>> More modern languages do "return error value" much better than C can.  C
>>> is what it is.
>>>
>>> We could certainly argue how to do better than we do now in QEMU's C
>>> code.  However, the Error API is used all over the place, which makes
>>> changing it expensive.  "Rethinking the whole Error API" (your words)
>>> would have to generate benefits worth this expense.  Which seems
>>> unlikely.
>>
>> QEMU Error* and GLib GError are designed to report recoverable runtime *errors*.
>>
>> There is or is no error. A boolean return value seems appropriate.
>>
>> We are bikeshedding about the API because we are abusing it in a non-error case.
>>
>> If we want to try to load an optional module, the Error* argument is
>> not the proper way to return the information regarding why we couldn't
>> load.
>>
>> In both cases we want to know if the module was loaded. If this is an
>> optional module, we don't care why it couldn't be loaded.
> 
> No, that's wrong. If the module exists on disk but is incompatible
> with the current QEMU, then we need to be reporting that as an
> error to the caller, so they can propagate this problem back up
> the stack to the QMP command or CLI arg that started the code path.

Agree.

> 
> We don't need to be using the return status to tell the caller if
> the module was loaded or not. We only should be telling thue caller
> is there was a reportable error or not.
> 
> Consider, there is a call to load block drivers. We don't need
> to know whether each block driver was loaded or not. eg if the
> 'curl' code is a module and we fail to load it, then when code
> tries to create a curl based block device the missing curl
> support will be reported at that time.  The callers that load
> modules should only need to express whether their load attempt
> is mandatory or optional, in terms of the module existing on
> disk.  If the modules exists on disk, any further errors
> encountered when loading it should be propagated.
> 
> 
> 
>> So trying to make everybody happy:
>>
>>   // Return true if the module could be loaded, otherwise return false
>> and errp contains the error.
>>  bool module_load_one(const char *prefix, const char *name, Error *errp);
>>
>>   // Return true if the module could be loaded, false otherwise.
>>   bool module_try_load_one(const char *prefix, const char *name);
> 
> Nope, this latter doesn't work as it throws away important errors
> when loading an incompatible/broken module.
> 

Agree.

Claudio



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 13:33                         ` Claudio Fontana
@ 2022-09-22 14:36                           ` Markus Armbruster
  2022-09-22 15:22                             ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22 14:36 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

Claudio Fontana <cfontana@suse.de> writes:

> On 9/22/22 15:20, Markus Armbruster wrote:
>> Claudio Fontana <cfontana@suse.de> writes:
>> 
>> [...]
>> 
>>> I think it would be better to completely make the return value separate from the Error,
>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>
>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>> 
>> It's a matter of language pragmatics.  In Java, you throw an exception
>> on error.  In C, you return an error value.
>> 
>> Trying to emulate exceptions in C might be even more unadvisable than
>> trying to avoid them in Java.  Best to work with the language, not
>> against it.
>> 
>> Trouble is the error values we can conveniently return in C can't convey
>> enough information.  So we use Error for that.  Just like GLib uses
>
> Right, we use Error for that and that's fine, but we should use it _only Error_ for that.
>
> Ie map the Exceptions directly to Error.
>
> So:
>
> try {
>
>   rv = function_call(...);
>
>   use_return_value(rv);
>
> } catch (Exception e) {
>
>   /* handle exceptional case */
>
> }
>
> becomes
>
> rv = function_call(..., Error **errp);
>
> if (errp) {
>   /* handle exceptional case */
> }
>
> use_return_value(rv);

This is simply not the intended use of Error.

Also, "trying to emulate exceptions in C might be even more unadvisable
than trying to avoid them in Java."

> Instead we mix up the Exception code path and the regular code path, by having rv carry a mix of the Exception and regular return value,
> and this creates problems and confusion.

"In C, you return an error value."

> If we put a hard line between the two, I think more clarity emerges.

Maybe, but consistency matters.  Clarity doesn't emerge in isolation.  
Deviating from prevailing usage tends to confuse.

>> GError.
>> 
>> More modern languages do "return error value" much better than C can.  C
>> is what it is.
>> 
>> We could certainly argue how to do better than we do now in QEMU's C
>> code.  However, the Error API is used all over the place, which makes
>> changing it expensive.  "Rethinking the whole Error API" (your words)
>> would have to generate benefits worth this expense.  Which seems
>> unlikely.
>> 
>> [...]
>> 
>
> This is all fine, the problem is how we remodel this in C.
>
> This is how I see the next steps to clarify my position:
>
> short term:
>
> - keep the existing API with the existing assumptions, use a separate argument to carry the pointer to the actual return value (where the function return value as provided by the language is used to return if an exception was generated or not, as we assume today).

We don't actually need separate values for "actual return value" and
"success vs. failure" here.  We can easily encode them in a single
return value.  This is *common* in C, for better or worse.

For instance, read() returns -1 on error, 0 on EOF (which is not an
error), and a positive value on actual read.  On error, @errno is set,
which is a bit awkward (we wouldn't design that way today, I hope).

The interface I proposed is similar: return -1 on error, 0 on not found
(which is not an error), and 1 on successful load.  On error, an Error
is set via @errp.  With a name that makes it obvious that "not found" is
not an error, there's nothing that should surprise someone
passingly familiar with QEMU code.

> - long term (maybe): fix the existing API by detaching completely the return value from the exception.

As I wrote, this seems unlikely to be worth its considerable expense.

> Wdyt?
>
> C



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 12:42                   ` Markus Armbruster
  2022-09-22 12:45                     ` Claudio Fontana
@ 2022-09-22 14:54                     ` Kevin Wolf
  2022-09-22 15:08                       ` Claudio Fontana
  2022-09-22 15:27                       ` Markus Armbruster
  1 sibling, 2 replies; 50+ messages in thread
From: Kevin Wolf @ 2022-09-22 14:54 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Claudio Fontana, Daniel P. Berrangé,
	Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
> Claudio Fontana <cfontana@suse.de> writes:
> 
> > On 9/22/22 11:38, Markus Armbruster wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> 
> >>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
> >>>> Ease of use matters, too.  When sticking to the rule leads to awkward
> >>>> code, we should stop and think.  Should we deviate from the rule?  Or
> >>>> can we avoid that by tweaking the interface?
> >>>>
> >>>> Philippe's proposed interface sticks to the rule.
> >>>
> >>> The cost is that when you see a  function   dosomething(true|false) as
> >>> a reader you often have no idea what the effect of true vs false is
> >>> on the behaviour of that function. You resort to looking at the
> >>> API docs and/or code.  This is where C would really benefit from
> >>> having named parameters like as  dosomething(ignore_errors=true|false)
> >>> is totally obvious. Anyway, I digress.
> >> 
> >> Right.  Quoting myself: "If having to pass a flag turns out to to be a
> >> legibility issue, we can have wrapper functions."  :)
> >
> > There is something more fundamental that seems to be missed by most in this conversation,
> > ie the distinction between the normal execution path and the error path.
> >
> >
> >> 
> >>>> Another interface that does: return -1 for error, 0 for module not found
> >>>> (no error), and 1 for loaded.
> >>>
> >>> IMHO this pattern is generally easier to understand when looking at
> >>> the callers, as the fatal error scenario is always clear.
> >>>
> >>> That said I would suggest neither approach as the public facing
> >>> API. Rather stop trying to overload 3 states onto an error reporting
> >>> pattern that inherantly wants to be 2 states. Instead just have
> >>> distinct methods
> >> 
> >> Like these:
> >> 
> >>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
> >>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
> >>>
> >>> other names are available for the second, eg module_load_one_optional()
> >> 
> >> module_load_one_if_there()?
> >
> > And what do you do with the caller that needs to _know_ whether the module was "there" or not?
> >
> > This is losing this information along the way, and the callers NEED it.
> >
> > I really invite, with no offense intended,
> 
> None taken!
> 
> >                                            to read the hunks of my patch and the callers,
> > there are occasions where we need to _know_ if the module was there or not, and act depending on the context.
> >
> > The information about "bool is_there" needs to be passed to the caller.
> 
> If you have callers that need to distinguish between not found, found
> but bad, found and good, then return three distinct values.
> 
> I proposed to return -1 for found but bad (error), 0 for not found (no
> error), and 1 for loaded (no error).

My intuition with this one was that "not found" is still an error case,
but it's an error case that we need to distinguish from other error
cases.

Is this one of the rare use cases for error classes?

Kevin



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 14:54                     ` Kevin Wolf
@ 2022-09-22 15:08                       ` Claudio Fontana
  2022-09-22 15:27                       ` Markus Armbruster
  1 sibling, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 15:08 UTC (permalink / raw)
  To: Kevin Wolf, Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

On 9/22/22 16:54, Kevin Wolf wrote:
> Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
>> Claudio Fontana <cfontana@suse.de> writes:
>>
>>> On 9/22/22 11:38, Markus Armbruster wrote:
>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>>
>>>>> On Thu, Sep 22, 2022 at 08:07:43AM +0200, Markus Armbruster wrote:
>>>>>> Ease of use matters, too.  When sticking to the rule leads to awkward
>>>>>> code, we should stop and think.  Should we deviate from the rule?  Or
>>>>>> can we avoid that by tweaking the interface?
>>>>>>
>>>>>> Philippe's proposed interface sticks to the rule.
>>>>>
>>>>> The cost is that when you see a  function   dosomething(true|false) as
>>>>> a reader you often have no idea what the effect of true vs false is
>>>>> on the behaviour of that function. You resort to looking at the
>>>>> API docs and/or code.  This is where C would really benefit from
>>>>> having named parameters like as  dosomething(ignore_errors=true|false)
>>>>> is totally obvious. Anyway, I digress.
>>>>
>>>> Right.  Quoting myself: "If having to pass a flag turns out to to be a
>>>> legibility issue, we can have wrapper functions."  :)
>>>
>>> There is something more fundamental that seems to be missed by most in this conversation,
>>> ie the distinction between the normal execution path and the error path.
>>>
>>>
>>>>
>>>>>> Another interface that does: return -1 for error, 0 for module not found
>>>>>> (no error), and 1 for loaded.
>>>>>
>>>>> IMHO this pattern is generally easier to understand when looking at
>>>>> the callers, as the fatal error scenario is always clear.
>>>>>
>>>>> That said I would suggest neither approach as the public facing
>>>>> API. Rather stop trying to overload 3 states onto an error reporting
>>>>> pattern that inherantly wants to be 2 states. Instead just have
>>>>> distinct methods
>>>>
>>>> Like these:
>>>>
>>>>>   bool module_load_one(const char *prefix, const char *name, Error *errp)
>>>>>   bool module_try_load_one(const char *prefix, const char *name, Error *errp)
>>>>>
>>>>> other names are available for the second, eg module_load_one_optional()
>>>>
>>>> module_load_one_if_there()?
>>>
>>> And what do you do with the caller that needs to _know_ whether the module was "there" or not?
>>>
>>> This is losing this information along the way, and the callers NEED it.
>>>
>>> I really invite, with no offense intended,
>>
>> None taken!
>>
>>>                                            to read the hunks of my patch and the callers,
>>> there are occasions where we need to _know_ if the module was there or not, and act depending on the context.
>>>
>>> The information about "bool is_there" needs to be passed to the caller.
>>
>> If you have callers that need to distinguish between not found, found
>> but bad, found and good, then return three distinct values.
>>
>> I proposed to return -1 for found but bad (error), 0 for not found (no
>> error), and 1 for loaded (no error).
> 
> My intuition with this one was that "not found" is still an error case,

You may consider it an "error" but it isn't in the "Exception" sense as we were discussing before.

The Error API seems to be mapping to the "Exception" concept, and in that sense module not found is not an Exception at all.

Your intuition comes I think from a historical look at the QEMU codebase,
but nowadays as QEMU gets more and more modularized, and packaged differently depending
on distributions, products, and user choices,

the "not found" is genuinely not in general an error case at all.

> but it's an error case that we need to distinguish from other error
> cases.
> 
> Is this one of the rare use cases for error classes?
> 
> Kevin
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 14:36                           ` Markus Armbruster
@ 2022-09-22 15:22                             ` Claudio Fontana
  2022-09-23  5:31                               ` Markus Armbruster
  0 siblings, 1 reply; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 15:22 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/22/22 16:36, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> On 9/22/22 15:20, Markus Armbruster wrote:
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>> [...]
>>>
>>>> I think it would be better to completely make the return value separate from the Error,
>>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>>
>>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>>>
>>> It's a matter of language pragmatics.  In Java, you throw an exception
>>> on error.  In C, you return an error value.
>>>
>>> Trying to emulate exceptions in C might be even more unadvisable than
>>> trying to avoid them in Java.  Best to work with the language, not
>>> against it.
>>>
>>> Trouble is the error values we can conveniently return in C can't convey
>>> enough information.  So we use Error for that.  Just like GLib uses
>>
>> Right, we use Error for that and that's fine, but we should use it _only Error_ for that.
>>
>> Ie map the Exceptions directly to Error.
>>
>> So:
>>
>> try {
>>
>>   rv = function_call(...);
>>
>>   use_return_value(rv);
>>
>> } catch (Exception e) {
>>
>>   /* handle exceptional case */
>>
>> }
>>
>> becomes
>>
>> rv = function_call(..., Error **errp);
>>
>> if (errp) {
>>   /* handle exceptional case */
>> }
>>
>> use_return_value(rv);
> 
> This is simply not the intended use of Error.
> 
> Also, "trying to emulate exceptions in C might be even more unadvisable
> than trying to avoid them in Java."
> 
>> Instead we mix up the Exception code path and the regular code path, by having rv carry a mix of the Exception and regular return value,
>> and this creates problems and confusion.
> 
> "In C, you return an error value."
> 
>> If we put a hard line between the two, I think more clarity emerges.
> 
> Maybe, but consistency matters.  Clarity doesn't emerge in isolation.  
> Deviating from prevailing usage tends to confuse.
> 
>>> GError.
>>>
>>> More modern languages do "return error value" much better than C can.  C
>>> is what it is.
>>>
>>> We could certainly argue how to do better than we do now in QEMU's C
>>> code.  However, the Error API is used all over the place, which makes
>>> changing it expensive.  "Rethinking the whole Error API" (your words)
>>> would have to generate benefits worth this expense.  Which seems
>>> unlikely.
>>>
>>> [...]
>>>
>>
>> This is all fine, the problem is how we remodel this in C.
>>
>> This is how I see the next steps to clarify my position:
>>
>> short term:
>>
>> - keep the existing API with the existing assumptions, use a separate argument to carry the pointer to the actual return value (where the function return value as provided by the language is used to return if an exception was generated or not, as we assume today).
> 
> We don't actually need separate values for "actual return value" and
> "success vs. failure" here.  We can easily encode them in a single

Yes, we do, it would avoid the confusion that we see as soon as people look at the module_load_one code the first time.

> return value.  This is *common* in C, for better or worse.

We make our own life difficult, by wasting the very precious space of the return value that should be used to provide the meaning of the function,

and using it to provide a completely useless and redundant bool return value, that by your own definition,
is just "errp != NULL".

The very precious and scarce return value of the C function is completely wasted.

> 
> For instance, read() returns -1 on error, 0 on EOF (which is not an
> error), and a positive value on actual read.  On error, @errno is set,
> which is a bit awkward (we wouldn't design that way today, I hope).
> 
> The interface I proposed is similar: return -1 on error, 0 on not found
> (which is not an error), and 1 on successful load.  On error, an Error
> is set via @errp.  With a name that makes it obvious that "not found" is
> not an error, there's nothing that should surprise someone
> passingly familiar with QEMU code.

This is fine too, we can do -1 on error, 0 on not found and 1 (and errp set) on Error,

especially if the long term goal of actually fixing the high level problem in the Error API (separating it from the return value, leaving it free for meaningful return values for the ordinary case)
is not agreed on.

> 
>> - long term (maybe): fix the existing API by detaching completely the return value from the exception.
> 
> As I wrote, this seems unlikely to be worth its considerable expense.

In this case, something like your suggestion would be the second best option in my view.

C


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 14:54                     ` Kevin Wolf
  2022-09-22 15:08                       ` Claudio Fontana
@ 2022-09-22 15:27                       ` Markus Armbruster
  2022-09-22 15:51                         ` Claudio Fontana
  2022-09-22 17:05                         ` Kevin Wolf
  1 sibling, 2 replies; 50+ messages in thread
From: Markus Armbruster @ 2022-09-22 15:27 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Claudio Fontana, Daniel P. Berrangé,
	Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

Kevin Wolf <kwolf@redhat.com> writes:

> Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:

[...]

>> If you have callers that need to distinguish between not found, found
>> but bad, found and good, then return three distinct values.
>> 
>> I proposed to return -1 for found but bad (error), 0 for not found (no
>> error), and 1 for loaded (no error).
>
> My intuition with this one was that "not found" is still an error case,
> but it's an error case that we need to distinguish from other error
> cases.
>
> Is this one of the rare use cases for error classes?

If I remember correctly, "not found" is not actually an error for most
callers.  If we make it one, these callers have to error_free().  Minor
annoyance, especially when you have to add an else just for that.

Even if we decide to make it an error, I would not create an error class
for it.  I like

    rc = module_load_one(..., errp);
    if (rc == -ENOENT) {
        error_free(*errp);
    } else if (rc < 0) {
        return;
    }

better than

    Error *err = NULL;

    module_load_one(..., &err);
    if (err && err->class == ERROR_CLASS_NOT_FOUND) {
        error_free(err);
    } else if (err) {
        error_propagate(errp, err);
        return;
    }

I respect your intuition.  Would it still say "error case" when the
function is called module_load_if_exists()?

Hmm, another thought... a need to distinguish error cases can be a
symptom of trying to do too much in one function.  We could split this
into "look up module" and "actually load it".



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 15:27                       ` Markus Armbruster
@ 2022-09-22 15:51                         ` Claudio Fontana
  2022-09-22 17:05                         ` Kevin Wolf
  1 sibling, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-22 15:51 UTC (permalink / raw)
  To: Markus Armbruster, Kevin Wolf
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

On 9/22/22 17:27, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
>> Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
> 
> [...]
> 
>>> If you have callers that need to distinguish between not found, found
>>> but bad, found and good, then return three distinct values.
>>>
>>> I proposed to return -1 for found but bad (error), 0 for not found (no
>>> error), and 1 for loaded (no error).
>>
>> My intuition with this one was that "not found" is still an error case,
>> but it's an error case that we need to distinguish from other error
>> cases.
>>
>> Is this one of the rare use cases for error classes?
> 
> If I remember correctly, "not found" is not actually an error for most
> callers.  If we make it one, these callers have to error_free().  Minor
> annoyance, especially when you have to add an else just for that.

That's because the "Error" class, as implemented in QEMU, (and possibly in Glib..) seems to be closer to an Exception than an Error,
just like in C++.

And like in C++, the Exception is a more costly code path, that should not carry the regular runtime behavior,
it should really be representing only the "Exceptional" runtime case.

And this matches this specific instance perfectly.

Not finding the module should not raise an exception, because depending on the packaging the "module not present" is not an exceptional runtime behavior at all.

> 
> Even if we decide to make it an error, I would not create an error class
> for it.  I like
> 
>     rc = module_load_one(..., errp);
>     if (rc == -ENOENT) {
>         error_free(*errp);
>     } else if (rc < 0) {
>         return;
>     }
> 
> better than
> 
>     Error *err = NULL;
> 
>     module_load_one(..., &err);
>     if (err && err->class == ERROR_CLASS_NOT_FOUND) {
>         error_free(err);
>     } else if (err) {
>         error_propagate(errp, err);
>         return;
>     }
> 
> I respect your intuition.  Would it still say "error case" when the
> function is called module_load_if_exists()?
> 
> Hmm, another thought... a need to distinguish error cases can be a
> symptom of trying to do too much in one function.  We could split this
> into "look up module" and "actually load it".
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 15:27                       ` Markus Armbruster
  2022-09-22 15:51                         ` Claudio Fontana
@ 2022-09-22 17:05                         ` Kevin Wolf
  2022-09-23  9:42                           ` Claudio Fontana
  2022-09-23  9:44                           ` Claudio Fontana
  1 sibling, 2 replies; 50+ messages in thread
From: Kevin Wolf @ 2022-09-22 17:05 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Claudio Fontana, Daniel P. Berrangé,
	Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

Am 22.09.2022 um 17:27 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
> 
> [...]
> 
> >> If you have callers that need to distinguish between not found, found
> >> but bad, found and good, then return three distinct values.
> >> 
> >> I proposed to return -1 for found but bad (error), 0 for not found (no
> >> error), and 1 for loaded (no error).
> >
> > My intuition with this one was that "not found" is still an error case,
> > but it's an error case that we need to distinguish from other error
> > cases.
> >
> > Is this one of the rare use cases for error classes?
> 
> If I remember correctly, "not found" is not actually an error for most
> callers.  If we make it one, these callers have to error_free().  Minor
> annoyance, especially when you have to add an else just for that.

AIUI most callers don't actually need the three way distinction, but
just success (may or may not be loaded now) and error.

The example for a caller that needs it was dmg. But with the changes
after review, it won't be using the return code for this any more
either.

> Even if we decide to make it an error, I would not create an error class
> for it.  I like
> 
>     rc = module_load_one(..., errp);
>     if (rc == -ENOENT) {
>         error_free(*errp);
>     } else if (rc < 0) {
>         return;
>     }
> 
> better than
> 
>     Error *err = NULL;
> 
>     module_load_one(..., &err);
>     if (err && err->class == ERROR_CLASS_NOT_FOUND) {
>         error_free(err);
>     } else if (err) {
>         error_propagate(errp, err);
>         return;
>     }

That's a good point, we can use the return code to distinguish the cases.

> I respect your intuition.  Would it still say "error case" when the
> function is called module_load_if_exists()?

I guess no, then it becomes a second success case.

> Hmm, another thought... a need to distinguish error cases can be a
> symptom of trying to do too much in one function.  We could split this
> into "look up module" and "actually load it".

Might become slightly inconvenient. On the other hand, we can still have
a simpler wrapper function that works for the majority of cases where a
boolean result for the combined operation is enough.

Kevin



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 15:22                             ` Claudio Fontana
@ 2022-09-23  5:31                               ` Markus Armbruster
  2022-09-23  9:40                                 ` Claudio Fontana
  0 siblings, 1 reply; 50+ messages in thread
From: Markus Armbruster @ 2022-09-23  5:31 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

Claudio Fontana <cfontana@suse.de> writes:

> On 9/22/22 16:36, Markus Armbruster wrote:
>> Claudio Fontana <cfontana@suse.de> writes:
>> 
>>> On 9/22/22 15:20, Markus Armbruster wrote:
>>>> Claudio Fontana <cfontana@suse.de> writes:
>>>>
>>>> [...]
>>>>
>>>>> I think it would be better to completely make the return value separate from the Error,
>>>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>>>
>>>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>>>>
>>>> It's a matter of language pragmatics.  In Java, you throw an exception
>>>> on error.  In C, you return an error value.
>>>>
>>>> Trying to emulate exceptions in C might be even more unadvisable than
>>>> trying to avoid them in Java.  Best to work with the language, not
>>>> against it.
>>>>
>>>> Trouble is the error values we can conveniently return in C can't convey
>>>> enough information.  So we use Error for that.  Just like GLib uses
>>>
>>> Right, we use Error for that and that's fine, but we should use it _only Error_ for that.
>>>
>>> Ie map the Exceptions directly to Error.
>>>
>>> So:
>>>
>>> try {
>>>
>>>   rv = function_call(...);
>>>
>>>   use_return_value(rv);
>>>
>>> } catch (Exception e) {
>>>
>>>   /* handle exceptional case */
>>>
>>> }
>>>
>>> becomes
>>>
>>> rv = function_call(..., Error **errp);
>>>
>>> if (errp) {
>>>   /* handle exceptional case */
>>> }
>>>
>>> use_return_value(rv);
>> 
>> This is simply not the intended use of Error.
>> 
>> Also, "trying to emulate exceptions in C might be even more unadvisable
>> than trying to avoid them in Java."
>> 
>>> Instead we mix up the Exception code path and the regular code path, by having rv carry a mix of the Exception and regular return value,
>>> and this creates problems and confusion.
>> 
>> "In C, you return an error value."
>> 
>>> If we put a hard line between the two, I think more clarity emerges.
>> 
>> Maybe, but consistency matters.  Clarity doesn't emerge in isolation.  
>> Deviating from prevailing usage tends to confuse.
>> 
>>>> GError.
>>>>
>>>> More modern languages do "return error value" much better than C can.  C
>>>> is what it is.
>>>>
>>>> We could certainly argue how to do better than we do now in QEMU's C
>>>> code.  However, the Error API is used all over the place, which makes
>>>> changing it expensive.  "Rethinking the whole Error API" (your words)
>>>> would have to generate benefits worth this expense.  Which seems
>>>> unlikely.
>>>>
>>>> [...]
>>>>
>>>
>>> This is all fine, the problem is how we remodel this in C.
>>>
>>> This is how I see the next steps to clarify my position:
>>>
>>> short term:
>>>
>>> - keep the existing API with the existing assumptions, use a separate argument to carry the pointer to the actual return value (where the function return value as provided by the language is used to return if an exception was generated or not, as we assume today).
>> 
>> We don't actually need separate values for "actual return value" and
>> "success vs. failure" here.  We can easily encode them in a single
>
> Yes, we do, it would avoid the confusion that we see as soon as people look at the module_load_one code the first time.
>
>> return value.  This is *common* in C, for better or worse.
>
> We make our own life difficult, by wasting the very precious space of the return value that should be used to provide the meaning of the function,
>
> and using it to provide a completely useless and redundant bool return value, that by your own definition,
> is just "errp != NULL".

*errp != NULL, except that doesn't work when the caller NULL to errp.

> The very precious and scarce return value of the C function is completely wasted.

I think you're tilting at windmills :)

error.h again:

 * - Whenever practical, also return a value that indicates success /
 *   failure.  This can make the error checking more concise, and can
 *   avoid useless error object creation and destruction.  Note that
 *   we still have many functions returning void.  We recommend
 *   • bool-valued functions return true on success / false on failure,
 *   • pointer-valued functions return non-null / null pointer, and
 *   • integer-valued functions return non-negative / negative.

This does *not* ask you to waste the return value on a bool indicating
success.  It asks you to use error values whenever practical, and
recommends common ones, namely:

     • When a function returns a non-negative integer on success, use
       negative integers to signify failure.

     • When a function returns a non-null pointer on success, use a null
       pointer to signify failure.

     • When a function has nothing to return, make it return true on
       success, and false on failure.

Such use of return values is idiomatic C.

When a function can return any value of its return type on success,
there are no error values left.  Unless we can tweak the return type to
accomodate error values, say widen it from unsigned char to int, we're
outside "when practical" territory.

[...]



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-23  5:31                               ` Markus Armbruster
@ 2022-09-23  9:40                                 ` Claudio Fontana
  0 siblings, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-23  9:40 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Kevin Wolf, qemu-devel,
	dinechin, Gerd Hoffmann, Marc-André Lureau

On 9/23/22 07:31, Markus Armbruster wrote:
> Claudio Fontana <cfontana@suse.de> writes:
> 
>> On 9/22/22 16:36, Markus Armbruster wrote:
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>>> On 9/22/22 15:20, Markus Armbruster wrote:
>>>>> Claudio Fontana <cfontana@suse.de> writes:
>>>>>
>>>>> [...]
>>>>>
>>>>>> I think it would be better to completely make the return value separate from the Error,
>>>>>> and really treat Error as an exception and not mix it up with the regular execution,
>>>>>>
>>>>>> but if it is the general consensus that I am the only one seeing this conflation problem we can model it this way too.
>>>>>
>>>>> It's a matter of language pragmatics.  In Java, you throw an exception
>>>>> on error.  In C, you return an error value.
>>>>>
>>>>> Trying to emulate exceptions in C might be even more unadvisable than
>>>>> trying to avoid them in Java.  Best to work with the language, not
>>>>> against it.
>>>>>
>>>>> Trouble is the error values we can conveniently return in C can't convey
>>>>> enough information.  So we use Error for that.  Just like GLib uses
>>>>
>>>> Right, we use Error for that and that's fine, but we should use it _only Error_ for that.
>>>>
>>>> Ie map the Exceptions directly to Error.
>>>>
>>>> So:
>>>>
>>>> try {
>>>>
>>>>   rv = function_call(...);
>>>>
>>>>   use_return_value(rv);
>>>>
>>>> } catch (Exception e) {
>>>>
>>>>   /* handle exceptional case */
>>>>
>>>> }
>>>>
>>>> becomes
>>>>
>>>> rv = function_call(..., Error **errp);
>>>>
>>>> if (errp) {
>>>>   /* handle exceptional case */
>>>> }
>>>>
>>>> use_return_value(rv);
>>>
>>> This is simply not the intended use of Error.
>>>
>>> Also, "trying to emulate exceptions in C might be even more unadvisable
>>> than trying to avoid them in Java."
>>>
>>>> Instead we mix up the Exception code path and the regular code path, by having rv carry a mix of the Exception and regular return value,
>>>> and this creates problems and confusion.
>>>
>>> "In C, you return an error value."
>>>
>>>> If we put a hard line between the two, I think more clarity emerges.
>>>
>>> Maybe, but consistency matters.  Clarity doesn't emerge in isolation.  
>>> Deviating from prevailing usage tends to confuse.
>>>
>>>>> GError.
>>>>>
>>>>> More modern languages do "return error value" much better than C can.  C
>>>>> is what it is.
>>>>>
>>>>> We could certainly argue how to do better than we do now in QEMU's C
>>>>> code.  However, the Error API is used all over the place, which makes
>>>>> changing it expensive.  "Rethinking the whole Error API" (your words)
>>>>> would have to generate benefits worth this expense.  Which seems
>>>>> unlikely.
>>>>>
>>>>> [...]
>>>>>
>>>>
>>>> This is all fine, the problem is how we remodel this in C.
>>>>
>>>> This is how I see the next steps to clarify my position:
>>>>
>>>> short term:
>>>>
>>>> - keep the existing API with the existing assumptions, use a separate argument to carry the pointer to the actual return value (where the function return value as provided by the language is used to return if an exception was generated or not, as we assume today).
>>>
>>> We don't actually need separate values for "actual return value" and
>>> "success vs. failure" here.  We can easily encode them in a single
>>
>> Yes, we do, it would avoid the confusion that we see as soon as people look at the module_load_one code the first time.
>>
>>> return value.  This is *common* in C, for better or worse.
>>
>> We make our own life difficult, by wasting the very precious space of the return value that should be used to provide the meaning of the function,
>>
>> and using it to provide a completely useless and redundant bool return value, that by your own definition,
>> is just "errp != NULL".
> 
> *errp != NULL, except that doesn't work when the caller NULL to errp.


This is a solvable detail. If the caller passes NULL it is unlikely to make this check after the call.
Checking for the exceptions could even be a macro that solves this.

> 
>> The very precious and scarce return value of the C function is completely wasted.
> 
> I think you're tilting at windmills :)


I have no idea what you mean.


> 
> error.h again:
> 
>  * - Whenever practical, also return a value that indicates success /
>  *   failure.  This can make the error checking more concise, and can
>  *   avoid useless error object creation and destruction.  Note that
>  *   we still have many functions returning void.  We recommend
>  *   • bool-valued functions return true on success / false on failure,
>  *   • pointer-valued functions return non-null / null pointer, and
>  *   • integer-valued functions return non-negative / negative.
> 
> This does *not* ask you to waste the return value on a bool indicating
> success.  It asks you to use error values whenever practical, and
> recommends common ones, namely:


This is good text, I might have misunderstood you before then.

> 
>      • When a function returns a non-negative integer on success, use
>        negative integers to signify failure.
> 
>      • When a function returns a non-null pointer on success, use a null
>        pointer to signify failure.
> 
>      • When a function has nothing to return, make it return true on
>        success, and false on failure.
> 
> Such use of return values is idiomatic C.
> 
> When a function can return any value of its return type on success,
> there are no error values left.  Unless we can tweak the return type to
> accomodate error values, say widen it from unsigned char to int, we're
> outside "when practical" territory.
> 
> [...]
> 

Interesting, this text contradicts what was implied before (quoting you):

>>>> Functions that set an error on some failures only tend to be awkward: in
>>>> addition to checking the return value for failure, you have to check
>>>> @errp for special failures.  This is particularly cumbersome when it
>>>> requires a @local_err and an error_propagate() just for that.  I
>>>> generally prefer to return an error code and always set an error.


And this may even be inferred from error.h around line 60, just before the snipped you quoted above.

Quoting error.h around line 60:

 * - On success, the function should not touch *errp.  On failure, it                                                                       
 *   should set a new error, e.g. with error_setg(errp, ...), or                                                                            
 *   propagate an existing one, e.g. with error_propagate(errp, ...).    

The problem is, there is no definition of what "success" and "failure" mean in error.h.
And in this absence of definition, it is easy to assume this is connected with the return value of the function,
and mix up the return value of what the function is trying to do (in this case load a module),
with whether an error/exception was generated when trying to do so.

The issue I think happened here specifically is that seeing a "bool" in the return value triggers a pattern in most readers,
and even reading the snippet above in error.h would quickly lead one to believe that the bool return value seen is connected with the errp.

In the original patch, the meaning and definition attached to the bool return value (as I documented in the code comments),
was "returns true if module found and loaded, and false otherwise".

When seeing

bool function(..., Error **errp);

a short-circuit seems to happen, making assumptions about the meaning of the boolean return value, as if connected to the errp.

That is why when changing this to an int, suddenly people seem more comfortable with it, and I think this is independent of the extended range of the return values.

In any case, returning -1, 0 or 1 is fine, it solves the problem at hand, and I think by virtue of the fact that we are using an int instead of the bool,
this will force more thinking from the reader and avoid the short-circuit presented here.

Thanks,

Claudio
















^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 17:05                         ` Kevin Wolf
@ 2022-09-23  9:42                           ` Claudio Fontana
  2022-09-23  9:44                           ` Claudio Fontana
  1 sibling, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-23  9:42 UTC (permalink / raw)
  To: Kevin Wolf, Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

On 9/22/22 19:05, Kevin Wolf wrote:
> Am 22.09.2022 um 17:27 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>>
>>> Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
>>
>> [...]
>>
>>>> If you have callers that need to distinguish between not found, found
>>>> but bad, found and good, then return three distinct values.
>>>>
>>>> I proposed to return -1 for found but bad (error), 0 for not found (no
>>>> error), and 1 for loaded (no error).
>>>
>>> My intuition with this one was that "not found" is still an error case,
>>> but it's an error case that we need to distinguish from other error
>>> cases.
>>>
>>> Is this one of the rare use cases for error classes?
>>
>> If I remember correctly, "not found" is not actually an error for most
>> callers.  If we make it one, these callers have to error_free().  Minor
>> annoyance, especially when you have to add an else just for that.
> 
> AIUI most callers don't actually need the three way distinction, but
> just success (may or may not be loaded now) and error.
> 
> The example for a caller that needs it was dmg. But with the changes
> after review, it won't be using the return code for this any more
> either.
> 
>> Even if we decide to make it an error, I would not create an error class
>> for it.  I like
>>
>>     rc = module_load_one(..., errp);
>>     if (rc == -ENOENT) {
>>         error_free(*errp);
>>     } else if (rc < 0) {
>>         return;
>>     }
>>
>> better than
>>
>>     Error *err = NULL;
>>
>>     module_load_one(..., &err);
>>     if (err && err->class == ERROR_CLASS_NOT_FOUND) {
>>         error_free(err);
>>     } else if (err) {
>>         error_propagate(errp, err);
>>         return;
>>     }
> 
> That's a good point, we can use the return code to distinguish the cases.
> 
>> I respect your intuition.  Would it still say "error case" when the
>> function is called module_load_if_exists()?
> 
> I guess no, then it becomes a second success case.
> 
>> Hmm, another thought... a need to distinguish error cases can be a
>> symptom of trying to do too much in one function.  We could split this
>> into "look up module" and "actually load it".
> 
> Might become slightly inconvenient. On the other hand, we can still have

It is inconventient, and does not solve the problem, because of the race condition between
when the first function is called and then the second one is called.

> a simpler wrapper function that works for the majority of cases where a
> boolean result for the combined operation is enough.
> 
> Kevin
> 
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-22 17:05                         ` Kevin Wolf
  2022-09-23  9:42                           ` Claudio Fontana
@ 2022-09-23  9:44                           ` Claudio Fontana
  1 sibling, 0 replies; 50+ messages in thread
From: Claudio Fontana @ 2022-09-23  9:44 UTC (permalink / raw)
  To: Kevin Wolf, Markus Armbruster
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, qemu-devel, dinechin,
	Gerd Hoffmann, Marc-André Lureau

On 9/22/22 19:05, Kevin Wolf wrote:
> Am 22.09.2022 um 17:27 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>>
>>> Am 22.09.2022 um 14:42 hat Markus Armbruster geschrieben:
>>
>> [...]
>>
>>>> If you have callers that need to distinguish between not found, found
>>>> but bad, found and good, then return three distinct values.
>>>>
>>>> I proposed to return -1 for found but bad (error), 0 for not found (no
>>>> error), and 1 for loaded (no error).
>>>
>>> My intuition with this one was that "not found" is still an error case,
>>> but it's an error case that we need to distinguish from other error
>>> cases.
>>>
>>> Is this one of the rare use cases for error classes?
>>
>> If I remember correctly, "not found" is not actually an error for most
>> callers.  If we make it one, these callers have to error_free().  Minor
>> annoyance, especially when you have to add an else just for that.
> 
> AIUI most callers don't actually need the three way distinction, but

Right, this is why I initially modeled this as:

return value true -> found and loaded
return value false -> not found or error: (check *errp != NULL) to distinguish between the two.


> just success (may or may not be loaded now) and error.
> 
> The example for a caller that needs it was dmg. But with the changes
> after review, it won't be using the return code for this any more
> either.
> 
>> Even if we decide to make it an error, I would not create an error class
>> for it.  I like
>>
>>     rc = module_load_one(..., errp);
>>     if (rc == -ENOENT) {
>>         error_free(*errp);
>>     } else if (rc < 0) {
>>         return;
>>     }
>>
>> better than
>>
>>     Error *err = NULL;
>>
>>     module_load_one(..., &err);
>>     if (err && err->class == ERROR_CLASS_NOT_FOUND) {
>>         error_free(err);
>>     } else if (err) {
>>         error_propagate(errp, err);
>>         return;
>>     }
> 
> That's a good point, we can use the return code to distinguish the cases.
> 
>> I respect your intuition.  Would it still say "error case" when the
>> function is called module_load_if_exists()?
> 
> I guess no, then it becomes a second success case.
> 
>> Hmm, another thought... a need to distinguish error cases can be a
>> symptom of trying to do too much in one function.  We could split this
>> into "look up module" and "actually load it".
> 
> Might become slightly inconvenient. On the other hand, we can still have
> a simpler wrapper function that works for the majority of cases where a
> boolean result for the combined operation is enough.
> 
> Kevin
> 



^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one
  2022-09-21 16:03         ` Claudio Fontana
  2022-09-22  6:07           ` Markus Armbruster
@ 2022-09-25 10:35           ` Richard Henderson
  1 sibling, 0 replies; 50+ messages in thread
From: Richard Henderson @ 2022-09-25 10:35 UTC (permalink / raw)
  To: Claudio Fontana, Markus Armbruster, Philippe Mathieu-Daudé
  Cc: Paolo Bonzini, Kevin Wolf, qemu-devel, dinechin, Gerd Hoffmann,
	Marc-André Lureau, Daniel P.Berrangé

On 9/21/22 16:03, Claudio Fontana wrote:
> On 9/21/22 14:16, Markus Armbruster wrote:
>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
>>
>>> On 16/9/22 11:27, Markus Armbruster wrote:
>>>> Claudio Fontana <cfontana@suse.de> writes:
>>>>
>>>>> improve error handling during module load, by changing:
>>>>>
>>>>> bool module_load_one(const char *prefix, const char *lib_name);
>>>>> void module_load_qom_one(const char *type);
>>>>>
>>>>> to:
>>>>>
>>>>> bool module_load_one(const char *prefix, const char *name, Error **errp);
>>>>> bool module_load_qom_one(const char *type, Error **errp);
>>>>>
>>>>> module_load_qom_one has been introduced in:
>>>>>
>>>>> commit 28457744c345 ("module: qom module support"), which built on top of
>>>>> module_load_one, but discarded the bool return value. Restore it.
>>>>>
>>>>> Adapt all callers to emit errors, or ignore them, or fail hard,
>>>>> as appropriate in each context.
>>>>
>>>> How exactly does behavior change?  The commit message is mum on the
>>>> behavior before the patch, and vague on the behavior afterwards.
>>>>
>>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>>> ---
>>>>>    audio/audio.c         |   9 ++-
>>>>>    block.c               |  15 ++++-
>>>>>    block/dmg.c           |  18 +++++-
>>>>>    hw/core/qdev.c        |  10 ++-
>>>>>    include/qemu/module.h |  38 ++++++++++--
>>>>>    qom/object.c          |  18 +++++-
>>>>>    softmmu/qtest.c       |   6 +-
>>>>>    ui/console.c          |  18 +++++-
>>>>>    util/module.c         | 140 ++++++++++++++++++++++++------------------
>>>>>    9 files changed, 194 insertions(+), 78 deletions(-)
>>>
>>>>> diff --git a/include/qemu/module.h b/include/qemu/module.h
>>>>> index 8c012bbe03..78d4c4de96 100644
>>>>> --- a/include/qemu/module.h
>>>>> +++ b/include/qemu/module.h
>>>>> @@ -61,16 +61,44 @@ typedef enum {
>>>
>>>>>    
>>>>>    void module_call_init(module_init_type type);
>>>>> -bool module_load_one(const char *prefix, const char *lib_name);
>>>>> -void module_load_qom_one(const char *type);
>>>>> +
>>>>> +/*
>>>>> + * module_load_one: attempt to load a module from a set of directories
>>>>> + *
>>>>> + * directories searched are:
>>>>> + * - getenv("QEMU_MODULE_DIR")
>>>>> + * - get_relocated_path(CONFIG_QEMU_MODDIR);
>>>>> + * - /var/run/qemu/${version_dir}
>>>>> + *
>>>>> + * prefix:         a subsystem prefix, or the empty string ("audio-", ..., "")
>>>>> + * name:           name of the module
>>>>> + * errp:           error to set in case the module is found, but load failed.
>>>>> + *
>>>>> + * Return value:   true on success (found and loaded);
>>>>> + *                 if module if found, but load failed, errp will be set.
>>>>> + *                 if module is not found, errp will not be set.
>>>>
>>>> I understand you need to distingush two failure modes "found, but load
>>>> failed" and "not found".
>>>>
>>>> Functions that set an error on some failures only tend to be awkward: in
>>>> addition to checking the return value for failure, you have to check
>>>> @errp for special failures.  This is particularly cumbersome when it
>>>> requires a @local_err and an error_propagate() just for that.  I
>>>> generally prefer to return an error code and always set an error.
>>>
>>> I notice the same issue, therefore would suggest this alternative
>>> prototype:
>>>
>>>     bool module_load_one(const char *prefix, const char *name,
>>>               bool ignore_if_missing, Error **errp);
>>> which always sets *errp when returning false:
>>>
>>>    * Return value:   if ignore_if_missing is true:
>>>    *                   true on success (found or missing), false on
>>>    *                   load failure.
>>>    *                 if ignore_if_missing is false:
>>>    *                   true on success (found and loaded); false if
>>>    *                   not found or load failed.
>>>    *                 errp will be set if the returned value is false.
>>>    */
>>
>> I think this interface is less surprising.

I agree.

>>
>> If having to pass a flag turns out to to be a legibility issue, we can
>> have wrapper functions.
>>
> 
> To me it seems even more confusing tbh. Do we have more ideas? Richard?

My first idea was a tri-state return function (e.g. enum return).  But that would have 
required an extra dance in those places where we *can* accept missing module to free the 
unused Error.

I think the above extra argument is a good replacement for that dance.  The only other 
alternative is to use different functions for the different uses.



r~


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-09-26 22:08 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-08 18:30 [PATCH v4 0/3] improve error handling for module load Claudio Fontana
2022-09-08 18:30 ` [PATCH v4 1/3] module: removed unused function argument "mayfail" Claudio Fontana
2022-09-08 18:30 ` [PATCH v4 2/3] module: add Error arguments to module_load_one and module_load_qom_one Claudio Fontana
2022-09-15  8:43   ` Claudio Fontana
2022-09-16  8:13   ` Richard Henderson
2022-09-16  8:16     ` Claudio Fontana
2022-09-16  9:27   ` Markus Armbruster
2022-09-16 10:48     ` Claudio Fontana
2022-09-16 14:26       ` Markus Armbruster
2022-09-16 15:06         ` Claudio Fontana
2022-09-19  8:17           ` Markus Armbruster
2022-09-19  8:45             ` Claudio Fontana
2022-09-21 12:47               ` Markus Armbruster
2022-09-19 10:18     ` Philippe Mathieu-Daudé via
2022-09-21 12:16       ` Markus Armbruster
2022-09-21 16:03         ` Claudio Fontana
2022-09-22  6:07           ` Markus Armbruster
2022-09-22  8:28             ` Daniel P. Berrangé
2022-09-22  9:20               ` Claudio Fontana
2022-09-22  9:21                 ` Claudio Fontana
2022-09-22  9:27                   ` Claudio Fontana
2022-09-22  9:31                 ` Daniel P. Berrangé
2022-09-22  9:34                   ` Claudio Fontana
2022-09-22 10:37                     ` Daniel P. Berrangé
2022-09-22 12:30                       ` Claudio Fontana
2022-09-22 12:33                         ` Daniel P. Berrangé
2022-09-22 12:35                           ` Claudio Fontana
2022-09-22  9:38               ` Markus Armbruster
2022-09-22  9:43                 ` Claudio Fontana
2022-09-22 12:42                   ` Markus Armbruster
2022-09-22 12:45                     ` Claudio Fontana
2022-09-22 13:20                       ` Markus Armbruster
2022-09-22 13:33                         ` Claudio Fontana
2022-09-22 14:36                           ` Markus Armbruster
2022-09-22 15:22                             ` Claudio Fontana
2022-09-23  5:31                               ` Markus Armbruster
2022-09-23  9:40                                 ` Claudio Fontana
2022-09-22 13:34                         ` Philippe Mathieu-Daudé via
2022-09-22 13:42                           ` Claudio Fontana
2022-09-22 13:44                           ` Daniel P. Berrangé
2022-09-22 14:01                             ` Claudio Fontana
2022-09-22 14:54                     ` Kevin Wolf
2022-09-22 15:08                       ` Claudio Fontana
2022-09-22 15:27                       ` Markus Armbruster
2022-09-22 15:51                         ` Claudio Fontana
2022-09-22 17:05                         ` Kevin Wolf
2022-09-23  9:42                           ` Claudio Fontana
2022-09-23  9:44                           ` Claudio Fontana
2022-09-25 10:35           ` Richard Henderson
2022-09-08 18:30 ` [PATCH v4 3/3] accel: abort if we fail to load the accelerator plugin Claudio Fontana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.