* [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle
@ 2018-12-25 14:04 Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 01/16] Fix segmentation fault when qemu_signal_init fails Fei Li
                   ` (16 more replies)
  0 siblings, 17 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214

Hi,

This idea comes from BiteSizedTasks. This patch series implements
error checking for qemu_thread_create(): make qemu_thread_create()
return a flag to indicate whether it succeeded rather than aborting on
error, and make all callers check it.
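
To make the intended calling convention concrete, here is a minimal
sketch in plain C. It is not QEMU code: SketchError,
sketch_error_set(), sketch_thread_create() and sketch_start_worker()
are hypothetical stand-ins, and the bool parameter only simulates
whether the OS call would succeed (EAGAIN is what pthread_create()
reports when resources are insufficient).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object: just a message buffer. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Hypothetical stand-in for error_setg_errno(): message plus errno text. */
static void sketch_error_set(SketchError *err, int os_errno, const char *what)
{
    assert(what != NULL);
    if (err) {
        snprintf(err->msg, sizeof(err->msg), "%s: %s",
                 what, strerror(os_errno));
    }
}

/*
 * Hypothetical thread-creation wrapper.  Instead of aborting, it returns
 * false and fills *err when the (simulated) OS call fails.
 */
static bool sketch_thread_create(bool os_has_resources, SketchError *err)
{
    if (!os_has_resources) {
        sketch_error_set(err, EAGAIN, "failed to create thread");
        return false;
    }
    return true;
}

/* A caller that checks the flag and maps it to its own return code. */
static int sketch_start_worker(bool os_has_resources, SketchError *err)
{
    if (!sketch_thread_create(os_has_resources, err)) {
        return -1;      /* the details are already stored in *err */
    }
    return 0;
}
```

Calling sketch_start_worker(false, &err) returns -1 and leaves a
message in err.msg, so the caller decides what to do instead of the
process dying inside the wrapper.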

The first and the last patches fix segmentation faults that occurred
during debugging.  Patch 6/16 modifies qemu_thread_create() to take an
Error ** argument and return a bool to all direct callers to indicate
whether it succeeded; for now every caller passes &error_abort.  The
next 9 patches improve on &error_abort for callers that can handle the
error more gracefully.  Patches 2/16 to 5/16 fix some migration issues.

BTW, I am leaving my current company now, and will use
"Fei Li <shirley17fei@gmail.com>" to continue with this patch series.

Please help to review, thanks a lot! :)

v9:
- To ease the review and involve the appropriate maintainers, split
  the previous 6/7 patch into 10 patches: the 6/16 patch passes
  &error_abort to qemu_thread_create() everywhere, and the next
  9 patches improve on &error_abort for callers that need it.
- Add a new patch 5/16 to unify error handling for
  process_incoming_migration_co().
- Merge the previous 2/7 into the current 7/16 to handle
  qemu_X_start_vcpu and qemu_init_vcpu in each arch together.
- Add a comment for multifd_recv_new_channel() in the current patch 2/16.

v8:
- Remove previous two patches trying to fix the multifd issue on the
  source side, as we are still waiting for maintainer's opinions.
- Use atomic_read to get multifd_recv_state->count in patch 3/7.
- Get three more "Reviewed-by:".

v7:
- Split the previous multifd-migration into two patches: the src and
  the dst. For the dst, only dump the error instead of quitting.
- Safely do the cleanup for postcopy_ram_enable_notify().
- Split the previous migration-error-handling patch into two patches.

v6:
- Add a new migration-multifd related patch. BTW, delete the previous
  vnc related patch as it has been upstreamed.
- Use error_setg_errno() to set the errno when qemu_thread_create()
  fails for both Linux and Windows implementation.
- Optimize the first patch; less code is needed.

v5:
- Remove `errno = err` in qemu_thread_create() for Linux, and change
  `return errno` to `return -1` in qemu_signal_init() to indicate
  the error in case qemu_thread_create() fails.
- Delete the v4-added qemu_cond/mutex_destroy() in iothread_complete()
  as the destroy() will be done by its callers' object_unref().

v4:
- Separate the migration compression patch from this series
- Add one more error handling patch related with migration
- Add more cleaning up code for touched functions

v3:
- Add two migration related patches to fix the segmentation fault
- Extract the segmentation fault fix from v2's last patch into a
  separate patch

v2:
- Pass errp directly instead of using a local_err & error_propagate
- Return a bool: false/true to indicate if one function succeeds
- Merge v1's last two patches into one to avoid the compile error
- Fix one omitted error in patch1 and update some error messages


Fei Li (16):
  Fix segmentation fault when qemu_signal_init fails
  migration: fix the multifd code when receiving less channels
  migration: remove unused &local_err parameter in multifd_save_cleanup
  migration: add more error handling for postcopy_ram_enable_notify
  migration: unify error handling for process_incoming_migration_co
  qemu_thread: Make qemu_thread_create() handle errors properly
  qemu_thread: supplement error handling for qemu_X_start_vcpu
  qemu_thread: supplement error handling for qmp_dump_guest_memory
  qemu_thread: supplement error handling for pci_edu_realize
  qemu_thread: supplement error handling for h_resize_hpt_prepare
  qemu_thread: supplement error handling for emulated_realize
  qemu_thread: supplement error handling for
    iothread_complete/qemu_signalfd_compat
  qemu_thread: supplement error handling for migration
  qemu_thread: supplement error handling for vnc_start_worker_thread
  qemu_thread: supplement error handling for touch_all_pages
  qemu_thread_join: fix segmentation fault

 accel/tcg/user-exec-stub.c      |  3 +-
 cpus.c                          | 79 ++++++++++++++++++++++++++---------------
 dump.c                          |  6 ++--
 hw/misc/edu.c                   |  7 ++--
 hw/ppc/spapr_hcall.c            | 10 ++++--
 hw/rdma/rdma_backend.c          |  3 +-
 hw/usb/ccid-card-emulated.c     | 14 +++++---
 include/qemu/thread.h           |  4 +--
 include/qom/cpu.h               |  2 +-
 io/task.c                       |  3 +-
 iothread.c                      | 16 ++++++---
 migration/channel.c             | 11 +++---
 migration/migration.c           | 68 ++++++++++++++++++++++-------------
 migration/migration.h           |  2 +-
 migration/postcopy-ram.c        | 15 ++++++--
 migration/ram.c                 | 68 ++++++++++++++++++++++++-----------
 migration/ram.h                 |  4 +--
 migration/savevm.c              | 12 +++++--
 target/alpha/cpu.c              |  4 ++-
 target/arm/cpu.c                |  4 ++-
 target/cris/cpu.c               |  4 ++-
 target/hppa/cpu.c               |  4 ++-
 target/i386/cpu.c               |  4 ++-
 target/lm32/cpu.c               |  4 ++-
 target/m68k/cpu.c               |  4 ++-
 target/microblaze/cpu.c         |  4 ++-
 target/mips/cpu.c               |  4 ++-
 target/moxie/cpu.c              |  4 ++-
 target/nios2/cpu.c              |  4 ++-
 target/openrisc/cpu.c           |  4 ++-
 target/ppc/translate_init.inc.c |  4 ++-
 target/riscv/cpu.c              |  4 ++-
 target/s390x/cpu.c              |  4 ++-
 target/sh4/cpu.c                |  4 ++-
 target/sparc/cpu.c              |  4 ++-
 target/tilegx/cpu.c             |  4 ++-
 target/tricore/cpu.c            |  4 ++-
 target/unicore32/cpu.c          |  4 ++-
 target/xtensa/cpu.c             |  4 ++-
 tests/atomic_add-bench.c        |  3 +-
 tests/iothread.c                |  2 +-
 tests/qht-bench.c               |  3 +-
 tests/rcutorture.c              |  3 +-
 tests/test-aio.c                |  2 +-
 tests/test-rcu-list.c           |  3 +-
 ui/vnc-jobs.c                   | 17 ++++++---
 ui/vnc-jobs.h                   |  2 +-
 ui/vnc.c                        |  4 ++-
 util/compatfd.c                 | 11 ++++--
 util/main-loop.c                |  8 ++---
 util/oslib-posix.c              | 24 ++++++++-----
 util/qemu-thread-posix.c        | 30 ++++++++++++----
 util/qemu-thread-win32.c        | 18 +++++++---
 util/rcu.c                      |  3 +-
 util/thread-pool.c              |  4 ++-
 55 files changed, 378 insertions(+), 170 deletions(-)

-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 01/16] Fix segmentation fault when qemu_signal_init fails
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 02/16] migration: fix the multifd code when receiving less channels Fei Li
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Paolo Bonzini

When qemu_signal_init() fails in qemu_init_main_loop(), we return
without setting an error.  Its callers then crash when they try to
report the error with error_report_err().

To avoid this segmentation fault, add an Error ** parameter to
qemu_signal_init() so that the error propagates up the call chain to
the final caller.
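
The rule behind this fix (a function that reports failure must also
fill in the error its callers will dereference) can be sketched in
plain C as follows. This is only an illustration: SketchError and all
sketch_* functions are hypothetical stand-ins, not the QEMU Error API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error; it only carries a message. */
typedef struct SketchError {
    char msg[64];
} SketchError;

/* Stand-in for error_setg(): allocate and fill *errp if the caller asked. */
static void sketch_error_set(SketchError **errp, const char *msg)
{
    if (errp == NULL) {
        return;                     /* caller is not interested in details */
    }
    assert(*errp == NULL);          /* never overwrite an earlier error */
    *errp = calloc(1, sizeof(**errp));
    if (*errp == NULL) {
        abort();                    /* out of memory: nothing sensible to do */
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Innermost step: on failure it fills *errp before returning. */
static int sketch_signal_init(int fd, SketchError **errp)
{
    if (fd == -1) {
        sketch_error_set(errp, "failed to create signalfd");
        return -1;
    }
    return 0;
}

/* Middle of the call chain: hands its own errp straight down. */
static int sketch_init_main_loop(int fd, SketchError **errp)
{
    int ret = sketch_signal_init(fd, errp);
    if (ret) {
        return ret;
    }
    return 0;
}

/* Final caller's reporting step: dereferences err, so err must be set. */
static const char *sketch_error_message(const SketchError *err)
{
    return err->msg;
}
```

If sketch_signal_init() returned -1 without calling
sketch_error_set(), the final caller would pass a NULL pointer to
sketch_error_message(), which is the analogue of the crash described
above.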

Fixes: 2f78e491d7b46542158ce0b8132ee4e05bc0ade4
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
---
 util/main-loop.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/util/main-loop.c b/util/main-loop.c
index affe0403c5..443cb4cfe8 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -71,7 +71,7 @@ static void sigfd_handler(void *opaque)
     }
 }
 
-static int qemu_signal_init(void)
+static int qemu_signal_init(Error **errp)
 {
     int sigfd;
     sigset_t set;
@@ -96,7 +96,7 @@ static int qemu_signal_init(void)
     sigdelset(&set, SIG_IPI);
     sigfd = qemu_signalfd(&set);
     if (sigfd == -1) {
-        fprintf(stderr, "failed to create signalfd\n");
+        error_setg_errno(errp, errno, "failed to create signalfd");
         return -errno;
     }
 
@@ -109,7 +109,7 @@ static int qemu_signal_init(void)
 
 #else /* _WIN32 */
 
-static int qemu_signal_init(void)
+static int qemu_signal_init(Error **errp)
 {
     return 0;
 }
@@ -148,7 +148,7 @@ int qemu_init_main_loop(Error **errp)
 
     init_clocks(qemu_timer_notify_cb);
 
-    ret = qemu_signal_init();
+    ret = qemu_signal_init(errp);
     if (ret) {
         return ret;
     }
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 02/16] migration: fix the multifd code when receiving less channels
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 01/16] Fix segmentation fault when qemu_signal_init fails Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei
  Cc: lifei1214, Dr . David Alan Gilbert, Peter Xu, Markus Armbruster

In the current code, when multifd is used during migration and an
error occurs before the destination has received all new channels, the
source keeps running, but the destination does not exit; it keeps
waiting until the source is killed deliberately.

Fix this by reporting the specific error and letting users decide
whether to quit on the destination side when receiving a packet via
some channel fails.  Also update the comment for
multifd_recv_new_channel().
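
The resulting contract has three outcomes: all channels received, one
more channel received, or a failure. The sketch below shows how a
caller can tell them apart. It is a simplified model, not the patch
itself: SketchError uses a plain "set" flag instead of an Error **,
and SketchRecvState, SketchOutcome and the sketch_* functions are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified error slot: a flag plus a message (not QEMU's Error). */
typedef struct SketchError {
    bool set;
    char msg[96];
} SketchError;

typedef struct SketchRecvState {
    int expected;   /* channels the destination expects in total */
    int count;      /* channels received so far */
} SketchRecvState;

/*
 * - returns true,  err untouched: all channels have been received;
 * - returns false, err untouched: this channel is fine, more to come;
 * - returns false, err set:       receiving this channel failed.
 * A negative id simulates a failed initial packet.
 */
static bool sketch_recv_new_channel(SketchRecvState *st, int id,
                                    SketchError *err)
{
    assert(!err->set);
    if (id < 0) {
        err->set = true;
        snprintf(err->msg, sizeof(err->msg),
                 "failed to receive packet via multifd channel %d",
                 st->count);
        return false;
    }
    st->count++;
    return st->count == st->expected;
}

typedef enum SketchOutcome {
    SKETCH_WAIT_FOR_MORE,
    SKETCH_START_MIGRATION,
    SKETCH_FAILED
} SketchOutcome;

/* Caller: the error slot, not the bool, distinguishes failure. */
static SketchOutcome sketch_process_incoming(SketchRecvState *st, int id,
                                             SketchError *err)
{
    bool start = sketch_recv_new_channel(st, id, err);

    if (err->set) {
        return SKETCH_FAILED;
    }
    return start ? SKETCH_START_MIGRATION : SKETCH_WAIT_FOR_MORE;
}
```

Because false is returned both for "not ready yet" and for "failed",
the caller has to inspect the error slot before acting on the return
value.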

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 migration/channel.c   | 11 ++++++-----
 migration/migration.c |  9 +++++++--
 migration/migration.h |  2 +-
 migration/ram.c       | 17 ++++++++++++++---
 migration/ram.h       |  2 +-
 5 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 33e0e9b82f..20e4c8e2dc 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -30,6 +30,7 @@
 void migration_channel_process_incoming(QIOChannel *ioc)
 {
     MigrationState *s = migrate_get_current();
+    Error *local_err = NULL;
 
     trace_migration_set_incoming_channel(
         ioc, object_get_typename(OBJECT(ioc)));
@@ -38,13 +39,13 @@ void migration_channel_process_incoming(QIOChannel *ioc)
         *s->parameters.tls_creds &&
         !object_dynamic_cast(OBJECT(ioc),
                              TYPE_QIO_CHANNEL_TLS)) {
-        Error *local_err = NULL;
         migration_tls_channel_process_incoming(s, ioc, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        }
     } else {
-        migration_ioc_process_incoming(ioc);
+        migration_ioc_process_incoming(ioc, &local_err);
+    }
+
+    if (local_err) {
+        error_report_err(local_err);
     }
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index ffc4d9e556..24cb4b9d0d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -541,7 +541,7 @@ void migration_fd_process_incoming(QEMUFile *f)
     migration_incoming_process();
 }
 
-void migration_ioc_process_incoming(QIOChannel *ioc)
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     bool start_migration;
@@ -563,9 +563,14 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
          */
         start_migration = !migrate_use_multifd();
     } else {
+        Error *local_err = NULL;
         /* Multiple connections */
         assert(migrate_use_multifd());
-        start_migration = multifd_recv_new_channel(ioc);
+        start_migration = multifd_recv_new_channel(ioc, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
     }
 
     if (start_migration) {
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..02b7304610 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -229,7 +229,7 @@ struct MigrationState
 void migrate_set_state(int *state, int old_state, int new_state);
 
 void migration_fd_process_incoming(QEMUFile *f);
-void migration_ioc_process_incoming(QIOChannel *ioc);
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
 void migration_incoming_process(void);
 
 bool  migration_has_all_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..1671dedc97 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1322,8 +1322,13 @@ bool multifd_recv_all_channels_created(void)
     return thread_count == atomic_read(&multifd_recv_state->count);
 }
 
-/* Return true if multifd is ready for the migration, otherwise false */
-bool multifd_recv_new_channel(QIOChannel *ioc)
+/*
+ * Try to receive all multifd channels to get ready for the migration.
+ * - Return true and do not set @errp when correctly receving all channels;
+ * - Return false and do not set @errp when correctly receiving the current one;
+ * - Return false and set @errp when failing to receive the current channel.
+ */
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
 {
     MultiFDRecvParams *p;
     Error *local_err = NULL;
@@ -1332,6 +1337,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
     id = multifd_recv_initial_packet(ioc, &local_err);
     if (id < 0) {
         multifd_recv_terminate_threads(local_err);
+        error_propagate_prepend(errp, local_err,
+                                "failed to receive packet"
+                                " via multifd channel %d: ",
+                                atomic_read(&multifd_recv_state->count));
         return false;
     }
 
@@ -1340,6 +1349,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
         error_setg(&local_err, "multifd: received id '%d' already setup'",
                    id);
         multifd_recv_terminate_threads(local_err);
+        error_propagate(errp, local_err);
         return false;
     }
     p->c = ioc;
@@ -1351,7 +1361,8 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
     qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
                        QEMU_THREAD_JOINABLE);
     atomic_inc(&multifd_recv_state->count);
-    return multifd_recv_state->count == migrate_multifd_channels();
+    return atomic_read(&multifd_recv_state->count) ==
+           migrate_multifd_channels();
 }
 
 /**
diff --git a/migration/ram.h b/migration/ram.h
index 83ff1bc11a..046d3074be 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -47,7 +47,7 @@ int multifd_save_cleanup(Error **errp);
 int multifd_load_setup(void);
 int multifd_load_cleanup(Error **errp);
 bool multifd_recv_all_channels_created(void);
-bool multifd_recv_new_channel(QIOChannel *ioc);
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 01/16] Fix segmentation fault when qemu_signal_init fails Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 02/16] migration: fix the multifd code when receiving less channels Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 16:50   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 04/16] migration: add more error handling for postcopy_ram_enable_notify Fei Li
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Dr . David Alan Gilbert

Always call migrate_set_error() to set the error state, regardless of
whether multifd_save_cleanup() succeeds.  As the passed &local_err
is never used in multifd_save_cleanup(), remove it and change the
function signature to: void multifd_save_cleanup(void).
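
A small model of why the old pattern could lose the error: it
assumes, as the diff suggests, that the old cleanup only ever
returned 0. SketchMigration and the sketch_* functions are
hypothetical stand-ins, not QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical migration state: last recorded error and cleanup count. */
typedef struct SketchMigration {
    const char *error;
    int cleanups;
} SketchMigration;

/* Old shape: returns an int, but (by assumption) it is always 0. */
static int sketch_cleanup_old(SketchMigration *s)
{
    assert(s != NULL);
    s->cleanups++;
    return 0;
}

/* New shape: nothing to report, so it returns void. */
static void sketch_cleanup_new(SketchMigration *s)
{
    assert(s != NULL);
    s->cleanups++;
}

/* Old caller: records the error only if cleanup "fails", i.e. never. */
static void sketch_on_channel_error_old(SketchMigration *s, const char *err)
{
    if (sketch_cleanup_old(s) != 0) {
        s->error = err;
    }
}

/* New caller: records the error unconditionally, then cleans up. */
static void sketch_on_channel_error_new(SketchMigration *s, const char *err)
{
    s->error = err;
    sketch_cleanup_new(s);
}
```

With the old caller the error from the failed channel is dropped;
with the new one it is always recorded before the cleanup runs.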

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c |  5 +----
 migration/ram.c       | 11 ++++-------
 migration/ram.h       |  2 +-
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 24cb4b9d0d..5d322eb9d6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1386,7 +1386,6 @@ static void migrate_fd_cleanup(void *opaque)
     qemu_savevm_state_cleanup();
 
     if (s->to_dst_file) {
-        Error *local_err = NULL;
         QEMUFile *tmp;
 
         trace_migrate_fd_cleanup();
@@ -1397,9 +1396,7 @@ static void migrate_fd_cleanup(void *opaque)
         }
         qemu_mutex_lock_iothread();
 
-        if (multifd_save_cleanup(&local_err) != 0) {
-            error_report_err(local_err);
-        }
+        multifd_save_cleanup();
         qemu_mutex_lock(&s->qemu_file_lock);
         tmp = s->to_dst_file;
         s->to_dst_file = NULL;
diff --git a/migration/ram.c b/migration/ram.c
index 1671dedc97..435a8d2946 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -917,13 +917,12 @@ static void multifd_send_terminate_threads(Error *err)
     }
 }
 
-int multifd_save_cleanup(Error **errp)
+void multifd_save_cleanup(void)
 {
     int i;
-    int ret = 0;
 
     if (!migrate_use_multifd()) {
-        return 0;
+        return;
     }
     multifd_send_terminate_threads(NULL);
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -953,7 +952,6 @@ int multifd_save_cleanup(Error **errp)
     multifd_send_state->pages = NULL;
     g_free(multifd_send_state);
     multifd_send_state = NULL;
-    return ret;
 }
 
 static void multifd_send_sync_main(void)
@@ -1071,9 +1069,8 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
     Error *local_err = NULL;
 
     if (qio_task_propagate_error(task, &local_err)) {
-        if (multifd_save_cleanup(&local_err) != 0) {
-            migrate_set_error(migrate_get_current(), local_err);
-        }
+        migrate_set_error(migrate_get_current(), local_err);
+        multifd_save_cleanup();
     } else {
         p->c = QIO_CHANNEL(sioc);
         qio_channel_set_delay(p->c, false);
diff --git a/migration/ram.h b/migration/ram.h
index 046d3074be..936177b3e9 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -43,7 +43,7 @@ uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
 
 int multifd_save_setup(void);
-int multifd_save_cleanup(Error **errp);
+void multifd_save_cleanup(void);
 int multifd_load_setup(void);
 int multifd_load_cleanup(Error **errp);
 bool multifd_recv_all_channels_created(void);
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 04/16] migration: add more error handling for postcopy_ram_enable_notify
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (2 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co Fei Li
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Dr . David Alan Gilbert

Call postcopy_ram_incoming_cleanup() to do the cleanup when
postcopy_ram_enable_notify() fails. Also report an error
message when qemu_ram_foreach_migratable_block() fails.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/postcopy-ram.c | 1 +
 migration/savevm.c       | 1 +
 2 files changed, 2 insertions(+)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index e5c02a32c5..fa09dba534 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1117,6 +1117,7 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
 
     /* Mark so that we get notified of accesses to unwritten areas */
     if (qemu_ram_foreach_migratable_block(ram_block_enable_notify, mis)) {
+        error_report("ram_block_enable_notify failed");
         return -1;
     }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index 9e45fb4f3f..d784e8aa40 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1729,6 +1729,7 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
      */
     if (migrate_postcopy_ram()) {
         if (postcopy_ram_enable_notify(mis)) {
+            postcopy_ram_incoming_cleanup(mis);
             return -1;
         }
     }
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (3 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 04/16] migration: add more error handling for postcopy_ram_enable_notify Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-03 11:25   ` Dr. David Alan Gilbert
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly Fei Li
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei
  Cc: lifei1214, Markus Armbruster, Dr . David Alan Gilbert, Peter Xu

In the current code, if process_incoming_migration_co() fails we do
the same error handling in several places: set the error state, close
the source file, do the cleanup for multifd, and then
exit(EXIT_FAILURE). To make the code clearer, add a "goto fail" to
unify the error handling.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 migration/migration.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5d322eb9d6..ded151b1bf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -438,15 +438,13 @@ static void process_incoming_migration_co(void *opaque)
         /* Make sure all file formats flush their mutable metadata */
         bdrv_invalidate_cache_all(&local_err);
         if (local_err) {
-            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
-                    MIGRATION_STATUS_FAILED);
             error_report_err(local_err);
-            exit(EXIT_FAILURE);
+            goto fail;
         }
 
         if (colo_init_ram_cache() < 0) {
             error_report("Init ram cache failed");
-            exit(EXIT_FAILURE);
+            goto fail;
         }
 
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
@@ -461,20 +459,22 @@ static void process_incoming_migration_co(void *opaque)
     }
 
     if (ret < 0) {
-        Error *local_err = NULL;
-
-        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
-        qemu_fclose(mis->from_src_file);
-        if (multifd_load_cleanup(&local_err) != 0) {
-            error_report_err(local_err);
-        }
-        exit(EXIT_FAILURE);
+        goto fail;
     }
     mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
     qemu_bh_schedule(mis->bh);
     mis->migration_incoming_co = NULL;
+    return;
+fail:
+    local_err = NULL;
+    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
+                      MIGRATION_STATUS_FAILED);
+    qemu_fclose(mis->from_src_file);
+    if (multifd_load_cleanup(&local_err) != 0) {
+        error_report_err(local_err);
+    }
+    exit(EXIT_FAILURE);
 }
 
 static void migration_incoming_setup(QEMUFile *f)
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (4 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:18   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 07/16] qemu_thread: supplement error handling for qemu_X_start_vcpu Fei Li
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, Paolo Bonzini

qemu_thread_create() abort()s on error. Not nice. Give it a return
value and an Error ** argument, so it can return success/failure.

Considering qemu_thread_create() is quite widely used in qemu, split
this into two steps: this patch passes &error_abort to
qemu_thread_create() everywhere, and the next 9 patches improve
on &error_abort for callers that need it.
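
The two-step approach relies on a sentinel error pointer: passing its
address keeps the old abort-on-failure behaviour, while passing a
real error pointer lets the caller handle the failure. Below is a
minimal model of that idea. sketch_error_abort, SketchError and the
sketch_* functions are hypothetical stand-ins that only imitate how
&error_abort is used in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error; it only carries a message. */
typedef struct SketchError {
    char msg[64];
} SketchError;

/* Sentinel: passing &sketch_error_abort means "abort on any error". */
static SketchError *sketch_error_abort;

static void sketch_error_set(SketchError **errp, const char *msg)
{
    if (errp == &sketch_error_abort) {
        fprintf(stderr, "Unexpected error: %s\n", msg);
        abort();                    /* step 1 behaviour: same as before */
    }
    if (errp == NULL) {
        return;                     /* caller ignores the details */
    }
    assert(*errp == NULL);
    *errp = calloc(1, sizeof(**errp));
    if (*errp == NULL) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* The bool parameter simulates whether the OS call succeeds. */
static bool sketch_thread_create(bool os_ok, SketchError **errp)
{
    if (!os_ok) {
        sketch_error_set(errp, "failed to create thread");
        return false;
    }
    return true;
}
```

A caller converted in step 1 writes
sketch_thread_create(ok, &sketch_error_abort) and behaves exactly as
before; a caller converted in step 2 passes its own error pointer,
checks the returned bool and cleans up.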

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 cpus.c                      | 23 +++++++++++++++--------
 dump.c                      |  3 ++-
 hw/misc/edu.c               |  4 +++-
 hw/ppc/spapr_hcall.c        |  4 +++-
 hw/rdma/rdma_backend.c      |  3 ++-
 hw/usb/ccid-card-emulated.c |  5 +++--
 include/qemu/thread.h       |  4 ++--
 io/task.c                   |  3 ++-
 iothread.c                  |  3 ++-
 migration/migration.c       | 11 ++++++++---
 migration/postcopy-ram.c    |  4 +++-
 migration/ram.c             | 12 ++++++++----
 migration/savevm.c          |  3 ++-
 tests/atomic_add-bench.c    |  3 ++-
 tests/iothread.c            |  2 +-
 tests/qht-bench.c           |  3 ++-
 tests/rcutorture.c          |  3 ++-
 tests/test-aio.c            |  2 +-
 tests/test-rcu-list.c       |  3 ++-
 ui/vnc-jobs.c               |  6 ++++--
 util/compatfd.c             |  6 ++++--
 util/oslib-posix.c          |  3 ++-
 util/qemu-thread-posix.c    | 27 ++++++++++++++++++++-------
 util/qemu-thread-win32.c    | 16 ++++++++++++----
 util/rcu.c                  |  3 ++-
 util/thread-pool.c          |  4 +++-
 26 files changed, 112 insertions(+), 51 deletions(-)

diff --git a/cpus.c b/cpus.c
index 0ddeeefc14..25df03326b 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1961,15 +1961,17 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
 
+            /* TODO: let the callers handle the error instead of abort() here */
             qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
-                               cpu, QEMU_THREAD_JOINABLE);
+                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
 
         } else {
             /* share a single thread for all cpus with TCG */
             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
+            /* TODO: let the callers handle the error instead of abort() here */
             qemu_thread_create(cpu->thread, thread_name,
                                qemu_tcg_rr_cpu_thread_fn,
-                               cpu, QEMU_THREAD_JOINABLE);
+                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
 
             single_tcg_halt_cond = cpu->halt_cond;
             single_tcg_cpu_thread = cpu->thread;
@@ -1997,8 +1999,9 @@ static void qemu_hax_start_vcpu(CPUState *cpu)
 
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX",
              cpu->cpu_index);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(cpu->thread, thread_name, qemu_hax_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
+                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
 #ifdef _WIN32
     cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -2013,8 +2016,9 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
              cpu->cpu_index);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
+                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
 }
 
 static void qemu_hvf_start_vcpu(CPUState *cpu)
@@ -2031,8 +2035,9 @@ static void qemu_hvf_start_vcpu(CPUState *cpu)
 
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
              cpu->cpu_index);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(cpu->thread, thread_name, qemu_hvf_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
+                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
 }
 
 static void qemu_whpx_start_vcpu(CPUState *cpu)
@@ -2044,8 +2049,9 @@ static void qemu_whpx_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX",
              cpu->cpu_index);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(cpu->thread, thread_name, qemu_whpx_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE);
+                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
 #ifdef _WIN32
     cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -2060,8 +2066,9 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
              cpu->cpu_index);
-    qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
-                       QEMU_THREAD_JOINABLE);
+    /* TODO: let the further caller handle the error instead of abort() here */
+    qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn,
+                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
 }
 
 void qemu_init_vcpu(CPUState *cpu)
diff --git a/dump.c b/dump.c
index 4ec94c5e25..c35d6ddd22 100644
--- a/dump.c
+++ b/dump.c
@@ -2020,8 +2020,9 @@ void qmp_dump_guest_memory(bool paging, const char *file,
     if (detach_p) {
         /* detached dump */
         s->detached = true;
+        /* TODO: let the further caller handle the error instead of abort() */
         qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
-                           s, QEMU_THREAD_DETACHED);
+                           s, QEMU_THREAD_DETACHED, &error_abort);
     } else {
         /* sync dump */
         dump_process(s, errp);
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index cdcf550dd7..3f4ba7ded3 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -28,6 +28,7 @@
 #include "hw/pci/msi.h"
 #include "qemu/timer.h"
 #include "qemu/main-loop.h" /* iothread mutex */
+#include "qapi/error.h"
 #include "qapi/visitor.h"
 
 #define TYPE_PCI_EDU_DEVICE "edu"
@@ -355,8 +356,9 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp)
 
     qemu_mutex_init(&edu->thr_mutex);
     qemu_cond_init(&edu->thr_cond);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
-                       edu, QEMU_THREAD_JOINABLE);
+                       edu, QEMU_THREAD_JOINABLE, &error_abort);
 
     memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
                     "edu-mmio", 1 * MiB);
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index ae913d070f..5bc2cf4540 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -538,8 +538,10 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
     pending->shift = shift;
     pending->ret = H_HARDWARE;
 
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
-                       hpt_prepare_thread, pending, QEMU_THREAD_DETACHED);
+                       hpt_prepare_thread, pending,
+                       QEMU_THREAD_DETACHED, &error_abort);
 
     spapr->pending_hpt = pending;
 
diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
index c28bfbd44d..25e0c7d0f5 100644
--- a/hw/rdma/rdma_backend.c
+++ b/hw/rdma/rdma_backend.c
@@ -263,7 +263,8 @@ static void start_comp_thread(RdmaBackendDev *backend_dev)
              ibv_get_device_name(backend_dev->ib_dev));
     backend_dev->comp_thread.run = true;
     qemu_thread_create(&backend_dev->comp_thread.thread, thread_name,
-                       comp_handler_thread, backend_dev, QEMU_THREAD_DETACHED);
+                       comp_handler_thread, backend_dev,
+                       QEMU_THREAD_DETACHED, &error_abort);
 }
 
 void rdma_backend_register_comp_handler(void (*handler)(void *ctx,
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index 25976ed84f..f8ff7ff4a3 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -544,10 +544,11 @@ static void emulated_realize(CCIDCardState *base, Error **errp)
         error_setg(errp, "%s: failed to initialize vcard", TYPE_EMULATED_CCID);
         goto out2;
     }
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
-                       card, QEMU_THREAD_JOINABLE);
+                       card, QEMU_THREAD_JOINABLE, &error_abort);
     qemu_thread_create(&card->apdu_thread_id, "ccid/apdu", handle_apdu_thread,
-                       card, QEMU_THREAD_JOINABLE);
+                       card, QEMU_THREAD_JOINABLE, &error_abort);
 
 out2:
     clean_event_notifier(card);
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 55d83a907c..12291f4ccd 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -152,9 +152,9 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread, const char *name,
+bool qemu_thread_create(QemuThread *thread, const char *name,
                         void *(*start_routine)(void *),
-                        void *arg, int mode);
+                        void *arg, int mode, Error **errp);
 void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
diff --git a/io/task.c b/io/task.c
index 2886a2c1bc..6d3a18ab80 100644
--- a/io/task.c
+++ b/io/task.c
@@ -149,7 +149,8 @@ void qio_task_run_in_thread(QIOTask *task,
                        "io-task-worker",
                        qio_task_thread_worker,
                        data,
-                       QEMU_THREAD_DETACHED);
+                       QEMU_THREAD_DETACHED,
+                       &error_abort);
 }
 
 
diff --git a/iothread.c b/iothread.c
index 2fb1cdf55d..8e8aa01999 100644
--- a/iothread.c
+++ b/iothread.c
@@ -178,8 +178,9 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
      */
     name = object_get_canonical_path_component(OBJECT(obj));
     thread_name = g_strdup_printf("IO %s", name);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&iothread->thread, thread_name, iothread_run,
-                       iothread, QEMU_THREAD_JOINABLE);
+                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
     g_free(thread_name);
     g_free(name);
 
diff --git a/migration/migration.c b/migration/migration.c
index ded151b1bf..ea5839ff0d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -447,8 +447,10 @@ static void process_incoming_migration_co(void *opaque)
             goto fail;
         }
 
+        /* TODO: let the further caller handle the error instead of abort() */
         qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
-             colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
+                           colo_process_incoming_thread, mis,
+                           QEMU_THREAD_JOINABLE, &error_abort);
         mis->have_colo_incoming_thread = true;
         qemu_coroutine_yield();
 
@@ -2358,8 +2360,10 @@ static int open_return_path_on_source(MigrationState *ms,
         return 0;
     }
 
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&ms->rp_state.rp_thread, "return path",
-                       source_return_path_thread, ms, QEMU_THREAD_JOINABLE);
+                       source_return_path_thread, ms,
+                       QEMU_THREAD_JOINABLE, &error_abort);
 
     trace_open_return_path_on_source_continue();
 
@@ -3189,8 +3193,9 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         migrate_fd_cleanup(s);
         return;
     }
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
-                       QEMU_THREAD_JOINABLE);
+                       QEMU_THREAD_JOINABLE, &error_abort);
     s->migration_thread_running = true;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index fa09dba534..221ea24919 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1109,8 +1109,10 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
     }
 
     qemu_sem_init(&mis->fault_thread_sem, 0);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&mis->fault_thread, "postcopy/fault",
-                       postcopy_ram_fault_thread, mis, QEMU_THREAD_JOINABLE);
+                       postcopy_ram_fault_thread, mis,
+                       QEMU_THREAD_JOINABLE, &error_abort);
     qemu_sem_wait(&mis->fault_thread_sem);
     qemu_sem_destroy(&mis->fault_thread_sem);
     mis->have_fault_thread = true;
diff --git a/migration/ram.c b/migration/ram.c
index 435a8d2946..eed1daf302 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -502,9 +502,10 @@ static int compress_threads_save_setup(void)
         comp_param[i].quit = false;
         qemu_mutex_init(&comp_param[i].mutex);
         qemu_cond_init(&comp_param[i].cond);
+        /* TODO: let the further caller handle the error instead of abort() */
         qemu_thread_create(compress_threads + i, "compress",
                            do_data_compress, comp_param + i,
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
     }
     return 0;
 
@@ -1075,8 +1076,9 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
         p->c = QIO_CHANNEL(sioc);
         qio_channel_set_delay(p->c, false);
         p->running = true;
+        /* TODO: let the further caller handle the error instead of abort() */
         qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
 
         atomic_inc(&multifd_send_state->count);
     }
@@ -1355,8 +1357,9 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
     p->num_packets = 1;
 
     p->running = true;
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
-                       QEMU_THREAD_JOINABLE);
+                       QEMU_THREAD_JOINABLE, &error_abort);
     atomic_inc(&multifd_recv_state->count);
     return atomic_read(&multifd_recv_state->count) ==
            migrate_multifd_channels();
@@ -3643,9 +3646,10 @@ static int compress_threads_load_setup(QEMUFile *f)
         qemu_cond_init(&decomp_param[i].cond);
         decomp_param[i].done = true;
         decomp_param[i].quit = false;
+        /* TODO: let the further caller handle the error instead of abort() */
         qemu_thread_create(decompress_threads + i, "decompress",
                            do_data_decompress, decomp_param + i,
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
     }
     return 0;
 exit:
diff --git a/migration/savevm.c b/migration/savevm.c
index d784e8aa40..46ce7af239 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1747,9 +1747,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
     mis->have_listen_thread = true;
     /* Start up the listening thread and wait for it to signal ready */
     qemu_sem_init(&mis->listen_thread_sem, 0);
+    /* TODO: let the further caller handle the error instead of abort() here */
     qemu_thread_create(&mis->listen_thread, "postcopy/listen",
                        postcopy_ram_listen_thread, NULL,
-                       QEMU_THREAD_DETACHED);
+                       QEMU_THREAD_DETACHED, &error_abort);
     qemu_sem_wait(&mis->listen_thread_sem);
     qemu_sem_destroy(&mis->listen_thread_sem);
 
diff --git a/tests/atomic_add-bench.c b/tests/atomic_add-bench.c
index 2f6c72f63a..338b9563e3 100644
--- a/tests/atomic_add-bench.c
+++ b/tests/atomic_add-bench.c
@@ -2,6 +2,7 @@
 #include "qemu/thread.h"
 #include "qemu/host-utils.h"
 #include "qemu/processor.h"
+#include "qapi/error.h"
 
 struct thread_info {
     uint64_t r;
@@ -110,7 +111,7 @@ static void create_threads(void)
 
         info->r = (i + 1) ^ time(NULL);
         qemu_thread_create(&threads[i], NULL, thread_func, info,
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
     }
 }
 
diff --git a/tests/iothread.c b/tests/iothread.c
index 777d9eea46..f4ad992e61 100644
--- a/tests/iothread.c
+++ b/tests/iothread.c
@@ -73,7 +73,7 @@ IOThread *iothread_new(void)
     qemu_mutex_init(&iothread->init_done_lock);
     qemu_cond_init(&iothread->init_done_cond);
     qemu_thread_create(&iothread->thread, NULL, iothread_run,
-                       iothread, QEMU_THREAD_JOINABLE);
+                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
 
     /* Wait for initialization to complete */
     qemu_mutex_lock(&iothread->init_done_lock);
diff --git a/tests/qht-bench.c b/tests/qht-bench.c
index ab4e708180..85e1af2f84 100644
--- a/tests/qht-bench.c
+++ b/tests/qht-bench.c
@@ -10,6 +10,7 @@
 #include "qemu/qht.h"
 #include "qemu/rcu.h"
 #include "qemu/xxhash.h"
+#include "qapi/error.h"
 
 struct thread_stats {
     size_t rd;
@@ -248,7 +249,7 @@ th_create_n(QemuThread **threads, struct thread_info **infos, const char *name,
         prepare_thread_info(&info[i], offset + i);
         info[i].func = func;
         qemu_thread_create(&th[i], name, thread_func, &info[i],
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
     }
 }
 
diff --git a/tests/rcutorture.c b/tests/rcutorture.c
index 49311c82ea..0e799ff256 100644
--- a/tests/rcutorture.c
+++ b/tests/rcutorture.c
@@ -64,6 +64,7 @@
 #include "qemu/atomic.h"
 #include "qemu/rcu.h"
 #include "qemu/thread.h"
+#include "qapi/error.h"
 
 long long n_reads = 0LL;
 long n_updates = 0L;
@@ -90,7 +91,7 @@ static void create_thread(void *(*func)(void *))
         exit(-1);
     }
     qemu_thread_create(&threads[n_threads], "test", func, &data[n_threads],
-                       QEMU_THREAD_JOINABLE);
+                       QEMU_THREAD_JOINABLE, &error_abort);
     n_threads++;
 }
 
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 86fb73b3d5..b3ac261724 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -154,7 +154,7 @@ static void test_acquire(void)
 
     qemu_thread_create(&thread, "test_acquire_thread",
                        test_acquire_thread,
-                       &data, QEMU_THREAD_JOINABLE);
+                       &data, QEMU_THREAD_JOINABLE, &error_abort);
 
     /* Block in aio_poll(), let other thread kick us and acquire context */
     aio_context_acquire(ctx);
diff --git a/tests/test-rcu-list.c b/tests/test-rcu-list.c
index 2e6f70bd59..0f7da81291 100644
--- a/tests/test-rcu-list.c
+++ b/tests/test-rcu-list.c
@@ -25,6 +25,7 @@
 #include "qemu/rcu.h"
 #include "qemu/thread.h"
 #include "qemu/rcu_queue.h"
+#include "qapi/error.h"
 
 /*
  * Test variables.
@@ -68,7 +69,7 @@ static void create_thread(void *(*func)(void *))
         exit(-1);
     }
     qemu_thread_create(&threads[n_threads], "test", func, &data[n_threads],
-                       QEMU_THREAD_JOINABLE);
+                       QEMU_THREAD_JOINABLE, &error_abort);
     n_threads++;
 }
 
diff --git a/ui/vnc-jobs.c b/ui/vnc-jobs.c
index 929391f85d..5712f1f501 100644
--- a/ui/vnc-jobs.c
+++ b/ui/vnc-jobs.c
@@ -31,6 +31,7 @@
 #include "vnc-jobs.h"
 #include "qemu/sockets.h"
 #include "qemu/main-loop.h"
+#include "qapi/error.h"
 #include "block/aio.h"
 
 /*
@@ -339,7 +340,8 @@ void vnc_start_worker_thread(void)
         return ;
 
     q = vnc_queue_init();
-    qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread, q,
-                       QEMU_THREAD_DETACHED);
+    /* TODO: let the further caller handle the error instead of abort() here */
+    qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
+                       q, QEMU_THREAD_DETACHED, &error_abort);
     queue = q; /* Set global queue */
 }
diff --git a/util/compatfd.c b/util/compatfd.c
index 980bd33e52..c3d8448264 100644
--- a/util/compatfd.c
+++ b/util/compatfd.c
@@ -16,6 +16,7 @@
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/thread.h"
+#include "qapi/error.h"
 
 #include <sys/syscall.h>
 
@@ -88,8 +89,9 @@ static int qemu_signalfd_compat(const sigset_t *mask)
     memcpy(&info->mask, mask, sizeof(*mask));
     info->fd = fds[1];
 
-    qemu_thread_create(&thread, "signalfd_compat", sigwait_compat, info,
-                       QEMU_THREAD_DETACHED);
+    /* TODO: let the further caller handle the error instead of abort() here */
+    qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
+                       info, QEMU_THREAD_DETACHED, &error_abort);
 
     return fds[0];
 }
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index c1bee2a581..251e2f1aea 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -448,9 +448,10 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         memset_thread[i].numpages = (i == (memset_num_threads - 1)) ?
                                     numpages : numpages_per_thread;
         memset_thread[i].hpagesize = hpagesize;
+        /* TODO: let the callers handle the error instead of abort() here */
         qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
                            do_touch_pages, &memset_thread[i],
-                           QEMU_THREAD_JOINABLE);
+                           QEMU_THREAD_JOINABLE, &error_abort);
         addr += size_per_thread;
         numpages -= numpages_per_thread;
     }
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 865e476df5..39834b0551 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -15,6 +15,7 @@
 #include "qemu/atomic.h"
 #include "qemu/notify.h"
 #include "qemu-thread-common.h"
+#include "qapi/error.h"
 
 static bool name_threads;
 
@@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
     return r;
 }
 
-void qemu_thread_create(QemuThread *thread, const char *name,
-                       void *(*start_routine)(void*),
-                       void *arg, int mode)
+/*
+ * Return true on success, false on failure.
+ * On failure, store the error in @errp and set errno.
+ */
+bool qemu_thread_create(QemuThread *thread, const char *name,
+                        void *(*start_routine)(void *),
+                        void *arg, int mode, Error **errp)
 {
     sigset_t set, oldset;
     int err;
@@ -511,7 +516,9 @@ void qemu_thread_create(QemuThread *thread, const char *name,
 
     err = pthread_attr_init(&attr);
     if (err) {
-        error_exit(err, __func__);
+        errno = err;
+        error_setg_errno(errp, errno, "pthread_attr_init failed");
+        return false;
     }
 
     if (mode == QEMU_THREAD_DETACHED) {
@@ -529,13 +536,19 @@ void qemu_thread_create(QemuThread *thread, const char *name,
 
     err = pthread_create(&thread->thread, &attr,
                          qemu_thread_start, qemu_thread_args);
-
-    if (err)
-        error_exit(err, __func__);
+    if (err) {
+        errno = err;
+        error_setg_errno(errp, errno, "pthread_create failed");
+        pthread_attr_destroy(&attr);
+        g_free(qemu_thread_args->name);
+        g_free(qemu_thread_args);
+        return false;
+    }
 
     pthread_sigmask(SIG_SETMASK, &oldset, NULL);
 
     pthread_attr_destroy(&attr);
+    return true;
 }
 
 void qemu_thread_get_self(QemuThread *thread)
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 4a363ca675..57b1143e97 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -20,6 +20,7 @@
 #include "qemu/thread.h"
 #include "qemu/notify.h"
 #include "qemu-thread-common.h"
+#include "qapi/error.h"
 #include <process.h>
 
 static bool name_threads;
@@ -388,9 +389,9 @@ void *qemu_thread_join(QemuThread *thread)
     return ret;
 }
 
-void qemu_thread_create(QemuThread *thread, const char *name,
-                       void *(*start_routine)(void *),
-                       void *arg, int mode)
+bool qemu_thread_create(QemuThread *thread, const char *name,
+                        void *(*start_routine)(void *),
+                        void *arg, int mode, Error **errp)
 {
     HANDLE hThread;
     struct QemuThreadData *data;
@@ -409,10 +410,17 @@ void qemu_thread_create(QemuThread *thread, const char *name,
     hThread = (HANDLE) _beginthreadex(NULL, 0, win32_start_routine,
                                       data, 0, &thread->tid);
     if (!hThread) {
-        error_exit(GetLastError(), __func__);
+        if (data->mode != QEMU_THREAD_DETACHED) {
+            DeleteCriticalSection(&data->cs);
+        }
+        error_setg_errno(errp, errno,
+                         "failed to create win32_start_routine");
+        g_free(data);
+        return false;
     }
     CloseHandle(hThread);
     thread->data = data;
+    return true;
 }
 
 void qemu_thread_get_self(QemuThread *thread)
diff --git a/util/rcu.c b/util/rcu.c
index 5676c22bd1..145dcdb0c6 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -32,6 +32,7 @@
 #include "qemu/atomic.h"
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
+#include "qapi/error.h"
 #if defined(CONFIG_MALLOC_TRIM)
 #include <malloc.h>
 #endif
@@ -325,7 +326,7 @@ static void rcu_init_complete(void)
      * must have been quiescent even after forking, just recreate it.
      */
     qemu_thread_create(&thread, "call_rcu", call_rcu_thread,
-                       NULL, QEMU_THREAD_DETACHED);
+                       NULL, QEMU_THREAD_DETACHED, &error_abort);
 
     rcu_register_thread();
 }
diff --git a/util/thread-pool.c b/util/thread-pool.c
index 610646d131..ad0f980783 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -22,6 +22,7 @@
 #include "trace.h"
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"
+#include "qapi/error.h"
 
 static void do_spawn_thread(ThreadPool *pool);
 
@@ -132,7 +133,8 @@ static void do_spawn_thread(ThreadPool *pool)
     pool->new_threads--;
     pool->pending_threads++;
 
-    qemu_thread_create(&t, "worker", worker_thread, pool, QEMU_THREAD_DETACHED);
+    qemu_thread_create(&t, "worker", worker_thread, pool,
+                       QEMU_THREAD_DETACHED, &error_abort);
 }
 
 static void spawn_thread_bh_fn(void *opaque)
-- 
2.13.7
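
For readers following the POSIX hunk above: pthread_create() returns 0 on
success or an error number on failure, rather than reporting through errno,
which is why the patch copies the return value into errno before building
the error. A minimal standalone sketch of that convention follows. The
names sketch_thread_create() and sketch_worker() are hypothetical and are
not part of QEMU; the error text goes into a plain buffer here instead of
an Error object.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: create a joinable thread, report failure via bool. */
static bool sketch_thread_create(pthread_t *thread,
                                 void *(*start_routine)(void *), void *arg,
                                 char *errbuf, size_t errbuf_len)
{
    /* pthread_create() returns 0 or an error number as its result. */
    int err = pthread_create(thread, NULL, start_routine, arg);

    if (err) {
        errno = err;    /* expose the code to errno-based callers */
        snprintf(errbuf, errbuf_len, "pthread_create failed: %s",
                 strerror(err));
        return false;
    }
    return true;
}

/* Trivial thread body used to exercise the helper. */
static void *sketch_worker(void *arg)
{
    *(int *)arg = 42;
    return arg;
}
```

A caller would test the bool result first and only read errbuf (or errno)
when the helper returned false.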

* [Qemu-devel] [PATCH for-4.0 v9 07/16] qemu_thread: supplement error handling for qemu_X_start_vcpu
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (5 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory Fei Li
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Paolo Bonzini

The callers of qemu_init_vcpu() already take an Error **errp to report
errors. Therefore, add an Error **errp parameter to qemu_init_vcpu() and
to all qemu_X_start_vcpu() functions it calls, so that the error is
propagated and the callers further up can check it.

In addition, make qemu_init_vcpu() return a bool so that its callers
know whether it succeeded.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 accel/tcg/user-exec-stub.c      |  3 +-
 cpus.c                          | 86 ++++++++++++++++++++++++-----------------
 include/qom/cpu.h               |  2 +-
 target/alpha/cpu.c              |  4 +-
 target/arm/cpu.c                |  4 +-
 target/cris/cpu.c               |  4 +-
 target/hppa/cpu.c               |  4 +-
 target/i386/cpu.c               |  4 +-
 target/lm32/cpu.c               |  4 +-
 target/m68k/cpu.c               |  4 +-
 target/microblaze/cpu.c         |  4 +-
 target/mips/cpu.c               |  4 +-
 target/moxie/cpu.c              |  4 +-
 target/nios2/cpu.c              |  4 +-
 target/openrisc/cpu.c           |  4 +-
 target/ppc/translate_init.inc.c |  4 +-
 target/riscv/cpu.c              |  4 +-
 target/s390x/cpu.c              |  4 +-
 target/sh4/cpu.c                |  4 +-
 target/sparc/cpu.c              |  4 +-
 target/tilegx/cpu.c             |  4 +-
 target/tricore/cpu.c            |  4 +-
 target/unicore32/cpu.c          |  4 +-
 target/xtensa/cpu.c             |  4 +-
 24 files changed, 117 insertions(+), 58 deletions(-)
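
The propagation pattern described in the commit message can be sketched
outside QEMU as follows. This is only an illustration under stated
assumptions: the Error struct and the sketch_* functions are simplified,
hypothetical stand-ins written for this note, not QEMU's real error API,
and the "fail" flag merely simulates a thread-creation failure.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for QEMU's Error object (hypothetical layout). */
typedef struct Error {
    char msg[128];
} Error;

/* Stand-in for error_setg(): allocate an error and store it in *errp. */
static void sketch_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;                 /* caller is not interested in the error */
    }
    err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Stand-in for error_propagate(): hand local_err over, or free it. */
static void sketch_error_propagate(Error **dst, Error *local_err)
{
    if (dst != NULL && *dst == NULL) {
        *dst = local_err;
    } else {
        free(local_err);
    }
}

/* Hypothetical accelerator hook; "fail" simulates pthread_create failing. */
static void sketch_start_vcpu(bool fail, Error **errp)
{
    if (fail) {
        sketch_error_set(errp, "pthread_create failed");
    }
}

/* Same shape as the reworked qemu_init_vcpu(): bool result plus errp. */
static bool sketch_init_vcpu(bool fail, Error **errp)
{
    Error *local_err = NULL;

    sketch_start_vcpu(fail, &local_err);
    if (local_err) {
        sketch_error_propagate(errp, local_err);
        return false;
    }
    return true;
}

/* Same shape as a target's realize function: stop early on failure. */
static bool sketch_realize(bool fail, Error **errp)
{
    if (!sketch_init_vcpu(fail, errp)) {
        return false;
    }
    return true;
}
```

Using a local error in sketch_init_vcpu() lets it detect failure even when
the caller passes NULL for errp, which is the reason the patch introduces
local_err in qemu_init_vcpu() instead of inspecting *errp directly.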

diff --git a/accel/tcg/user-exec-stub.c b/accel/tcg/user-exec-stub.c
index a32b4496af..f8c38a375c 100644
--- a/accel/tcg/user-exec-stub.c
+++ b/accel/tcg/user-exec-stub.c
@@ -10,8 +10,9 @@ void cpu_resume(CPUState *cpu)
 {
 }
 
-void qemu_init_vcpu(CPUState *cpu)
+bool qemu_init_vcpu(CPUState *cpu, Error **errp)
 {
+    return true;
 }
 
 /* User mode emulation does not support record/replay yet.  */
diff --git a/cpus.c b/cpus.c
index 25df03326b..e8450e518a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1931,7 +1931,7 @@ void cpu_remove_sync(CPUState *cpu)
 /* For temporary buffers for forming a name */
 #define VCPU_THREAD_NAME_SIZE 16
 
-static void qemu_tcg_init_vcpu(CPUState *cpu)
+static void qemu_tcg_init_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
     static QemuCond *single_tcg_halt_cond;
@@ -1961,17 +1961,20 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
                  cpu->cpu_index);
 
-            /* TODO: let the callers handle the error instead of abort() here */
-            qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
-                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
+            if (!qemu_thread_create(cpu->thread, thread_name,
+                                    qemu_tcg_cpu_thread_fn, cpu,
+                                    QEMU_THREAD_JOINABLE, errp)) {
+                return;
+            }
 
         } else {
             /* share a single thread for all cpus with TCG */
             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
-            /* TODO: let the callers handle the error instead of abort() here */
-            qemu_thread_create(cpu->thread, thread_name,
-                               qemu_tcg_rr_cpu_thread_fn,
-                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
+            if (!qemu_thread_create(cpu->thread, thread_name,
+                                    qemu_tcg_rr_cpu_thread_fn, cpu,
+                                    QEMU_THREAD_JOINABLE, errp)) {
+                return;
+            }
 
             single_tcg_halt_cond = cpu->halt_cond;
             single_tcg_cpu_thread = cpu->thread;
@@ -1989,7 +1992,7 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
     }
 }
 
-static void qemu_hax_start_vcpu(CPUState *cpu)
+static void qemu_hax_start_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
 
@@ -1999,15 +2002,16 @@ static void qemu_hax_start_vcpu(CPUState *cpu)
 
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HAX",
              cpu->cpu_index);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(cpu->thread, thread_name, qemu_hax_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(cpu->thread, thread_name, qemu_hax_cpu_thread_fn,
+                            cpu, QEMU_THREAD_JOINABLE, errp)) {
+        return;
+    }
 #ifdef _WIN32
     cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
 }
 
-static void qemu_kvm_start_vcpu(CPUState *cpu)
+static void qemu_kvm_start_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
 
@@ -2016,12 +2020,13 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
              cpu->cpu_index);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+                            cpu, QEMU_THREAD_JOINABLE, errp)) {
+        /* keep 'if' here in case there is further error handling logic */
+    }
 }
 
-static void qemu_hvf_start_vcpu(CPUState *cpu)
+static void qemu_hvf_start_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
 
@@ -2035,12 +2040,13 @@ static void qemu_hvf_start_vcpu(CPUState *cpu)
 
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/HVF",
              cpu->cpu_index);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(cpu->thread, thread_name, qemu_hvf_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(cpu->thread, thread_name, qemu_hvf_cpu_thread_fn,
+                            cpu, QEMU_THREAD_JOINABLE, errp)) {
+        /* keep 'if' here in case there is further error handling logic */
+    }
 }
 
-static void qemu_whpx_start_vcpu(CPUState *cpu)
+static void qemu_whpx_start_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
 
@@ -2049,15 +2055,16 @@ static void qemu_whpx_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/WHPX",
              cpu->cpu_index);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(cpu->thread, thread_name, qemu_whpx_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(cpu->thread, thread_name, qemu_whpx_cpu_thread_fn,
+                            cpu, QEMU_THREAD_JOINABLE, errp)) {
+        return;
+    }
 #ifdef _WIN32
     cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
 }
 
-static void qemu_dummy_start_vcpu(CPUState *cpu)
+static void qemu_dummy_start_vcpu(CPUState *cpu, Error **errp)
 {
     char thread_name[VCPU_THREAD_NAME_SIZE];
 
@@ -2066,16 +2073,18 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
     qemu_cond_init(cpu->halt_cond);
     snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
              cpu->cpu_index);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn,
-                       cpu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn,
+                            cpu, QEMU_THREAD_JOINABLE, errp)) {
+        /* keep 'if' here in case there is further error handling logic */
+    }
 }
 
-void qemu_init_vcpu(CPUState *cpu)
+bool qemu_init_vcpu(CPUState *cpu, Error **errp)
 {
     cpu->nr_cores = smp_cores;
     cpu->nr_threads = smp_threads;
     cpu->stopped = true;
+    Error *local_err = NULL;
 
     if (!cpu->as) {
         /* If the target cpu hasn't set up any address spaces itself,
@@ -2086,22 +2095,29 @@ void qemu_init_vcpu(CPUState *cpu)
     }
 
     if (kvm_enabled()) {
-        qemu_kvm_start_vcpu(cpu);
+        qemu_kvm_start_vcpu(cpu, &local_err);
     } else if (hax_enabled()) {
-        qemu_hax_start_vcpu(cpu);
+        qemu_hax_start_vcpu(cpu, &local_err);
     } else if (hvf_enabled()) {
-        qemu_hvf_start_vcpu(cpu);
+        qemu_hvf_start_vcpu(cpu, &local_err);
     } else if (tcg_enabled()) {
-        qemu_tcg_init_vcpu(cpu);
+        qemu_tcg_init_vcpu(cpu, &local_err);
     } else if (whpx_enabled()) {
-        qemu_whpx_start_vcpu(cpu);
+        qemu_whpx_start_vcpu(cpu, &local_err);
     } else {
-        qemu_dummy_start_vcpu(cpu);
+        qemu_dummy_start_vcpu(cpu, &local_err);
+    }
+
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return false;
     }
 
     while (!cpu->created) {
         qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
     }
+
+    return true;
 }
 
 void cpu_stop_current(void)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 1396f53e5b..696c3608d2 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -1006,7 +1006,7 @@ void end_exclusive(void);
  *
  * Initializes a vCPU.
  */
-void qemu_init_vcpu(CPUState *cpu);
+bool qemu_init_vcpu(CPUState *cpu, Error **errp);
 
 #define SSTEP_ENABLE  0x1  /* Enable simulated HW single stepping */
 #define SSTEP_NOIRQ   0x2  /* Do not use IRQ while single stepping */
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index a953897fcc..bf3c34516d 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -66,7 +66,9 @@ static void alpha_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     acc->parent_realize(dev, errp);
 }
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index c8505eaaee..ef745ff4c7 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1125,7 +1125,9 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 #endif
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     cpu_reset(cs);
 
     acc->parent_realize(dev, errp);
diff --git a/target/cris/cpu.c b/target/cris/cpu.c
index a23aba2688..ec92d69781 100644
--- a/target/cris/cpu.c
+++ b/target/cris/cpu.c
@@ -140,7 +140,9 @@ static void cris_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     ccc->parent_realize(dev, errp);
 }
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
index 00bf444620..08f600ced9 100644
--- a/target/hppa/cpu.c
+++ b/target/hppa/cpu.c
@@ -98,7 +98,9 @@ static void hppa_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     acc->parent_realize(dev, errp);
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 677a3bd5fb..21fc8122eb 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5279,7 +5279,9 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 #endif
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     /*
      * Most Intel and certain AMD CPUs support hyperthreading. Even though QEMU
diff --git a/target/lm32/cpu.c b/target/lm32/cpu.c
index b7499cb627..d50b1e4a43 100644
--- a/target/lm32/cpu.c
+++ b/target/lm32/cpu.c
@@ -139,7 +139,9 @@ static void lm32_cpu_realizefn(DeviceState *dev, Error **errp)
 
     cpu_reset(cs);
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     lcc->parent_realize(dev, errp);
 }
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 582e3a73b3..4ab53f2d58 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -231,7 +231,9 @@ static void m68k_cpu_realizefn(DeviceState *dev, Error **errp)
     m68k_cpu_init_gdb(cpu);
 
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     mcc->parent_realize(dev, errp);
 }
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 9b546a2c18..3906c864a3 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -161,7 +161,9 @@ static void mb_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     env->pvr.regs[0] = PVR0_USE_EXC_MASK \
                        | PVR0_USE_ICACHE_MASK \
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index e217fb3e36..1e5aa69c57 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -145,7 +145,9 @@ static void mips_cpu_realizefn(DeviceState *dev, Error **errp)
     cpu_mips_realize_env(&cpu->env);
 
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     mcc->parent_realize(dev, errp);
 }
diff --git a/target/moxie/cpu.c b/target/moxie/cpu.c
index 8d67eb6727..8581a6d922 100644
--- a/target/moxie/cpu.c
+++ b/target/moxie/cpu.c
@@ -66,7 +66,9 @@ static void moxie_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     cpu_reset(cs);
 
     mcc->parent_realize(dev, errp);
diff --git a/target/nios2/cpu.c b/target/nios2/cpu.c
index fbfaa2ce26..5c7b4b486e 100644
--- a/target/nios2/cpu.c
+++ b/target/nios2/cpu.c
@@ -94,7 +94,9 @@ static void nios2_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     cpu_reset(cs);
 
     ncc->parent_realize(dev, errp);
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index fb7cb5c507..a6dcdb9df9 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -83,7 +83,9 @@ static void openrisc_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     cpu_reset(cs);
 
     occ->parent_realize(dev, errp);
diff --git a/target/ppc/translate_init.inc.c b/target/ppc/translate_init.inc.c
index 03f1d34a97..2fba2e4741 100644
--- a/target/ppc/translate_init.inc.c
+++ b/target/ppc/translate_init.inc.c
@@ -9711,7 +9711,9 @@ static void ppc_cpu_realize(DeviceState *dev, Error **errp)
                                  32, "power-vsx.xml", 0);
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        goto unrealize;
+    }
 
     pcc->parent_realize(dev, errp);
 
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index a025a0a3ba..9829fd9bc4 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -305,7 +305,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
     cpu_reset(cs);
 
     mcc->parent_realize(dev, errp);
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 18ba7f85a5..2a3eac9761 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -222,7 +222,9 @@ static void s390_cpu_realizefn(DeviceState *dev, Error **errp)
     qemu_register_reset(s390_cpu_machine_reset_cb, cpu);
 #endif
     s390_cpu_gdb_init(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     /*
      * KVM requires the initial CPU reset ioctl to be executed on the target
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index b9f393b7c7..d32ef2e1cb 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -196,7 +196,9 @@ static void superh_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     scc->parent_realize(dev, errp);
 }
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 0f090ece54..9c22f6a7df 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -773,7 +773,9 @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     scc->parent_realize(dev, errp);
 }
diff --git a/target/tilegx/cpu.c b/target/tilegx/cpu.c
index bfe9be59b5..234148fabd 100644
--- a/target/tilegx/cpu.c
+++ b/target/tilegx/cpu.c
@@ -92,7 +92,9 @@ static void tilegx_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     tcc->parent_realize(dev, errp);
 }
diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
index 2edaef1aef..5482d6ea3f 100644
--- a/target/tricore/cpu.c
+++ b/target/tricore/cpu.c
@@ -96,7 +96,9 @@ static void tricore_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, TRICORE_FEATURE_13);
     }
     cpu_reset(cs);
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     tcc->parent_realize(dev, errp);
 }
diff --git a/target/unicore32/cpu.c b/target/unicore32/cpu.c
index 2b49d1ca40..0c737c3187 100644
--- a/target/unicore32/cpu.c
+++ b/target/unicore32/cpu.c
@@ -96,7 +96,9 @@ static void uc32_cpu_realizefn(DeviceState *dev, Error **errp)
         return;
     }
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     ucc->parent_realize(dev, errp);
 }
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index a54dbe4260..d2351c9b20 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -131,7 +131,9 @@ static void xtensa_cpu_realizefn(DeviceState *dev, Error **errp)
 
     cs->gdb_num_regs = xcc->config->gdb_regmap.num_regs;
 
-    qemu_init_vcpu(cs);
+    if (!qemu_init_vcpu(cs, errp)) {
+        return;
+    }
 
     xcc->parent_realize(dev, errp);
 }
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (6 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 07/16] qemu_thread: supplement error handling for qemu_X_start_vcpu Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:21   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize Fei Li
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei
  Cc: lifei1214, Markus Armbruster, Marc-André Lureau

Use the existing errp to propagate the error instead of the
temporary &error_abort.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 dump.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/dump.c b/dump.c
index c35d6ddd22..ef5ea324fa 100644
--- a/dump.c
+++ b/dump.c
@@ -2020,9 +2020,10 @@ void qmp_dump_guest_memory(bool paging, const char *file,
     if (detach_p) {
         /* detached dump */
         s->detached = true;
-        /* TODO: let the further caller handle the error instead of abort() */
-        qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
-                           s, QEMU_THREAD_DETACHED, &error_abort);
+        if (!qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
+                                s, QEMU_THREAD_DETACHED, errp)) {
+            /* keep 'if' here in case there is further error handling logic */
+        }
     } else {
         /* sync dump */
         dump_process(s, errp);
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (7 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:29   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare Fei Li
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, Jiri Slaby

Use the existing errp to propagate the error instead of the
temporary &error_abort.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Fei Li <fli@suse.com>
---
 hw/misc/edu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index 3f4ba7ded3..011fe6e0b7 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp)
 
     qemu_mutex_init(&edu->thr_mutex);
     qemu_cond_init(&edu->thr_cond);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
-                       edu, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
+                            edu, QEMU_THREAD_JOINABLE, errp)) {
+        return;
+    }
 
     memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
                     "edu-mmio", 1 * MiB);
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (8 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-02  2:36   ` David Gibson
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize Fei Li
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, David Gibson

Add a local_err to hold the error, report it, and return H_RESOURCE
instead of aborting via the temporary &error_abort.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Fei Li <fli@suse.com>
---
 hw/ppc/spapr_hcall.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 5bc2cf4540..7c16ade04a 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
     sPAPRPendingHPT *pending = spapr->pending_hpt;
     uint64_t current_ram_size;
     int rc;
+    Error *local_err = NULL;
 
     if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
         return H_AUTHORITY;
@@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
     pending->shift = shift;
     pending->ret = H_HARDWARE;
 
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
-                       hpt_prepare_thread, pending,
-                       QEMU_THREAD_DETACHED, &error_abort);
+    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
+                            hpt_prepare_thread, pending,
+                            QEMU_THREAD_DETACHED, &local_err)) {
+        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
+        g_free(pending);
+        return H_RESOURCE;
+    }
 
     spapr->pending_hpt = pending;
 
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (9 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:31   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat Fei Li
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, Gerd Hoffmann

Use the existing errp to propagate the error and do the
corresponding cleanup, replacing the temporary &error_abort.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 hw/usb/ccid-card-emulated.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index f8ff7ff4a3..9245b4fcad 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -32,7 +32,6 @@
 #include "qemu/thread.h"
 #include "qemu/main-loop.h"
 #include "ccid.h"
-#include "qapi/error.h"
 
 #define DPRINTF(card, lvl, fmt, ...) \
 do {\
@@ -544,11 +543,15 @@ static void emulated_realize(CCIDCardState *base, Error **errp)
         error_setg(errp, "%s: failed to initialize vcard", TYPE_EMULATED_CCID);
         goto out2;
     }
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
-                       card, QEMU_THREAD_JOINABLE, &error_abort);
-    qemu_thread_create(&card->apdu_thread_id, "ccid/apdu", handle_apdu_thread,
-                       card, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
+                            card, QEMU_THREAD_JOINABLE, errp)) {
+        goto out2;
+    }
+    if (!qemu_thread_create(&card->apdu_thread_id, "ccid/apdu",
+                            handle_apdu_thread, card,
+                            QEMU_THREAD_JOINABLE, errp)) {
+        goto out2;
+    }
 
 out2:
     clean_event_notifier(card);
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (10 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:50   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration Fei Li
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, Eric Blake

For iothread_complete: use the existing errp to propagate the error
and do the corresponding cleanup, replacing the temporary
&error_abort.

For qemu_signalfd_compat: add a local_err to hold the error, release
the pipe fds and info, and return -1, replacing the temporary
&error_abort.
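
As a rough illustration of the qemu_signalfd_compat case (not part of
the patch), the sketch below shows the shape of the cleanup: everything
acquired before the thread-creation call is released before returning
-1. The names demo_info, demo_thread_create and demo_signalfd are
hypothetical stand-ins, and the stub only simulates success or failure
instead of starting a real thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-thread context; the real code carries more state. */
struct demo_info {
    int fd;
};

/* Hypothetical stand-in for a thread-creation call that may fail. */
static bool demo_thread_create(struct demo_info *info, bool should_fail)
{
    (void)info;
    return !should_fail;
}

/* Returns the read end of a pipe, or -1 with every resource released. */
static int demo_signalfd(bool should_fail)
{
    struct demo_info *info;
    int fds[2];

    info = malloc(sizeof(*info));
    if (info == NULL) {
        return -1;
    }

    if (pipe(fds) == -1) {
        free(info);
        return -1;
    }
    info->fd = fds[1];

    if (!demo_thread_create(info, should_fail)) {
        /* Undo everything acquired so far, in one place. */
        close(fds[0]);
        close(fds[1]);
        free(info);
        return -1;
    }

    /* On success a real worker thread would own info and fds[1]. */
    return fds[0];
}
```

Since the stub never starts a thread, info and the write end of the pipe
stay allocated on the success path in this sketch; a real worker would
take ownership of both.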

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 iothread.c      | 17 +++++++++++------
 util/compatfd.c | 11 ++++++++---
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/iothread.c b/iothread.c
index 8e8aa01999..7335dacf0b 100644
--- a/iothread.c
+++ b/iothread.c
@@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
                                 &local_error);
     if (local_error) {
         error_propagate(errp, local_error);
-        aio_context_unref(iothread->ctx);
-        iothread->ctx = NULL;
-        return;
+        goto fail;
     }
 
     qemu_mutex_init(&iothread->init_done_lock);
@@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
      */
     name = object_get_canonical_path_component(OBJECT(obj));
     thread_name = g_strdup_printf("IO %s", name);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
-                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
+                            iothread, QEMU_THREAD_JOINABLE, errp)) {
+        g_free(thread_name);
+        g_free(name);
+        goto fail;
+    }
     g_free(thread_name);
     g_free(name);
 
@@ -191,6 +192,10 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
                        &iothread->init_done_lock);
     }
     qemu_mutex_unlock(&iothread->init_done_lock);
+    return;
+fail:
+    aio_context_unref(iothread->ctx);
+    iothread->ctx = NULL;
 }
 
 typedef struct {
diff --git a/util/compatfd.c b/util/compatfd.c
index c3d8448264..9cb13381e4 100644
--- a/util/compatfd.c
+++ b/util/compatfd.c
@@ -71,6 +71,7 @@ static int qemu_signalfd_compat(const sigset_t *mask)
     struct sigfd_compat_info *info;
     QemuThread thread;
     int fds[2];
+    Error *local_err = NULL;
 
     info = malloc(sizeof(*info));
     if (info == NULL) {
@@ -89,9 +90,13 @@ static int qemu_signalfd_compat(const sigset_t *mask)
     memcpy(&info->mask, mask, sizeof(*mask));
     info->fd = fds[1];
 
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
-                       info, QEMU_THREAD_DETACHED, &error_abort);
+    if (!qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
+                            info, QEMU_THREAD_DETACHED, &local_err)) {
+        close(fds[0]);
+        close(fds[1]);
+        free(info);
+        return -1;
+    }
 
     return fds[0];
 }
-- 
2.13.7


* [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (11 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-03 12:35   ` Dr. David Alan Gilbert
  2019-01-09 15:26   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread Fei Li
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei
  Cc: lifei1214, Markus Armbruster, Dr . David Alan Gilbert, Peter Xu

Update qemu_thread_create()'s callers by
- setting an error on qemu_thread_create() failure for callers that
  set an error on failure;
- reporting the error and returning failure for callers that return
  an error code on failure;
- reporting the error and setting some state for callers that just
  report errors and choose not to continue on.
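
As a rough illustration (not part of the patch), the three caller
shapes listed above can be sketched with a heavily simplified stand-in
for the Error API. Err, err_set, err_report, thread_create_stub and the
start_* functions are hypothetical names made up for this example; they
are not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for an error object. */
typedef struct Err {
    char msg[128];
} Err;

/* Store a new error in *errp unless the caller passed NULL. */
static void err_set(Err **errp, const char *msg)
{
    if (errp) {
        assert(*errp == NULL);
        *errp = calloc(1, sizeof(Err));
        if (*errp == NULL) {
            abort();
        }
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
    }
}

/* Print the error with a prefix, then free it (consumes err). */
static void err_report(Err *err, const char *prefix)
{
    fprintf(stderr, "%s%s\n", prefix, err->msg);
    free(err);
}

/* Stand-in for a thread-creation helper: false plus an error on failure. */
static bool thread_create_stub(bool should_fail, Err **errp)
{
    if (should_fail) {
        err_set(errp, "resource temporarily unavailable");
        return false;
    }
    return true;
}

/* Shape 1: the caller sets an error on failure, so pass errp through. */
static bool start_with_errp(bool should_fail, Err **errp)
{
    return thread_create_stub(should_fail, errp);
}

/* Shape 2: the caller returns an error code, so report and return -1. */
static int start_with_code(bool should_fail)
{
    Err *local_err = NULL;

    if (!thread_create_stub(should_fail, &local_err)) {
        err_report(local_err, "failed to create worker thread: ");
        return -1;
    }
    return 0;
}

typedef enum { STATE_RUNNING, STATE_FAILED } DemoState;

/* Shape 3: the caller returns void, so report and record a failed state. */
static void start_with_state(bool should_fail, DemoState *state)
{
    Err *local_err = NULL;

    if (!thread_create_stub(should_fail, &local_err)) {
        err_report(local_err, "failed to create worker thread: ");
        *state = STATE_FAILED;
        return;
    }
    *state = STATE_RUNNING;
}
```

In the first shape the error object is handed to the caller, who must
report or free it; in the other two it is consumed locally by the
report call, so nothing is left to free.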

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 migration/migration.c    | 33 ++++++++++++++++++++++-----------
 migration/postcopy-ram.c | 16 ++++++++++++----
 migration/ram.c          | 44 ++++++++++++++++++++++++++++++--------------
 migration/savevm.c       | 12 ++++++++----
 4 files changed, 72 insertions(+), 33 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ea5839ff0d..9654bde101 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -447,10 +447,13 @@ static void process_incoming_migration_co(void *opaque)
             goto fail;
         }
 
-        /* TODO: let the further caller handle the error instead of abort() */
-        qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
-                           colo_process_incoming_thread, mis,
-                           QEMU_THREAD_JOINABLE, &error_abort);
+        if (!qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
+                                colo_process_incoming_thread, mis,
+                                QEMU_THREAD_JOINABLE, &local_err)) {
+            error_reportf_err(local_err, "failed to create "
+                              "colo_process_incoming_thread: ");
+            goto fail;
+        }
         mis->have_colo_incoming_thread = true;
         qemu_coroutine_yield();
 
@@ -2347,6 +2350,7 @@ out:
 static int open_return_path_on_source(MigrationState *ms,
                                       bool create_thread)
 {
+    Error *local_err = NULL;
 
     ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
     if (!ms->rp_state.from_dst_file) {
@@ -2360,10 +2364,13 @@ static int open_return_path_on_source(MigrationState *ms,
         return 0;
     }
 
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&ms->rp_state.rp_thread, "return path",
-                       source_return_path_thread, ms,
-                       QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&ms->rp_state.rp_thread, "return path",
+                            source_return_path_thread, ms,
+                            QEMU_THREAD_JOINABLE, &local_err)) {
+        error_reportf_err(local_err,
+                          "failed to create source_return_path_thread: ");
+        return -1;
+    }
 
     trace_open_return_path_on_source_continue();
 
@@ -3193,9 +3200,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         migrate_fd_cleanup(s);
         return;
     }
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
-                       QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
+                            QEMU_THREAD_JOINABLE, &error_in)) {
+        error_reportf_err(error_in, "failed to create migration_thread: ");
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+        migrate_fd_cleanup(s);
+        return;
+    }
     s->migration_thread_running = true;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 221ea24919..80bfa9c4a2 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1083,6 +1083,8 @@ retry:
 
 int postcopy_ram_enable_notify(MigrationIncomingState *mis)
 {
+    Error *local_err = NULL;
+
     /* Open the fd for the kernel to give us userfaults */
     mis->userfault_fd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
     if (mis->userfault_fd == -1) {
@@ -1109,10 +1111,16 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
     }
 
     qemu_sem_init(&mis->fault_thread_sem, 0);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&mis->fault_thread, "postcopy/fault",
-                       postcopy_ram_fault_thread, mis,
-                       QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&mis->fault_thread, "postcopy/fault",
+                            postcopy_ram_fault_thread, mis,
+                            QEMU_THREAD_JOINABLE, &local_err)) {
+        error_reportf_err(local_err,
+                          "failed to create postcopy_ram_fault_thread: ");
+        close(mis->userfault_event_fd);
+        close(mis->userfault_fd);
+        qemu_sem_destroy(&mis->fault_thread_sem);
+        return -1;
+    }
     qemu_sem_wait(&mis->fault_thread_sem);
     qemu_sem_destroy(&mis->fault_thread_sem);
     mis->have_fault_thread = true;
diff --git a/migration/ram.c b/migration/ram.c
index eed1daf302..1e24a78eaa 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -473,6 +473,7 @@ static void compress_threads_save_cleanup(void)
 static int compress_threads_save_setup(void)
 {
     int i, thread_count;
+    Error *local_err = NULL;
 
     if (!migrate_use_compression()) {
         return 0;
@@ -502,10 +503,12 @@ static int compress_threads_save_setup(void)
         comp_param[i].quit = false;
         qemu_mutex_init(&comp_param[i].mutex);
         qemu_cond_init(&comp_param[i].cond);
-        /* TODO: let the further caller handle the error instead of abort() */
-        qemu_thread_create(compress_threads + i, "compress",
-                           do_data_compress, comp_param + i,
-                           QEMU_THREAD_JOINABLE, &error_abort);
+        if (!qemu_thread_create(compress_threads + i, "compress",
+                                do_data_compress, comp_param + i,
+                                QEMU_THREAD_JOINABLE, &local_err)) {
+            error_reportf_err(local_err, "failed to create do_data_compress: ");
+            goto exit;
+        }
     }
     return 0;
 
@@ -1076,9 +1079,14 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
         p->c = QIO_CHANNEL(sioc);
         qio_channel_set_delay(p->c, false);
         p->running = true;
-        /* TODO: let the further caller handle the error instead of abort() */
-        qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
-                           QEMU_THREAD_JOINABLE, &error_abort);
+        if (!qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
+                                QEMU_THREAD_JOINABLE, &local_err)) {
+            migrate_set_error(migrate_get_current(), local_err);
+            error_reportf_err(local_err,
+                              "failed to create multifd_send_thread: ");
+            multifd_save_cleanup();
+            return;
+        }
 
         atomic_inc(&multifd_send_state->count);
     }
@@ -1357,9 +1365,13 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
     p->num_packets = 1;
 
     p->running = true;
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
-                       QEMU_THREAD_JOINABLE, &error_abort);
+    if (!qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
+                            QEMU_THREAD_JOINABLE, &local_err)) {
+        error_propagate_prepend(errp, local_err,
+                                "failed to create multifd_recv_thread: ");
+        multifd_recv_terminate_threads(local_err);
+        return false;
+    }
     atomic_inc(&multifd_recv_state->count);
     return atomic_read(&multifd_recv_state->count) ==
            migrate_multifd_channels();
@@ -3625,6 +3637,7 @@ static void compress_threads_load_cleanup(void)
 static int compress_threads_load_setup(QEMUFile *f)
 {
     int i, thread_count;
+    Error *local_err = NULL;
 
     if (!migrate_use_compression()) {
         return 0;
@@ -3646,10 +3659,13 @@ static int compress_threads_load_setup(QEMUFile *f)
         qemu_cond_init(&decomp_param[i].cond);
         decomp_param[i].done = true;
         decomp_param[i].quit = false;
-        /* TODO: let the further caller handle the error instead of abort() */
-        qemu_thread_create(decompress_threads + i, "decompress",
-                           do_data_decompress, decomp_param + i,
-                           QEMU_THREAD_JOINABLE, &error_abort);
+        if (!qemu_thread_create(decompress_threads + i, "decompress",
+                                do_data_decompress, decomp_param + i,
+                                QEMU_THREAD_JOINABLE, &local_err)) {
+            error_reportf_err(local_err,
+                              "failed to create do_data_decompress: ");
+            goto exit;
+        }
     }
     return 0;
 exit:
diff --git a/migration/savevm.c b/migration/savevm.c
index 46ce7af239..b8bdcde5d8 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1747,10 +1747,14 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
     mis->have_listen_thread = true;
     /* Start up the listening thread and wait for it to signal ready */
     qemu_sem_init(&mis->listen_thread_sem, 0);
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&mis->listen_thread, "postcopy/listen",
-                       postcopy_ram_listen_thread, NULL,
-                       QEMU_THREAD_DETACHED, &error_abort);
+    if (!qemu_thread_create(&mis->listen_thread, "postcopy/listen",
+                            postcopy_ram_listen_thread, NULL,
+                            QEMU_THREAD_DETACHED, &local_err)) {
+        error_reportf_err(local_err,
+                          "failed to create postcopy_ram_listen_thread: ");
+        qemu_sem_destroy(&mis->listen_thread_sem);
+        return -1;
+    }
     qemu_sem_wait(&mis->listen_thread_sem);
     qemu_sem_destroy(&mis->listen_thread_sem);
 
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (12 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:54   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages Fei Li
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster, Gerd Hoffmann

Supplement the error handling for vnc_start_worker_thread: add an
Error parameter so that it can propagate the error to its caller to
handle on failure, and make it return a bool to indicate whether it
succeeded.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 ui/vnc-jobs.c | 17 +++++++++++------
 ui/vnc-jobs.h |  2 +-
 ui/vnc.c      |  4 +++-
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/ui/vnc-jobs.c b/ui/vnc-jobs.c
index 5712f1f501..35a652d1fd 100644
--- a/ui/vnc-jobs.c
+++ b/ui/vnc-jobs.c
@@ -332,16 +332,21 @@ static bool vnc_worker_thread_running(void)
     return queue; /* Check global queue */
 }
 
-void vnc_start_worker_thread(void)
+bool vnc_start_worker_thread(Error **errp)
 {
     VncJobQueue *q;
 
-    if (vnc_worker_thread_running())
-        return ;
+    if (vnc_worker_thread_running()) {
+        goto out;
+    }
 
     q = vnc_queue_init();
-    /* TODO: let the further caller handle the error instead of abort() here */
-    qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
-                       q, QEMU_THREAD_DETACHED, &error_abort);
+    if (!qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
+                            q, QEMU_THREAD_DETACHED, errp)) {
+        vnc_queue_clear(q);
+        return false;
+    }
     queue = q; /* Set global queue */
+out:
+    return true;
 }
diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
index 59f66bcc35..14640593db 100644
--- a/ui/vnc-jobs.h
+++ b/ui/vnc-jobs.h
@@ -37,7 +37,7 @@ void vnc_job_push(VncJob *job);
 void vnc_jobs_join(VncState *vs);
 
 void vnc_jobs_consume_buffer(VncState *vs);
-void vnc_start_worker_thread(void);
+bool vnc_start_worker_thread(Error **errp);
 
 /* Locks */
 static inline int vnc_trylock_display(VncDisplay *vd)
diff --git a/ui/vnc.c b/ui/vnc.c
index 0c1b477425..0ffe9e6a5d 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -3236,7 +3236,9 @@ void vnc_display_init(const char *id, Error **errp)
     vd->connections_limit = 32;
 
     qemu_mutex_init(&vd->mutex);
-    vnc_start_worker_thread();
+    if (!vnc_start_worker_thread(errp)) {
+        return;
+    }
 
     vd->dcl.ops = &dcl_ops;
     register_displaychangelistener(&vd->dcl);
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (13 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 18:13   ` Markus Armbruster
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault Fei Li
  2019-01-02 13:46 ` [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle no-reply
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Markus Armbruster

Supplement the error handling for touch_all_pages: add an Error
parameter so that it can propagate the error to its caller, which
then handles the failure.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 util/oslib-posix.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 251e2f1aea..afc1d99093 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -431,15 +431,17 @@ static inline int get_memset_num_threads(int smp_cpus)
 }
 
 static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
-                            int smp_cpus)
+                            int smp_cpus, Error **errp)
 {
     size_t numpages_per_thread;
     size_t size_per_thread;
     char *addr = area;
     int i = 0;
+    int started_thread = 0;
 
     memset_thread_failed = false;
     memset_num_threads = get_memset_num_threads(smp_cpus);
+    started_thread = memset_num_threads;
     memset_thread = g_new0(MemsetThread, memset_num_threads);
     numpages_per_thread = (numpages / memset_num_threads);
     size_per_thread = (hpagesize * numpages_per_thread);
@@ -448,14 +450,18 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
         memset_thread[i].numpages = (i == (memset_num_threads - 1)) ?
                                     numpages : numpages_per_thread;
         memset_thread[i].hpagesize = hpagesize;
-        /* TODO: let the callers handle the error instead of abort() here */
-        qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
-                           do_touch_pages, &memset_thread[i],
-                           QEMU_THREAD_JOINABLE, &error_abort);
+        if (!qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
+                                do_touch_pages, &memset_thread[i],
+                                QEMU_THREAD_JOINABLE, errp)) {
+            memset_thread_failed = true;
+            started_thread = i;
+            goto out;
+        }
         addr += size_per_thread;
         numpages -= numpages_per_thread;
     }
-    for (i = 0; i < memset_num_threads; i++) {
+out:
+    for (i = 0; i < started_thread; i++) {
         qemu_thread_join(&memset_thread[i].pgthread);
     }
     g_free(memset_thread);
@@ -471,6 +477,7 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
     struct sigaction act, oldact;
     size_t hpagesize = qemu_fd_getpagesize(fd);
     size_t numpages = DIV_ROUND_UP(memory, hpagesize);
+    Error *local_err = NULL;
 
     memset(&act, 0, sizeof(act));
     act.sa_handler = &sigbus_handler;
@@ -484,9 +491,9 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
     }
 
     /* touch pages simultaneously */
-    if (touch_all_pages(area, hpagesize, numpages, smp_cpus)) {
-        error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
-            "pages available to allocate guest RAM");
+    if (touch_all_pages(area, hpagesize, numpages, smp_cpus, &local_err)) {
+        error_propagate_prepend(errp, local_err, "os_mem_prealloc: Insufficient"
+            " free host memory pages available to allocate guest RAM: ");
     }
 
     ret = sigaction(SIGBUS, &oldact, NULL);
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 74+ messages in thread
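
The partial-start cleanup in the touch_all_pages patch above (record how
many threads were actually started, then join only those) can be
sketched in isolation. This is a minimal sketch under stated
assumptions, not the QEMU code: Worker, worker_start(), worker_join()
and start_all() are hypothetical stand-ins, and no real threads are
created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread handle; not a QEMU type. */
typedef struct {
    bool started;
    bool joined;
} Worker;

/* Hypothetical starter: fails for the worker whose index equals fail_at. */
static bool worker_start(Worker *w, int index, int fail_at)
{
    if (index == fail_at) {
        return false;
    }
    w->started = true;
    return true;
}

/* Hypothetical join: only valid for a worker that was started. */
static void worker_join(Worker *w)
{
    assert(w->started);
    w->joined = true;
}

/*
 * Start 'count' workers. If one fails to start, remember how many were
 * started before it and join only those. Returns the number joined and
 * sets *failed when any start failed.
 */
static int start_all(int count, int fail_at, bool *failed)
{
    Worker *workers;
    int started = count;
    int joined = 0;
    int i;

    assert(count > 0);
    *failed = false;
    workers = calloc((size_t)count, sizeof(*workers));
    if (!workers) {
        *failed = true;
        return 0;
    }

    for (i = 0; i < count; i++) {
        if (!worker_start(&workers[i], i, fail_at)) {
            *failed = true;
            started = i;   /* workers[i] and later were never started */
            break;
        }
    }

    for (i = 0; i < started; i++) {
        worker_join(&workers[i]);
        joined++;
    }
    free(workers);
    return joined;
}
```

Bounding the join loop by the number of started workers is what keeps
the cleanup from touching handles that were never initialised.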

* [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (14 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages Fei Li
@ 2018-12-25 14:04 ` Fei Li
  2019-01-07 17:55   ` Markus Armbruster
  2019-01-02 13:46 ` [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle no-reply
  16 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2018-12-25 14:04 UTC (permalink / raw)
  To: qemu-devel, shirley17fei; +Cc: lifei1214, Stefan Weil

To avoid a segmentation fault in qemu_thread_join(), return
immediately when the QemuThread *thread was never successfully
created, in both qemu-thread-posix.c and qemu-thread-win32.c.

Cc: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 util/qemu-thread-posix.c | 3 +++
 util/qemu-thread-win32.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 39834b0551..3548935dac 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
     int err;
     void *ret;
 
+    if (!thread->thread) {
+        return NULL;
+    }
     err = pthread_join(thread->thread, &ret);
     if (err) {
         error_exit(err, __func__);
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 57b1143e97..ca4d5329e3 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -367,7 +367,7 @@ void *qemu_thread_join(QemuThread *thread)
     HANDLE handle;
 
     data = thread->data;
-    if (data->mode == QEMU_THREAD_DETACHED) {
+    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {
         return NULL;
     }
 
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 74+ messages in thread
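
The guard added in the qemu_thread_join patch above can be illustrated
with a small standalone sketch. It is loosely modelled on the
"data == NULL" check in the win32 hunk and assumes the handle was
zero-initialised before creation was attempted. WorkerHandle,
WorkerData, worker_create() and worker_join() are hypothetical names,
and no real thread is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread bookkeeping; not a QEMU type. */
typedef struct {
    void *result;
} WorkerData;

/* Hypothetical handle: 'data' stays NULL if creation failed. */
typedef struct {
    WorkerData *data;
} WorkerHandle;

/* Pretend to create a worker; on failure the handle is left untouched. */
static bool worker_create(WorkerHandle *h, bool simulate_failure, void *result)
{
    WorkerData *data;

    if (simulate_failure) {
        return false;
    }
    data = malloc(sizeof(*data));
    if (!data) {
        return false;
    }
    data->result = result;
    h->data = data;
    return true;
}

/* Join that tolerates a handle whose creation never succeeded. */
static void *worker_join(WorkerHandle *h)
{
    WorkerData *data = h->data;
    void *ret;

    if (data == NULL) {
        return NULL;   /* nothing was created, so nothing to join */
    }
    ret = data->result;
    free(data);
    h->data = NULL;
    return ret;
}
```

The early return only helps when the handle is known to be zeroed; a
handle containing stack garbage would still be dereferenced.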

* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare Fei Li
@ 2019-01-02  2:36   ` David Gibson
  2019-01-02  6:44     ` 李菲
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2019-01-02  2:36 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Markus Armbruster

On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
> Add a local_err to hold the error, and return the corresponding
> error code to replace the temporary &error_abort.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Fei Li <fli@suse.com>

This looks like a good change, but it no longer applies due to a
change in the qemu_thread_create() signature.

> ---
>  hw/ppc/spapr_hcall.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 5bc2cf4540..7c16ade04a 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>      sPAPRPendingHPT *pending = spapr->pending_hpt;
>      uint64_t current_ram_size;
>      int rc;
> +    Error *local_err = NULL;
>  
>      if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
>          return H_AUTHORITY;
> @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>      pending->shift = shift;
>      pending->ret = H_HARDWARE;
>  
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> -                       hpt_prepare_thread, pending,
> -                       QEMU_THREAD_DETACHED, &error_abort);
> +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> +                            hpt_prepare_thread, pending,
> +                            QEMU_THREAD_DETACHED, &local_err)) {
> +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
> +        g_free(pending);
> +        return H_RESOURCE;

I also think H_HARDWARE would be a better choice here.  Although the
failure is due to a resource constraint, it's not because the guest
asked for too much, just because the host is in dire straits.  From
the guest's point of view it's basically a hardware failure.
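
A small sketch of that distinction, with hypothetical names
(FailureCause, hcall_status_for) and return-code values that should be
treated as illustrative rather than as checked against the platform
headers:

```c
#include <assert.h>

/* Illustrative values only; verify against the real headers before use. */
enum {
    H_SUCCESS = 0,
    H_HARDWARE = -1,
    H_RESOURCE = -16
};

/* Hypothetical classification of why preparing the resize failed. */
typedef enum {
    FAIL_NONE,
    FAIL_GUEST_ASKED_TOO_MUCH,  /* the request itself cannot be satisfied */
    FAIL_HOST_CANNOT_START      /* e.g. the host could not create a thread */
} FailureCause;

/* Map the cause of a failure to the status reported to the guest. */
static long hcall_status_for(FailureCause cause)
{
    switch (cause) {
    case FAIL_NONE:
        return H_SUCCESS;
    case FAIL_GUEST_ASKED_TOO_MUCH:
        return H_RESOURCE;
    case FAIL_HOST_CANNOT_START:
        /* Not the guest's fault: report it like a hardware failure. */
        return H_HARDWARE;
    }
    assert(!"unreachable FailureCause");
    return H_HARDWARE;
}
```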

> +    }
>  
>      spapr->pending_hpt = pending;
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2019-01-02  2:36   ` David Gibson
@ 2019-01-02  6:44     ` 李菲
  2019-01-03  3:43       ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: 李菲 @ 2019-01-02  6:44 UTC (permalink / raw)
  To: David Gibson, Fei Li; +Cc: qemu-devel, shirley17fei, Markus Armbruster


On 2019/1/2 at 10:36 AM, David Gibson wrote:
> On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
>> Add a local_err to hold the error, and return the corresponding
>> error code to replace the temporary &error_abort.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Signed-off-by: Fei Li <fli@suse.com>
> This looks like a good change, but it no longer applies due to a
> change in the qemu_thread_create() signature.
Sorry, I am not sure I understand. Do you mean that using
&error_abort is more suitable here than reporting local_err and
returning a failure reason?
>
>> ---
>>   hw/ppc/spapr_hcall.c | 12 ++++++++----
>>   1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>> index 5bc2cf4540..7c16ade04a 100644
>> --- a/hw/ppc/spapr_hcall.c
>> +++ b/hw/ppc/spapr_hcall.c
>> @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>       sPAPRPendingHPT *pending = spapr->pending_hpt;
>>       uint64_t current_ram_size;
>>       int rc;
>> +    Error *local_err = NULL;
>>   
>>       if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
>>           return H_AUTHORITY;
>> @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>       pending->shift = shift;
>>       pending->ret = H_HARDWARE;
>>   
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>> -                       hpt_prepare_thread, pending,
>> -                       QEMU_THREAD_DETACHED, &error_abort);
>> +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>> +                            hpt_prepare_thread, pending,
>> +                            QEMU_THREAD_DETACHED, &local_err)) {
>> +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
>> +        g_free(pending);
>> +        return H_RESOURCE;
> I also think H_HARDWARE would be a better choice here.  Although the
> failure is due to a resource constraint, it's not because the guest
> asked for too much, just because the host is in dire straits.  From
> the guest's point of view it's basically a hardware failure.

Ok, thanks. Will use H_HARDWARE instead.

Have a nice day, thanks for the review. :)
Fei
>
>> +    }
>>   
>>       spapr->pending_hpt = pending;
>>   

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle
  2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
                   ` (15 preceding siblings ...)
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault Fei Li
@ 2019-01-02 13:46 ` no-reply
  2019-01-07 12:44   ` Fei Li
  16 siblings, 1 reply; 74+ messages in thread
From: no-reply @ 2019-01-02 13:46 UTC (permalink / raw)
  To: fli; +Cc: fam, qemu-devel, shirley17fei, lifei1214

Patchew URL: https://patchew.org/QEMU/20181225140449.15786-1-fli@suse.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-quick@centos7 SHOW_ENV=1 J=8
=== TEST SCRIPT END ===

libpmem support   no
libudev           no

WARNING: Use of SDL 1.2 is deprecated and will be removed in
WARNING: future releases. Please switch to using SDL 2.0

NOTE: cross-compilers enabled:  'cc'
  GEN     x86_64-softmmu/config-devices.mak.tmp
---
  CC      hw/usb/host-stub.o
  CC      hw/virtio/virtio-bus.o
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: In function 'init_event_notifier':
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:404:9: error: implicit declaration of function 'error_setg' [-Werror=implicit-function-declaration]
         error_setg(errp, "ccid-card-emul: event notifier creation failed");
         ^
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:404:9: error: nested extern declaration of 'error_setg' [-Werror=nested-externs]
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: In function 'emulated_realize':
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:513:13: error: implicit declaration of function 'error_append_hint' [-Werror=implicit-function-declaration]
             error_append_hint(errp, "%s\n", ptable->name);
             ^
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:513:13: error: nested extern declaration of 'error_append_hint' [-Werror=nested-externs]
/tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: At top level:
cc1: error: unrecognized command line option "-Wno-format-truncation" [-Werror]
cc1: all warnings being treated as errors
make: *** [hw/usb/ccid-card-emulated.o] Error 1
make: *** Waiting for unfinished jobs....


The full log is available at
http://patchew.org/logs/20181225140449.15786-1-fli@suse.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2019-01-02  6:44     ` 李菲
@ 2019-01-03  3:43       ` David Gibson
  2019-01-03 13:41         ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2019-01-03  3:43 UTC (permalink / raw)
  To: 李菲; +Cc: Fei Li, qemu-devel, shirley17fei, Markus Armbruster

On Wed, Jan 02, 2019 at 02:44:17PM +0800, 李菲 wrote:
> 
> On 2019/1/2 at 10:36 AM, David Gibson wrote:
> > On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
> > > Add a local_err to hold the error, and return the corresponding
> > > error code to replace the temporary &error_abort.
> > > 
> > > Cc: Markus Armbruster <armbru@redhat.com>
> > > Cc: David Gibson <david@gibson.dropbear.id.au>
> > > Signed-off-by: Fei Li <fli@suse.com>
> > This looks like a good change, but it no longer applies due to a
> > change in the qemu_thread_create() signature.
> Sorry, I am not sure I understand. Do you mean that using
> &error_abort is more suitable here than reporting local_err and
> returning a failure reason?

No, I just mean that context has been altered by a global change and
the patch will need to be fixed up to cope with that.

> > 
> > > ---
> > >   hw/ppc/spapr_hcall.c | 12 ++++++++----
> > >   1 file changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > > index 5bc2cf4540..7c16ade04a 100644
> > > --- a/hw/ppc/spapr_hcall.c
> > > +++ b/hw/ppc/spapr_hcall.c
> > > @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
> > >       sPAPRPendingHPT *pending = spapr->pending_hpt;
> > >       uint64_t current_ram_size;
> > >       int rc;
> > > +    Error *local_err = NULL;
> > >       if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
> > >           return H_AUTHORITY;
> > > @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
> > >       pending->shift = shift;
> > >       pending->ret = H_HARDWARE;
> > > -    /* TODO: let the further caller handle the error instead of abort() here */
> > > -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> > > -                       hpt_prepare_thread, pending,
> > > -                       QEMU_THREAD_DETACHED, &error_abort);
> > > +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> > > +                            hpt_prepare_thread, pending,
> > > +                            QEMU_THREAD_DETACHED, &local_err)) {
> > > +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
> > > +        g_free(pending);
> > > +        return H_RESOURCE;
> > I also think H_HARDWARE would be a better choice here.  Although the
> > failure is due to a resource constraint, it's not because the guest
> > asked for too much, just because the host is in dire straits.  From
> > the guest's point of view it's basically a hardware failure.
> 
> Ok, thanks. Will use H_HARDWARE instead.
> 
> Have a nice day, thanks for the review. :)
> Fei
> > 
> > > +    }
> > >       spapr->pending_hpt = pending;
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co Fei Li
@ 2019-01-03 11:25   ` Dr. David Alan Gilbert
  2019-01-03 13:27     ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-03 11:25 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Markus Armbruster, Peter Xu

* Fei Li (fli@suse.com) wrote:
> In the current code, if process_incoming_migration_co() fails we do
> the same error handling: set the error state, close the source file,
> do the cleanup for multifd, and then exit(EXIT_FAILURE). To make the
> code clearer, add a "goto fail" to unify the error handling.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  migration/migration.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 5d322eb9d6..ded151b1bf 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -438,15 +438,13 @@ static void process_incoming_migration_co(void *opaque)
>          /* Make sure all file formats flush their mutable metadata */
>          bdrv_invalidate_cache_all(&local_err);
>          if (local_err) {
> -            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> -                    MIGRATION_STATUS_FAILED);
>              error_report_err(local_err);
> -            exit(EXIT_FAILURE);
> +            goto fail;
>          }
>  
>          if (colo_init_ram_cache() < 0) {
>              error_report("Init ram cache failed");
> -            exit(EXIT_FAILURE);
> +            goto fail;
>          }
>  
>          qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
> @@ -461,20 +459,22 @@ static void process_incoming_migration_co(void *opaque)
>      }
>  
>      if (ret < 0) {
> -        Error *local_err = NULL;
> -
> -        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
>          error_report("load of migration failed: %s", strerror(-ret));
> -        qemu_fclose(mis->from_src_file);
> -        if (multifd_load_cleanup(&local_err) != 0) {
> -            error_report_err(local_err);
> -        }
> -        exit(EXIT_FAILURE);
> +        goto fail;
>      }
>      mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
>      qemu_bh_schedule(mis->bh);
>      mis->migration_incoming_co = NULL;
> +    return;
> +fail:
> +    local_err = NULL;
> +    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> +                      MIGRATION_STATUS_FAILED);
> +    qemu_fclose(mis->from_src_file);
> +    if (multifd_load_cleanup(&local_err) != 0) {
> +        error_report_err(local_err);
> +    }
> +    exit(EXIT_FAILURE);
>  }
>  
>  static void migration_incoming_setup(QEMUFile *f)

OK, so this is really unifying the normal error case and the two
colo-incoming error cases; so I think that's fine.
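
For readers skimming the thread, the shape of that unification is
roughly the following. It is a minimal standalone sketch, not the
patch itself: IncomingState and process_incoming() are hypothetical,
the three boolean fields stand in for the real cleanup calls, and the
sketch returns -1 where the quoted code calls exit(EXIT_FAILURE).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical record of which cleanup steps ran; not a QEMU type. */
typedef struct {
    bool failed_state_set;
    bool source_closed;
    bool multifd_cleaned;
} IncomingState;

/*
 * Three independent failure points share one cleanup path through a
 * single 'fail' label, instead of each repeating the cleanup.
 */
static int process_incoming(IncomingState *s, bool cache_ok, bool colo_ok,
                            int load_ret)
{
    if (!cache_ok) {
        fprintf(stderr, "cache invalidation failed\n");
        goto fail;
    }
    if (!colo_ok) {
        fprintf(stderr, "Init ram cache failed\n");
        goto fail;
    }
    if (load_ret < 0) {
        fprintf(stderr, "load of migration failed: %d\n", load_ret);
        goto fail;
    }
    return 0;

fail:
    s->failed_state_set = true;   /* stands in for migrate_set_state() */
    s->source_closed = true;      /* stands in for qemu_fclose() */
    s->multifd_cleaned = true;    /* stands in for multifd_load_cleanup() */
    return -1;
}
```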


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> -- 
> 2.13.7
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration Fei Li
@ 2019-01-03 12:35   ` Dr. David Alan Gilbert
  2019-01-03 12:47     ` Fei Li
  2019-01-09 15:26   ` Markus Armbruster
  1 sibling, 1 reply; 74+ messages in thread
From: Dr. David Alan Gilbert @ 2019-01-03 12:35 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Markus Armbruster, Peter Xu

* Fei Li (fli@suse.com) wrote:
> Update qemu_thread_create()'s callers by
> - setting an error on qemu_thread_create() failure for callers that
>   set an error on failure;
> - reporting the error and returning failure for callers that return
>   an error code on failure;
> - reporting the error and setting some state for callers that just
>   report errors and choose not to continue on.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  migration/migration.c    | 33 ++++++++++++++++++++++-----------
>  migration/postcopy-ram.c | 16 ++++++++++++----
>  migration/ram.c          | 44 ++++++++++++++++++++++++++++++--------------
>  migration/savevm.c       | 12 ++++++++----
>  4 files changed, 72 insertions(+), 33 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index ea5839ff0d..9654bde101 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -447,10 +447,13 @@ static void process_incoming_migration_co(void *opaque)
>              goto fail;
>          }
>  
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
> -                           colo_process_incoming_thread, mis,
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
> +                                colo_process_incoming_thread, mis,
> +                                QEMU_THREAD_JOINABLE, &local_err)) {
> +            error_reportf_err(local_err, "failed to create "
> +                              "colo_process_incoming_thread: ");
> +            goto fail;
> +        }
>          mis->have_colo_incoming_thread = true;
>          qemu_coroutine_yield();

OK

> @@ -2347,6 +2350,7 @@ out:
>  static int open_return_path_on_source(MigrationState *ms,
>                                        bool create_thread)
>  {
> +    Error *local_err = NULL;
>  
>      ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
>      if (!ms->rp_state.from_dst_file) {
> @@ -2360,10 +2364,13 @@ static int open_return_path_on_source(MigrationState *ms,
>          return 0;
>      }
>  
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&ms->rp_state.rp_thread, "return path",
> -                       source_return_path_thread, ms,
> -                       QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&ms->rp_state.rp_thread, "return path",
> +                            source_return_path_thread, ms,
> +                            QEMU_THREAD_JOINABLE, &local_err)) {
> +        error_reportf_err(local_err,
> +                          "failed to create source_return_path_thread: ");
> +        return -1;
> +     }

I think that has to close the from_dst_file and set the
from_dst_file=NULL.  That file is owned by the thread, and it's normally
the thread that cleans it up.
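
Something along these lines, as a standalone sketch only: Channel,
ReturnPathState, spawn_fn and open_return_path() are hypothetical
stand-ins for the real types, and thread creation is simulated by a
callback so the failure path can be exercised.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the return-path file. */
typedef struct {
    bool open;
} Channel;

/* Hypothetical stand-in for the return-path state. */
typedef struct {
    Channel *from_dst_file;   /* normally closed by the worker thread */
} ReturnPathState;

/* Hypothetical thread starter; returns false when creation fails. */
typedef bool (*spawn_fn)(ReturnPathState *rp);

static bool spawn_fails(ReturnPathState *rp)
{
    (void)rp;
    return false;
}

static bool spawn_succeeds(ReturnPathState *rp)
{
    (void)rp;
    return true;
}

static void channel_close(Channel *c)
{
    c->open = false;
}

/*
 * If the worker cannot be started, nobody else will close the channel,
 * so close it here and clear the pointer before returning the error.
 */
static int open_return_path(ReturnPathState *rp, Channel *c, spawn_fn spawn)
{
    rp->from_dst_file = c;
    if (!rp->from_dst_file) {
        return -1;
    }
    if (!spawn(rp)) {
        channel_close(rp->from_dst_file);
        rp->from_dst_file = NULL;
        return -1;
    }
    return 0;
}
```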

I think other than that missing close it's fine; and we can do that as a
fix later, so:


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

>  
>      trace_open_return_path_on_source_continue();
>  
> @@ -3193,9 +3200,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>          migrate_fd_cleanup(s);
>          return;
>      }
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> -                       QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
> +                            QEMU_THREAD_JOINABLE, &error_in)) {
> +        error_reportf_err(error_in, "failed to create migration_thread: ");
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_fd_cleanup(s);
> +        return;
> +    }

OK

>      s->migration_thread_running = true;
>  }
>  
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 221ea24919..80bfa9c4a2 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1083,6 +1083,8 @@ retry:
>  
>  int postcopy_ram_enable_notify(MigrationIncomingState *mis)
>  {
> +    Error *local_err = NULL;
> +
>      /* Open the fd for the kernel to give us userfaults */
>      mis->userfault_fd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
>      if (mis->userfault_fd == -1) {
> @@ -1109,10 +1111,16 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
>      }
>  
>      qemu_sem_init(&mis->fault_thread_sem, 0);
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&mis->fault_thread, "postcopy/fault",
> -                       postcopy_ram_fault_thread, mis,
> -                       QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&mis->fault_thread, "postcopy/fault",
> +                            postcopy_ram_fault_thread, mis,
> +                            QEMU_THREAD_JOINABLE, &local_err)) {
> +        error_reportf_err(local_err,
> +                          "failed to create postcopy_ram_fault_thread: ");
> +        close(mis->userfault_event_fd);
> +        close(mis->userfault_fd);
> +        qemu_sem_destroy(&mis->fault_thread_sem);
> +        return -1;
> +    }
>      qemu_sem_wait(&mis->fault_thread_sem);
>      qemu_sem_destroy(&mis->fault_thread_sem);
>      mis->have_fault_thread = true;

OK

> diff --git a/migration/ram.c b/migration/ram.c
> index eed1daf302..1e24a78eaa 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -473,6 +473,7 @@ static void compress_threads_save_cleanup(void)
>  static int compress_threads_save_setup(void)
>  {
>      int i, thread_count;
> +    Error *local_err = NULL;
>  
>      if (!migrate_use_compression()) {
>          return 0;
> @@ -502,10 +503,12 @@ static int compress_threads_save_setup(void)
>          comp_param[i].quit = false;
>          qemu_mutex_init(&comp_param[i].mutex);
>          qemu_cond_init(&comp_param[i].cond);
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(compress_threads + i, "compress",
> -                           do_data_compress, comp_param + i,
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(compress_threads + i, "compress",
> +                                do_data_compress, comp_param + i,
> +                                QEMU_THREAD_JOINABLE, &local_err)) {
> +            error_reportf_err(local_err, "failed to create do_data_compress: ");
> +            goto exit;
> +        }

OK

>      }
>      return 0;
>  
> @@ -1076,9 +1079,14 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
>          p->c = QIO_CHANNEL(sioc);
>          qio_channel_set_delay(p->c, false);
>          p->running = true;
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
> +                                QEMU_THREAD_JOINABLE, &local_err)) {
> +            migrate_set_error(migrate_get_current(), local_err);
> +            error_reportf_err(local_err,
> +                              "failed to create multifd_send_thread: ");
> +            multifd_save_cleanup();
> +            return;
> +        }
>  
>          atomic_inc(&multifd_send_state->count);
>      }
> @@ -1357,9 +1365,13 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>      p->num_packets = 1;
>  
>      p->running = true;
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
> -                       QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
> +                            QEMU_THREAD_JOINABLE, &local_err)) {
> +        error_propagate_prepend(errp, local_err,
> +                                "failed to create multifd_recv_thread: ");
> +        multifd_recv_terminate_threads(local_err);
> +        return false;
> +    }
>      atomic_inc(&multifd_recv_state->count);
>      return atomic_read(&multifd_recv_state->count) ==
>             migrate_multifd_channels();
> @@ -3625,6 +3637,7 @@ static void compress_threads_load_cleanup(void)
>  static int compress_threads_load_setup(QEMUFile *f)
>  {
>      int i, thread_count;
> +    Error *local_err = NULL;
>  
>      if (!migrate_use_compression()) {
>          return 0;
> @@ -3646,10 +3659,13 @@ static int compress_threads_load_setup(QEMUFile *f)
>          qemu_cond_init(&decomp_param[i].cond);
>          decomp_param[i].done = true;
>          decomp_param[i].quit = false;
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(decompress_threads + i, "decompress",
> -                           do_data_decompress, decomp_param + i,
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(decompress_threads + i, "decompress",
> +                                do_data_decompress, decomp_param + i,
> +                                QEMU_THREAD_JOINABLE, &local_err)) {
> +            error_reportf_err(local_err,
> +                              "failed to create do_data_decompress: ");
> +            goto exit;
> +        }
>      }
>      return 0;
>  exit:
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 46ce7af239..b8bdcde5d8 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1747,10 +1747,14 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>      mis->have_listen_thread = true;
>      /* Start up the listening thread and wait for it to signal ready */
>      qemu_sem_init(&mis->listen_thread_sem, 0);
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&mis->listen_thread, "postcopy/listen",
> -                       postcopy_ram_listen_thread, NULL,
> -                       QEMU_THREAD_DETACHED, &error_abort);
> +    if (!qemu_thread_create(&mis->listen_thread, "postcopy/listen",
> +                            postcopy_ram_listen_thread, NULL,
> +                            QEMU_THREAD_DETACHED, &local_err)) {
> +        error_reportf_err(local_err,
> +                          "failed to create postcopy_ram_listen_thread: ");
> +        qemu_sem_destroy(&mis->listen_thread_sem);
> +        return -1;
> +    }
>      qemu_sem_wait(&mis->listen_thread_sem);
>      qemu_sem_destroy(&mis->listen_thread_sem);
>  
> -- 
> 2.13.7
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration
  2019-01-03 12:35   ` Dr. David Alan Gilbert
@ 2019-01-03 12:47     ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-03 12:47 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Fei Li
  Cc: qemu-devel, shirley17fei, Markus Armbruster, Peter Xu


On 2019/1/3 at 8:35 PM, Dr. David Alan Gilbert wrote:
> * Fei Li (fli@suse.com) wrote:
>> Update qemu_thread_create()'s callers by
>> - setting an error on qemu_thread_create() failure for callers that
>>    set an error on failure;
>> - reporting the error and returning failure for callers that return
>>    an error code on failure;
>> - reporting the error and setting some state for callers that just
>>    report errors and choose not to continue on.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>>   migration/migration.c    | 33 ++++++++++++++++++++++-----------
>>   migration/postcopy-ram.c | 16 ++++++++++++----
>>   migration/ram.c          | 44 ++++++++++++++++++++++++++++++--------------
>>   migration/savevm.c       | 12 ++++++++----
>>   4 files changed, 72 insertions(+), 33 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index ea5839ff0d..9654bde101 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -447,10 +447,13 @@ static void process_incoming_migration_co(void *opaque)
>>               goto fail;
>>           }
>>   
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
>> -                           colo_process_incoming_thread, mis,
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
>> +                                colo_process_incoming_thread, mis,
>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>> +            error_reportf_err(local_err, "failed to create "
>> +                              "colo_process_incoming_thread: ");
>> +            goto fail;
>> +        }
>>           mis->have_colo_incoming_thread = true;
>>           qemu_coroutine_yield();
> OK
>
>> @@ -2347,6 +2350,7 @@ out:
>>   static int open_return_path_on_source(MigrationState *ms,
>>                                         bool create_thread)
>>   {
>> +    Error *local_err = NULL;
>>   
>>       ms->rp_state.from_dst_file = qemu_file_get_return_path(ms->to_dst_file);
>>       if (!ms->rp_state.from_dst_file) {
>> @@ -2360,10 +2364,13 @@ static int open_return_path_on_source(MigrationState *ms,
>>           return 0;
>>       }
>>   
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&ms->rp_state.rp_thread, "return path",
>> -                       source_return_path_thread, ms,
>> -                       QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&ms->rp_state.rp_thread, "return path",
>> +                            source_return_path_thread, ms,
>> +                            QEMU_THREAD_JOINABLE, &local_err)) {
>> +        error_reportf_err(local_err,
>> +                          "failed to create source_return_path_thread: ");
>> +        return -1;
>> +     }
> I think that has to close the from_dst_file and set the
> from_dst_file=NULL.  That file is owned by the thread, and it's normally
> the thread that cleans it up.
>
> I think other than that missing close it's fine; and we can do that as a
> fix later, so:
>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Ok, I will add the cleanup for the from_dst_file in the next version.

Thanks for the review! Have a nice day :)

Fei
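
The cleanup discussed above can be sketched in isolation. This is only a
minimal illustration under stated assumptions, not the code from the next
version of the patch: MockFile, MockState, mock_file_open(),
mock_file_close(), mock_thread_create() and mock_open_return_path() are
hypothetical stand-ins for the QEMU types and helpers, and the file is
modelled as a plain heap allocation. It shows only the ownership rule: when
the thread that would normally close the file is never created, the creator
closes it and clears the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real QEMU types. */
typedef struct MockFile { int fd; } MockFile;

typedef struct MockState {
    MockFile *from_dst_file;   /* normally closed by the return-path thread */
} MockState;

static int mock_files_open;    /* number of files that are still open */

static MockFile *mock_file_open(void)
{
    MockFile *f = calloc(1, sizeof(*f));
    assert(f != NULL);
    mock_files_open++;
    return f;
}

static void mock_file_close(MockFile *f)
{
    mock_files_open--;
    free(f);
}

/* Pretend thread creation; 'succeed' lets the caller simulate a failure. */
static bool mock_thread_create(bool succeed)
{
    return succeed;
}

/* Returns 0 on success and -1 on failure. */
static int mock_open_return_path(MockState *ms, bool thread_ok)
{
    ms->from_dst_file = mock_file_open();

    if (!mock_thread_create(thread_ok)) {
        /* No thread exists to clean up, so the creator has to do it. */
        mock_file_close(ms->from_dst_file);
        ms->from_dst_file = NULL;
        return -1;
    }
    return 0;
}
```

Clearing the pointer after the close matters as much as the close itself,
because later cleanup code may test the pointer to decide whether a file
still needs closing.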

>
>>   
>>       trace_open_return_path_on_source_continue();
>>   
>> @@ -3193,9 +3200,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>>           migrate_fd_cleanup(s);
>>           return;
>>       }
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
>> -                       QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&s->thread, "live_migration", migration_thread, s,
>> +                            QEMU_THREAD_JOINABLE, &error_in)) {
>> +        error_reportf_err(error_in, "failed to create migration_thread: ");
>> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
>> +        migrate_fd_cleanup(s);
>> +        return;
>> +    }
> OK
>
>>       s->migration_thread_running = true;
>>   }
>>   
>> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> index 221ea24919..80bfa9c4a2 100644
>> --- a/migration/postcopy-ram.c
>> +++ b/migration/postcopy-ram.c
>> @@ -1083,6 +1083,8 @@ retry:
>>   
>>   int postcopy_ram_enable_notify(MigrationIncomingState *mis)
>>   {
>> +    Error *local_err = NULL;
>> +
>>       /* Open the fd for the kernel to give us userfaults */
>>       mis->userfault_fd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
>>       if (mis->userfault_fd == -1) {
>> @@ -1109,10 +1111,16 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
>>       }
>>   
>>       qemu_sem_init(&mis->fault_thread_sem, 0);
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&mis->fault_thread, "postcopy/fault",
>> -                       postcopy_ram_fault_thread, mis,
>> -                       QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&mis->fault_thread, "postcopy/fault",
>> +                            postcopy_ram_fault_thread, mis,
>> +                            QEMU_THREAD_JOINABLE, &local_err)) {
>> +        error_reportf_err(local_err,
>> +                          "failed to create postcopy_ram_fault_thread: ");
>> +        close(mis->userfault_event_fd);
>> +        close(mis->userfault_fd);
>> +        qemu_sem_destroy(&mis->fault_thread_sem);
>> +        return -1;
>> +    }
>>       qemu_sem_wait(&mis->fault_thread_sem);
>>       qemu_sem_destroy(&mis->fault_thread_sem);
>>       mis->have_fault_thread = true;
> OK
>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index eed1daf302..1e24a78eaa 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -473,6 +473,7 @@ static void compress_threads_save_cleanup(void)
>>   static int compress_threads_save_setup(void)
>>   {
>>       int i, thread_count;
>> +    Error *local_err = NULL;
>>   
>>       if (!migrate_use_compression()) {
>>           return 0;
>> @@ -502,10 +503,12 @@ static int compress_threads_save_setup(void)
>>           comp_param[i].quit = false;
>>           qemu_mutex_init(&comp_param[i].mutex);
>>           qemu_cond_init(&comp_param[i].cond);
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(compress_threads + i, "compress",
>> -                           do_data_compress, comp_param + i,
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(compress_threads + i, "compress",
>> +                                do_data_compress, comp_param + i,
>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>> +            error_reportf_err(local_err, "failed to create do_data_compress: ");
>> +            goto exit;
>> +        }
> OK
>
>>       }
>>       return 0;
>>   
>> @@ -1076,9 +1079,14 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
>>           p->c = QIO_CHANNEL(sioc);
>>           qio_channel_set_delay(p->c, false);
>>           p->running = true;
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>> +            migrate_set_error(migrate_get_current(), local_err);
>> +            error_reportf_err(local_err,
>> +                              "failed to create multifd_send_thread: ");
>> +            multifd_save_cleanup();
>> +            return;
>> +        }
>>   
>>           atomic_inc(&multifd_send_state->count);
>>       }
>> @@ -1357,9 +1365,13 @@ bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
>>       p->num_packets = 1;
>>   
>>       p->running = true;
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
>> -                       QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
>> +                            QEMU_THREAD_JOINABLE, &local_err)) {
>> +        error_propagate_prepend(errp, local_err,
>> +                                "failed to create multifd_recv_thread: ");
>> +        multifd_recv_terminate_threads(local_err);
>> +        return false;
>> +    }
>>       atomic_inc(&multifd_recv_state->count);
>>       return atomic_read(&multifd_recv_state->count) ==
>>              migrate_multifd_channels();
>> @@ -3625,6 +3637,7 @@ static void compress_threads_load_cleanup(void)
>>   static int compress_threads_load_setup(QEMUFile *f)
>>   {
>>       int i, thread_count;
>> +    Error *local_err = NULL;
>>   
>>       if (!migrate_use_compression()) {
>>           return 0;
>> @@ -3646,10 +3659,13 @@ static int compress_threads_load_setup(QEMUFile *f)
>>           qemu_cond_init(&decomp_param[i].cond);
>>           decomp_param[i].done = true;
>>           decomp_param[i].quit = false;
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(decompress_threads + i, "decompress",
>> -                           do_data_decompress, decomp_param + i,
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(decompress_threads + i, "decompress",
>> +                                do_data_decompress, decomp_param + i,
>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>> +            error_reportf_err(local_err,
>> +                              "failed to create do_data_decompress: ");
>> +            goto exit;
>> +        }
>>       }
>>       return 0;
>>   exit:
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 46ce7af239..b8bdcde5d8 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1747,10 +1747,14 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
>>       mis->have_listen_thread = true;
>>       /* Start up the listening thread and wait for it to signal ready */
>>       qemu_sem_init(&mis->listen_thread_sem, 0);
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&mis->listen_thread, "postcopy/listen",
>> -                       postcopy_ram_listen_thread, NULL,
>> -                       QEMU_THREAD_DETACHED, &error_abort);
>> +    if (!qemu_thread_create(&mis->listen_thread, "postcopy/listen",
>> +                            postcopy_ram_listen_thread, NULL,
>> +                            QEMU_THREAD_DETACHED, &local_err)) {
>> +        error_reportf_err(local_err,
>> +                          "failed to create postcopy_ram_listen_thread: ");
>> +        qemu_sem_destroy(&mis->listen_thread_sem);
>> +        return -1;
>> +    }
>>       qemu_sem_wait(&mis->listen_thread_sem);
>>       qemu_sem_destroy(&mis->listen_thread_sem);
>>   
>> -- 
>> 2.13.7
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co
  2019-01-03 11:25   ` Dr. David Alan Gilbert
@ 2019-01-03 13:27     ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-03 13:27 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Fei Li
  Cc: qemu-devel, shirley17fei, Markus Armbruster, Peter Xu


On 2019/1/3 at 7:25 PM, Dr. David Alan Gilbert wrote:
> * Fei Li (fli@suse.com) wrote:
>> In the current code, if process_incoming_migration_co() fails we do
>> the same error handling: set the error state, close the source file,
>> do the cleanup for multifd, and then exit(EXIT_FAILURE). To make the
>> code clearer, add a "goto fail" to unify the error handling.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>>   migration/migration.c | 26 +++++++++++++-------------
>>   1 file changed, 13 insertions(+), 13 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 5d322eb9d6..ded151b1bf 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -438,15 +438,13 @@ static void process_incoming_migration_co(void *opaque)
>>           /* Make sure all file formats flush their mutable metadata */
>>           bdrv_invalidate_cache_all(&local_err);
>>           if (local_err) {
>> -            migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>> -                    MIGRATION_STATUS_FAILED);
>>               error_report_err(local_err);
>> -            exit(EXIT_FAILURE);
>> +            goto fail;
>>           }
>>   
>>           if (colo_init_ram_cache() < 0) {
>>               error_report("Init ram cache failed");
>> -            exit(EXIT_FAILURE);
>> +            goto fail;
>>           }
>>   
>>           qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming",
>> @@ -461,20 +459,22 @@ static void process_incoming_migration_co(void *opaque)
>>       }
>>   
>>       if (ret < 0) {
>> -        Error *local_err = NULL;
>> -
>> -        migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>> -                          MIGRATION_STATUS_FAILED);
>>           error_report("load of migration failed: %s", strerror(-ret));
>> -        qemu_fclose(mis->from_src_file);
>> -        if (multifd_load_cleanup(&local_err) != 0) {
>> -            error_report_err(local_err);
>> -        }
>> -        exit(EXIT_FAILURE);
>> +        goto fail;
>>       }
>>       mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
>>       qemu_bh_schedule(mis->bh);
>>       mis->migration_incoming_co = NULL;
>> +    return;
>> +fail:
>> +    local_err = NULL;
>> +    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>> +                      MIGRATION_STATUS_FAILED);
>> +    qemu_fclose(mis->from_src_file);
>> +    if (multifd_load_cleanup(&local_err) != 0) {
>> +        error_report_err(local_err);
>> +    }
>> +    exit(EXIT_FAILURE);
>>   }
>>   
>>   static void migration_incoming_setup(QEMUFile *f)
> OK, so this is really unifying the normal error case and the two
> colo-incoming error cases; so I think that's fine.
>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Thanks for the review :)

Fei
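
The "goto fail" unification reviewed above follows a common C pattern: every
error site jumps to one label that performs the shared cleanup. The sketch
below is only an illustration under stated assumptions. MockIncoming,
MockStatus and mock_process_incoming() are hypothetical names, the state
handling is simplified, and the function returns -1 where the real
process_incoming_migration_co() calls exit(EXIT_FAILURE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the migration state; not the QEMU types. */
typedef enum { STATUS_ACTIVE, STATUS_FAILED, STATUS_COMPLETED } MockStatus;

typedef struct MockIncoming {
    MockStatus state;
    bool src_file_open;
    int cleanup_calls;      /* how many times the shared cleanup ran */
} MockIncoming;

/*
 * Returns 0 on success and -1 on failure. Both failure sites share one
 * cleanup path instead of repeating it inline.
 */
static int mock_process_incoming(MockIncoming *mis, int load_ret,
                                 bool cache_ok)
{
    assert(mis != NULL);
    mis->state = STATUS_ACTIVE;
    mis->src_file_open = true;

    if (load_ret < 0) {
        goto fail;          /* the "normal" load error */
    }
    if (!cache_ok) {
        goto fail;          /* an error from a later setup step */
    }

    mis->state = STATUS_COMPLETED;
    return 0;

fail:
    /* Shared cleanup: set the failed state and release the source file. */
    mis->state = STATUS_FAILED;
    mis->src_file_open = false;
    mis->cleanup_calls++;
    return -1;
}
```

With a single label, a newly added error site (such as a failing thread
creation) only needs a "goto fail" and cannot forget one of the cleanup
steps.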

>
>> -- 
>> 2.13.7
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2019-01-03  3:43       ` David Gibson
@ 2019-01-03 13:41         ` Fei Li
  2019-01-04  5:21           ` David Gibson
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-03 13:41 UTC (permalink / raw)
  To: David Gibson; +Cc: Fei Li, qemu-devel, shirley17fei, Markus Armbruster


On 2019/1/3 at 11:43 AM, David Gibson wrote:
> On Wed, Jan 02, 2019 at 02:44:17PM +0800, 李菲 wrote:
>> On 2019/1/2 at 10:36 AM, David Gibson wrote:
>>> On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
>>>> Add a local_err to hold the error, and return the corresponding
>>>> error code to replace the temporary &error_abort.
>>>>
>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
>>>> Signed-off-by: Fei Li <fli@suse.com>
>>> This looks like a good change, but it no longer applies due to a
>>> change in the qemu_thread_create() signature.
>> Sorry that I am not sure whether I understand. Do you mean using
>> &error_abort is more suitable for this handling, rather than report
>> the &local_err & return a failure reason?
> No, I just mean that context has been altered by a global change and
> the patch will need to be fixed up to cope with that.

Just to be clear: does the "global change" mean "[patch 06/16]
qemu_thread: Make qemu_thread_create() handle errors properly", or
another patch that is not in this series?

If it means [patch 06/16], let me explain further: 06/16 converts every
qemu_thread_create() call by passing &error_abort as the parameter, and
the following patches improve on &error_abort for callers that can
handle the failure more gracefully. E.g. if qemu_thread_create() fails
in h_resize_hpt_prepare(), I think reporting local_err and returning a
failure code is more appropriate than simply aborting inside
qemu_thread_create() when it calls error_setg_errno().

In other words, this patch is written to apply on top of patch 06, and
I cannot see where it needs to be fixed up. Please correct me if I have
misunderstood.


Have a nice day, thanks :)
Fei
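
The approach described in this message (report the local error, free what
was allocated, return a failure code to the guest) can be sketched roughly
as follows. This is an illustration under stated assumptions rather than
the patch itself: MockPending, mock_thread_create(), mock_resize_prepare()
and the MOCK_H_* codes are hypothetical, the error is modelled as a plain
string instead of QEMU's Error object, and the numeric values are
placeholders rather than the PAPR ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Placeholder return codes; the values are illustrative only. */
enum { MOCK_H_SUCCESS = 0, MOCK_H_HARDWARE = 1 };

typedef struct MockPending { int shift; } MockPending;

/*
 * Stand-in for a thread-creation helper that reports failure to its
 * caller instead of aborting: returns false and sets *errmsg on failure.
 */
static bool mock_thread_create(bool simulate_failure, const char **errmsg)
{
    assert(errmsg != NULL);
    if (simulate_failure) {
        *errmsg = "Resource temporarily unavailable";
        return false;
    }
    return true;
}

/* Publishes the pending request in *slot only if the thread was created. */
static int mock_resize_prepare(MockPending **slot, int shift,
                               bool simulate_failure)
{
    const char *local_err = NULL;
    MockPending *pending = malloc(sizeof(*pending));

    if (!pending) {
        return MOCK_H_HARDWARE;
    }
    pending->shift = shift;

    if (!mock_thread_create(simulate_failure, &local_err)) {
        /* Report the error, undo the allocation, and tell the guest. */
        fprintf(stderr, "failed to create hpt_prepare_thread: %s\n",
                local_err);
        free(pending);
        return MOCK_H_HARDWARE;
    }

    *slot = pending;
    return MOCK_H_SUCCESS;
}
```

The sketch returns the hardware-failure code, as suggested in the review
quoted below, because the failure comes from the host being short of
resources and not from the guest asking for too much.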


>
>>>> ---
>>>>    hw/ppc/spapr_hcall.c | 12 ++++++++----
>>>>    1 file changed, 8 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>>>> index 5bc2cf4540..7c16ade04a 100644
>>>> --- a/hw/ppc/spapr_hcall.c
>>>> +++ b/hw/ppc/spapr_hcall.c
>>>> @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>>>        sPAPRPendingHPT *pending = spapr->pending_hpt;
>>>>        uint64_t current_ram_size;
>>>>        int rc;
>>>> +    Error *local_err = NULL;
>>>>        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
>>>>            return H_AUTHORITY;
>>>> @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>>>        pending->shift = shift;
>>>>        pending->ret = H_HARDWARE;
>>>> -    /* TODO: let the further caller handle the error instead of abort() here */
>>>> -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>>>> -                       hpt_prepare_thread, pending,
>>>> -                       QEMU_THREAD_DETACHED, &error_abort);
>>>> +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>>>> +                            hpt_prepare_thread, pending,
>>>> +                            QEMU_THREAD_DETACHED, &local_err)) {
>>>> +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
>>>> +        g_free(pending);
>>>> +        return H_RESOURCE;
>>> I also think H_HARDWARE would be a better choice here.  Although the
>>> failure is due to a resource constraint, it's not because the guest
>>> asked for too much, just because the host is in dire straits.  From
>>> the guest's point of view it's basically a hardware failure.
>> Ok, thanks. Will use H_HARDWARE instead.
>>
>> Have a nice day, thanks for the review. :)
>> Fei
>>>> +    }
>>>>        spapr->pending_hpt = pending;


* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2019-01-03 13:41         ` Fei Li
@ 2019-01-04  5:21           ` David Gibson
  2019-01-04  6:20             ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: David Gibson @ 2019-01-04  5:21 UTC (permalink / raw)
  To: Fei Li; +Cc: Fei Li, qemu-devel, shirley17fei, Markus Armbruster


On Thu, Jan 03, 2019 at 09:41:49PM +0800, Fei Li wrote:
> 
> On 2019/1/3 at 11:43 AM, David Gibson wrote:
> > On Wed, Jan 02, 2019 at 02:44:17PM +0800, 李菲 wrote:
> > > On 2019/1/2 at 10:36 AM, David Gibson wrote:
> > > > On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
> > > > > Add a local_err to hold the error, and return the corresponding
> > > > > error code to replace the temporary &error_abort.
> > > > > 
> > > > > Cc: Markus Armbruster <armbru@redhat.com>
> > > > > Cc: David Gibson <david@gibson.dropbear.id.au>
> > > > > Signed-off-by: Fei Li <fli@suse.com>
> > > > This looks like a good change, but it no longer applies due to a
> > > > change in the qemu_thread_create() signature.
> > > Sorry that I am not sure whether I understand. Do you mean using
> > > &error_abort is more suitable for this handling, rather than report
> > > the &local_err & return a failure reason?
> > No, I just mean that context has been altered by a global change and
> > the patch will need to be fixed up to cope with that.
> 
> Just to be clearer: does the "global change" mean the "[patch 06/16]
> qemu_thread: Make qemu_thread_create() handle errors properly", or another
> patch not in this patch series?
> 
> If it means the [patch 06/16], I want to explain more: the 06/16 handles all
> qemu_thread_create() by passing &error_abort as the parameter, and the
> following patches are to improve on the &error_abort for callers who can
> handle more properly. E.g. if qemu_thread_create() fails in
> h_resize_hpt_prepare(),
> I think reporting the &local_err & returning the failure reason is more
> proper
> than just abort() inside qemu_thread_create() when calls error_setg_errno().
> 
> In other words, this patch is actually written to apply to patch 06. And I
> have
> no clue where it needs to be fixed up. Please correct me if I understand
> wrong.
> 
> 
> Have a nice day, thanks :)

Ah, sorry.  Since I was only CCed on this patch, not the rest of the
series, I assumed it was independent and didn't think to check the
earlier patches of the series.

So, yes, I think the global change I'm referring to is 6/16, which I
didn't have, so that explains the problem.

In that case it's probably best if this goes in via the same tree the
rest of the series is going to.  So, with the H_HARDWARE change made:

Acked-by: David Gibson <david@gibson.dropbear.id.au>


> Fei
> 
> 
> > 
> > > > > ---
> > > > >    hw/ppc/spapr_hcall.c | 12 ++++++++----
> > > > >    1 file changed, 8 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > > > > index 5bc2cf4540..7c16ade04a 100644
> > > > > --- a/hw/ppc/spapr_hcall.c
> > > > > +++ b/hw/ppc/spapr_hcall.c
> > > > > @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
> > > > >        sPAPRPendingHPT *pending = spapr->pending_hpt;
> > > > >        uint64_t current_ram_size;
> > > > >        int rc;
> > > > > +    Error *local_err = NULL;
> > > > >        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
> > > > >            return H_AUTHORITY;
> > > > > @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
> > > > >        pending->shift = shift;
> > > > >        pending->ret = H_HARDWARE;
> > > > > -    /* TODO: let the further caller handle the error instead of abort() here */
> > > > > -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> > > > > -                       hpt_prepare_thread, pending,
> > > > > -                       QEMU_THREAD_DETACHED, &error_abort);
> > > > > +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
> > > > > +                            hpt_prepare_thread, pending,
> > > > > +                            QEMU_THREAD_DETACHED, &local_err)) {
> > > > > +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
> > > > > +        g_free(pending);
> > > > > +        return H_RESOURCE;
> > > > I also think H_HARDWARE would be a better choice here.  Although the
> > > > failure is due to a resource constraint, it's not because the guest
> > > > asked for too much, just because the host is in dire straits.  From
> > > > the guest's point of view it's basically a hardware failure.
> > > Ok, thanks. Will use H_HARDWARE instead.
> > > 
> > > Have a nice day, thanks for the review. :)
> > > Fei
> > > > > +    }
> > > > >        spapr->pending_hpt = pending;
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare
  2019-01-04  5:21           ` David Gibson
@ 2019-01-04  6:20             ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-04  6:20 UTC (permalink / raw)
  To: David Gibson; +Cc: Fei Li, qemu-devel, shirley17fei, Markus Armbruster


On 2019/1/4 at 1:21 PM, David Gibson wrote:
> On Thu, Jan 03, 2019 at 09:41:49PM +0800, Fei Li wrote:
>> On 2019/1/3 at 11:43 AM, David Gibson wrote:
>>> On Wed, Jan 02, 2019 at 02:44:17PM +0800, 李菲 wrote:
>>>> On 2019/1/2 at 10:36 AM, David Gibson wrote:
>>>>> On Tue, Dec 25, 2018 at 10:04:43PM +0800, Fei Li wrote:
>>>>>> Add a local_err to hold the error, and return the corresponding
>>>>>> error code to replace the temporary &error_abort.
>>>>>>
>>>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>>>> Cc: David Gibson <david@gibson.dropbear.id.au>
>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>> This looks like a good change, but it no longer applies due to a
>>>>> change in the qemu_thread_create() signature.
>>>> Sorry that I am not sure whether I understand. Do you mean using
>>>> &error_abort is more suitable for this handling, rather than report
>>>> the &local_err & return a failure reason?
>>> No, I just mean that context has been altered by a global change and
>>> the patch will need to be fixed up to cope with that.
>> Just to be clearer: does the "global change" mean the "[patch 06/16]
>> qemu_thread: Make qemu_thread_create() handle errors properly", or another
>> patch not in this patch series?
>>
>> If it means [patch 06/16], let me explain further: 06/16 converts all
>> qemu_thread_create() callers by passing &error_abort as the parameter, and
>> the following patches improve on &error_abort for callers that can handle
>> the error more properly. E.g. if qemu_thread_create() fails in
>> h_resize_hpt_prepare(), I think reporting &local_err and returning the
>> failure reason is more appropriate than just abort()ing inside
>> qemu_thread_create() when it calls error_setg_errno().
>>
>> In other words, this patch is actually written to apply on top of patch 06,
>> and I have no clue where it needs to be fixed up. Please correct me if I
>> have misunderstood.
>>
>>
>> Have a nice day, thanks :)
> Ah, sorry.  Since I was only CCed on this patch, not the rest of the
> series, I assumed it was independent and didn't think to check the
> earlier patches of the series.
A good reminder: CCing everyone during the review stage seems more reasonable. :)
>
> So, yes, I think the global change I'm referring to is 6/16, which I
> didn't have, so that explains the problem.
>
> In that case it's probably best if this goes in via the same tree the
> rest of the series is going to.  So, with the H_HARDWARE change made:
>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>

Thanks for the review!

Have a nice day ;)
Fei
>
>
>> Fei
>>
>>
>>>>>> ---
>>>>>>     hw/ppc/spapr_hcall.c | 12 ++++++++----
>>>>>>     1 file changed, 8 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>>>>>> index 5bc2cf4540..7c16ade04a 100644
>>>>>> --- a/hw/ppc/spapr_hcall.c
>>>>>> +++ b/hw/ppc/spapr_hcall.c
>>>>>> @@ -478,6 +478,7 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>>>>>         sPAPRPendingHPT *pending = spapr->pending_hpt;
>>>>>>         uint64_t current_ram_size;
>>>>>>         int rc;
>>>>>> +    Error *local_err = NULL;
>>>>>>         if (spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED) {
>>>>>>             return H_AUTHORITY;
>>>>>> @@ -538,10 +539,13 @@ static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
>>>>>>         pending->shift = shift;
>>>>>>         pending->ret = H_HARDWARE;
>>>>>> -    /* TODO: let the further caller handle the error instead of abort() here */
>>>>>> -    qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>>>>>> -                       hpt_prepare_thread, pending,
>>>>>> -                       QEMU_THREAD_DETACHED, &error_abort);
>>>>>> +    if (!qemu_thread_create(&pending->thread, "sPAPR HPT prepare",
>>>>>> +                            hpt_prepare_thread, pending,
>>>>>> +                            QEMU_THREAD_DETACHED, &local_err)) {
>>>>>> +        error_reportf_err(local_err, "failed to create hpt_prepare_thread: ");
>>>>>> +        g_free(pending);
>>>>>> +        return H_RESOURCE;
>>>>> I also think H_HARDWARE would be a better choice here.  Although the
>>>>> failure is due to a resource constraint, it's not because the guest
>>>>> asked for too much, just because the host is in dire straits.  From
>>>>> the guest's point of view it's basically a hardware failure.
>>>> Ok, thanks. Will use H_HARDWARE instead.
>>>>
>>>> Have a nice day, thanks for the review. :)
>>>> Fei
>>>>>> +    }
>>>>>>         spapr->pending_hpt = pending;


* Re: [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle
  2019-01-02 13:46 ` [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle no-reply
@ 2019-01-07 12:44   ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-07 12:44 UTC (permalink / raw)
  To: qemu-devel, fli; +Cc: fam, lifei1214

Hi all,

Sorry for mistakenly deleting #include "qapi/error.h" in
[PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for
emulated_realize.
I will add this #include back in the next version.


Have a nice day, and again sorry for the trouble.
Fei


On 2019/1/2 at 9:46 PM, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20181225140449.15786-1-fli@suse.com/
>
>
>
> Hi,
>
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
>
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> time make docker-test-quick@centos7 SHOW_ENV=1 J=8
> === TEST SCRIPT END ===
>
> libpmem support   no
> libudev           no
>
> WARNING: Use of SDL 1.2 is deprecated and will be removed in
> WARNING: future releases. Please switch to using SDL 2.0
>
> NOTE: cross-compilers enabled:  'cc'
>    GEN     x86_64-softmmu/config-devices.mak.tmp
> ---
>    CC      hw/usb/host-stub.o
>    CC      hw/virtio/virtio-bus.o
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: In function 'init_event_notifier':
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:404:9: error: implicit declaration of function 'error_setg' [-Werror=implicit-function-declaration]
>           error_setg(errp, "ccid-card-emul: event notifier creation failed");
>           ^
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:404:9: error: nested extern declaration of 'error_setg' [-Werror=nested-externs]
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: In function 'emulated_realize':
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:513:13: error: implicit declaration of function 'error_append_hint' [-Werror=implicit-function-declaration]
>               error_append_hint(errp, "%s\n", ptable->name);
>               ^
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c:513:13: error: nested extern declaration of 'error_append_hint' [-Werror=nested-externs]
> /tmp/qemu-test/src/hw/usb/ccid-card-emulated.c: At top level:
> cc1: error: unrecognized command line option "-Wno-format-truncation" [-Werror]
> cc1: all warnings being treated as errors
> make: *** [hw/usb/ccid-card-emulated.o] Error 1
> make: *** Waiting for unfinished jobs....
>
>
> The full log is available at
> http://patchew.org/logs/20181225140449.15786-1-fli@suse.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com


* Re: [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
@ 2019-01-07 16:50   ` Markus Armbruster
  2019-01-08 15:58     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 16:50 UTC (permalink / raw)
  To: Fei Li
  Cc: qemu-devel, shirley17fei, lifei1214, Dr . David Alan Gilbert,
	Juan Quintela

Fei Li <fli@suse.com> writes:

> Always call migrate_set_error() to set the error state without relying
> on whether multifd_save_cleanup() succeeds.  As the passed &local_err
> is never used in multifd_save_cleanup(), remove it. And make the
> function be: void multifd_save_cleanup(void).
>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>

The commit message is confusing.  Suggest:

    migration: multifd_save_cleanup() can't fail, simplify

    multifd_save_cleanup() takes an Error ** argument and returns an
    error code even though it can't actually fail.  Its callers
    dutifully check for failure.  Remove the useless argument and return
    value, and simplify the callers.

I think multifd_load_cleanup() has exactly the same issue.  Should we
clean it up, too?  Juan, what do you think?


* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly Fei Li
@ 2019-01-07 17:18   ` Markus Armbruster
  2019-01-08 15:55     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:18 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Paolo Bonzini

Fei Li <fli@suse.com> writes:

> qemu_thread_create() abort()s on error. Not nice. Give it a return
> value and an Error ** argument, so it can return success/failure.
>
> Considering qemu_thread_create() is quite widely used in qemu, split
> this into two steps: this patch passes the &error_abort to
> qemu_thread_create() everywhere, and the next 9 patches will improve
> on &error_abort for callers who need.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>

The commit message's title promises more than the patch delivers.
Suggest:

    qemu_thread: Make qemu_thread_create() take Error ** argument

The rest of the commit message is fine.

> ---
>  cpus.c                      | 23 +++++++++++++++--------
>  dump.c                      |  3 ++-
>  hw/misc/edu.c               |  4 +++-
>  hw/ppc/spapr_hcall.c        |  4 +++-
>  hw/rdma/rdma_backend.c      |  3 ++-
>  hw/usb/ccid-card-emulated.c |  5 +++--
>  include/qemu/thread.h       |  4 ++--
>  io/task.c                   |  3 ++-
>  iothread.c                  |  3 ++-
>  migration/migration.c       | 11 ++++++++---
>  migration/postcopy-ram.c    |  4 +++-
>  migration/ram.c             | 12 ++++++++----
>  migration/savevm.c          |  3 ++-
>  tests/atomic_add-bench.c    |  3 ++-
>  tests/iothread.c            |  2 +-
>  tests/qht-bench.c           |  3 ++-
>  tests/rcutorture.c          |  3 ++-
>  tests/test-aio.c            |  2 +-
>  tests/test-rcu-list.c       |  3 ++-
>  ui/vnc-jobs.c               |  6 ++++--
>  util/compatfd.c             |  6 ++++--
>  util/oslib-posix.c          |  3 ++-
>  util/qemu-thread-posix.c    | 27 ++++++++++++++++++++-------
>  util/qemu-thread-win32.c    | 16 ++++++++++++----
>  util/rcu.c                  |  3 ++-
>  util/thread-pool.c          |  4 +++-
>  26 files changed, 112 insertions(+), 51 deletions(-)
>
> diff --git a/cpus.c b/cpus.c
> index 0ddeeefc14..25df03326b 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -1961,15 +1961,17 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>              snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>                   cpu->cpu_index);
>  
> +            /* TODO: let the callers handle the error instead of abort() here */
>              qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
> -                               cpu, QEMU_THREAD_JOINABLE);
> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>  
>          } else {
>              /* share a single thread for all cpus with TCG */
>              snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
> +            /* TODO: let the callers handle the error instead of abort() here */
>              qemu_thread_create(cpu->thread, thread_name,
>                                 qemu_tcg_rr_cpu_thread_fn,
> -                               cpu, QEMU_THREAD_JOINABLE);
> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>  
>              single_tcg_halt_cond = cpu->halt_cond;
>              single_tcg_cpu_thread = cpu->thread;

You add this TODO comment to 24 out of 37 calls.  Can you give your
reasons for adding it to some calls, but not to others?

[...]
> diff --git a/include/qemu/thread.h b/include/qemu/thread.h
> index 55d83a907c..12291f4ccd 100644
> --- a/include/qemu/thread.h
> +++ b/include/qemu/thread.h
> @@ -152,9 +152,9 @@ void qemu_event_reset(QemuEvent *ev);
>  void qemu_event_wait(QemuEvent *ev);
>  void qemu_event_destroy(QemuEvent *ev);
>  
> -void qemu_thread_create(QemuThread *thread, const char *name,
> +bool qemu_thread_create(QemuThread *thread, const char *name,
>                          void *(*start_routine)(void *),
> -                        void *arg, int mode);
> +                        void *arg, int mode, Error **errp);
>  void *qemu_thread_join(QemuThread *thread);
>  void qemu_thread_get_self(QemuThread *thread);
>  bool qemu_thread_is_self(QemuThread *thread);
[...]
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index 865e476df5..39834b0551 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -15,6 +15,7 @@
>  #include "qemu/atomic.h"
>  #include "qemu/notify.h"
>  #include "qemu-thread-common.h"
> +#include "qapi/error.h"
>  
>  static bool name_threads;
>  
> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>      return r;
>  }
>  
> -void qemu_thread_create(QemuThread *thread, const char *name,
> -                       void *(*start_routine)(void*),
> -                       void *arg, int mode)
> +/*
> + * Return a boolean: true/false to indicate whether it succeeds.
> + * If fails, propagate the error to Error **errp and set the errno.
> + */

Let's write something that can pass as a function contract:

   /*
    * Create a new thread with name @name.
    * The thread executes @start_routine() with argument @arg.
    * The thread will be created in a detached state if @mode is
    * QEMU_THREAD_DETACHED, and in a joinable state if it's
    * QEMU_THREAD_JOINABLE.
    * On success, return true.
    * On failure, set @errno, store an error through @errp and return
    * false.
    */

Personally, I'd return negative errno instead of false, and dispense
with setting errno.
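
For illustration only, here is a minimal self-contained sketch of that
alternative, using plain pthreads rather than QEMU's wrappers.  The names
sketch_thread_create(), sketch_worker() and sketch_demo() are made up for
this example, and the Error ** argument is left out to keep it standalone:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for qemu_thread_create(), NOT the QEMU function.
 * Returns 0 on success and a negative errno value on failure.
 */
static int sketch_thread_create(pthread_t *thread,
                                void *(*start_routine)(void *),
                                void *arg, bool detached)
{
    pthread_attr_t attr;
    int err;

    /* pthread functions return a positive error number directly */
    err = pthread_attr_init(&attr);
    if (err) {
        return -err;
    }
    if (detached) {
        err = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if (err) {
            goto out;
        }
    }
    err = pthread_create(thread, &attr, start_routine, arg);
out:
    pthread_attr_destroy(&attr);
    return -err;    /* 0 stays 0, an error number becomes negative */
}

/* Trivial thread body used only to exercise the sketch. */
static void *sketch_worker(void *arg)
{
    *(int *)arg = 42;
    return arg;
}

/* A caller branches on the sign; no need to consult the global errno. */
static int sketch_demo(void)
{
    pthread_t thread;
    int value = 0;
    int ret;

    ret = sketch_thread_create(&thread, sketch_worker, &value, false);
    if (ret < 0) {
        return ret;
    }
    ret = pthread_join(thread, NULL);
    assert(ret == 0);
    return value;
}
```

With this shape the caller gets the failure reason from the return value
alone, so the function contract no longer has to mention errno.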

> +bool qemu_thread_create(QemuThread *thread, const char *name,
> +                        void *(*start_routine)(void *),
> +                        void *arg, int mode, Error **errp)
>  {
>      sigset_t set, oldset;
>      int err;
> @@ -511,7 +516,9 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>  
>      err = pthread_attr_init(&attr);
>      if (err) {
> -        error_exit(err, __func__);
> +        errno = err;
> +        error_setg_errno(errp, errno, "pthread_attr_init failed");
> +        return false;
>      }
>  
>      if (mode == QEMU_THREAD_DETACHED) {
> @@ -529,13 +536,19 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>  
>      err = pthread_create(&thread->thread, &attr,
>                           qemu_thread_start, qemu_thread_args);
> -
> -    if (err)
> -        error_exit(err, __func__);
> +    if (err) {
> +        errno = err;
> +        error_setg_errno(errp, errno, "pthread_create failed");
> +        pthread_attr_destroy(&attr);
> +        g_free(qemu_thread_args->name);
> +        g_free(qemu_thread_args);
> +        return false;
> +    }
>  
>      pthread_sigmask(SIG_SETMASK, &oldset, NULL);
>  
>      pthread_attr_destroy(&attr);
> +    return true;
>  }
>  
>  void qemu_thread_get_self(QemuThread *thread)
> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
> index 4a363ca675..57b1143e97 100644
> --- a/util/qemu-thread-win32.c
> +++ b/util/qemu-thread-win32.c
> @@ -20,6 +20,7 @@
>  #include "qemu/thread.h"
>  #include "qemu/notify.h"
>  #include "qemu-thread-common.h"
> +#include "qapi/error.h"
>  #include <process.h>
>  
>  static bool name_threads;
> @@ -388,9 +389,9 @@ void *qemu_thread_join(QemuThread *thread)
>      return ret;
>  }
>  
> -void qemu_thread_create(QemuThread *thread, const char *name,
> -                       void *(*start_routine)(void *),
> -                       void *arg, int mode)
> +bool qemu_thread_create(QemuThread *thread, const char *name,
> +                        void *(*start_routine)(void *),
> +                        void *arg, int mode, Error **errp)
>  {
>      HANDLE hThread;
>      struct QemuThreadData *data;
> @@ -409,10 +410,17 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>      hThread = (HANDLE) _beginthreadex(NULL, 0, win32_start_routine,
>                                        data, 0, &thread->tid);
>      if (!hThread) {
> -        error_exit(GetLastError(), __func__);
> +        if (data->mode != QEMU_THREAD_DETACHED) {
> +            DeleteCriticalSection(&data->cs);
> +        }
> +        error_setg_errno(errp, errno,
> +                         "failed to create win32_start_routine");
> +        g_free(data);
> +        return false;
>      }
>      CloseHandle(hThread);
>      thread->data = data;
> +    return true;
>  }
>  
>  void qemu_thread_get_self(QemuThread *thread)
[...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory Fei Li
@ 2019-01-07 17:21   ` Markus Armbruster
  2019-01-08 16:00     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:21 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Marc-André Lureau

Fei Li <fli@suse.com> writes:

> Utilize the existing errp to propagate the error instead of the
> temporary &error_abort.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  dump.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/dump.c b/dump.c
> index c35d6ddd22..ef5ea324fa 100644
> --- a/dump.c
> +++ b/dump.c
> @@ -2020,9 +2020,10 @@ void qmp_dump_guest_memory(bool paging, const char *file,
>      if (detach_p) {
>          /* detached dump */
>          s->detached = true;
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
> -                           s, QEMU_THREAD_DETACHED, &error_abort);
> +        if (!qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
> +                           s, QEMU_THREAD_DETACHED, errp)) {
> +            /* keep 'if' here in case there is further error handling logic */
> +        }

I don't think keeping the conditional "just in case" is worthwhile.
Plain

           qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
                              s, QEMU_THREAD_DETACHED, errp);

should do fine.

>      } else {
>          /* sync dump */
>          dump_process(s, errp);


* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize Fei Li
@ 2019-01-07 17:29   ` Markus Armbruster
  2019-01-08  6:14     ` Jiri Slaby
  2019-01-13 15:44     ` Fei Li
  0 siblings, 2 replies; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:29 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Jiri Slaby

Fei Li <fli@suse.com> writes:

> Utilize the existing errp to propagate the error instead of the
> temporary &error_abort.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Jiri Slaby <jslaby@suse.cz>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  hw/misc/edu.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/hw/misc/edu.c b/hw/misc/edu.c
> index 3f4ba7ded3..011fe6e0b7 100644
> --- a/hw/misc/edu.c
> +++ b/hw/misc/edu.c
> @@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp)
>  
>      qemu_mutex_init(&edu->thr_mutex);
>      qemu_cond_init(&edu->thr_cond);
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
> -                       edu, QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
> +                            edu, QEMU_THREAD_JOINABLE, errp)) {
> +        return;

You need to clean up everything that got initialized so far.  You might
want to call qemu_thread_create() earlier so you have less to clean up.
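
A rough standalone sketch of the cleanup pattern meant here, written with
plain pthreads.  SketchDev and the sketch_dev_*() functions are hypothetical
names loosely modelled on the edu device, and the simulate_failure flag
exists only so the error path can be exercised; none of this is QEMU code:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical device state, loosely modelled on EduState. */
typedef struct {
    pthread_mutex_t thr_mutex;
    pthread_cond_t thr_cond;
    pthread_t thread;
    bool stopping;
    bool realized;
} SketchDev;

/* Worker that sleeps until it is told to stop. */
static void *sketch_dev_thread(void *opaque)
{
    SketchDev *dev = opaque;

    pthread_mutex_lock(&dev->thr_mutex);
    while (!dev->stopping) {
        pthread_cond_wait(&dev->thr_cond, &dev->thr_mutex);
    }
    pthread_mutex_unlock(&dev->thr_mutex);
    return NULL;
}

/* Returns 0 on success, negative errno on failure. */
static int sketch_dev_realize(SketchDev *dev, bool simulate_failure)
{
    int err;

    dev->stopping = false;
    dev->realized = false;
    pthread_mutex_init(&dev->thr_mutex, NULL);
    pthread_cond_init(&dev->thr_cond, NULL);

    /* simulate_failure is a test hook for this sketch only */
    err = simulate_failure
        ? EAGAIN
        : pthread_create(&dev->thread, NULL, sketch_dev_thread, dev);
    if (err) {
        /* Undo everything initialized so far, in reverse order. */
        pthread_cond_destroy(&dev->thr_cond);
        pthread_mutex_destroy(&dev->thr_mutex);
        return -err;
    }
    dev->realized = true;
    return 0;
}

/* Normal teardown: stop the worker, join it, then destroy the primitives. */
static void sketch_dev_unrealize(SketchDev *dev)
{
    int rc;

    pthread_mutex_lock(&dev->thr_mutex);
    dev->stopping = true;
    pthread_mutex_unlock(&dev->thr_mutex);
    pthread_cond_signal(&dev->thr_cond);
    rc = pthread_join(dev->thread, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_cond_destroy(&dev->thr_cond);
    pthread_mutex_destroy(&dev->thr_mutex);
    dev->realized = false;
}
```

The earlier the thread is created, the shorter the failure branch gets; here
it has to undo the mutex and the condition variable.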

> +    }
>  
>      memory_region_init_io(&edu->mmio, OBJECT(edu), &edu_mmio_ops, edu,
>                      "edu-mmio", 1 * MiB);
       pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &edu->mmio);
   }

   static void pci_edu_uninit(PCIDevice *pdev)
   {
       EduState *edu = EDU(pdev);

       qemu_mutex_lock(&edu->thr_mutex);
       edu->stopping = true;
       qemu_mutex_unlock(&edu->thr_mutex);
       qemu_cond_signal(&edu->thr_cond);
       qemu_thread_join(&edu->thread);

       qemu_cond_destroy(&edu->thr_cond);
       qemu_mutex_destroy(&edu->thr_mutex);

       timer_del(&edu->dma_timer);
   }

Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?


* Re: [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize Fei Li
@ 2019-01-07 17:31   ` Markus Armbruster
  2019-01-09 13:21     ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:31 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Gerd Hoffmann

Fei Li <fli@suse.com> writes:

> Utilize the existing errp to propagate the error and do the
> corresponding cleanup to replace the temporary &error_abort.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  hw/usb/ccid-card-emulated.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
> index f8ff7ff4a3..9245b4fcad 100644
> --- a/hw/usb/ccid-card-emulated.c
> +++ b/hw/usb/ccid-card-emulated.c
> @@ -32,7 +32,6 @@
>  #include "qemu/thread.h"
>  #include "qemu/main-loop.h"
>  #include "ccid.h"
> -#include "qapi/error.h"
>  
>  #define DPRINTF(card, lvl, fmt, ...) \
>  do {\
> @@ -544,11 +543,15 @@ static void emulated_realize(CCIDCardState *base, Error **errp)
>          error_setg(errp, "%s: failed to initialize vcard", TYPE_EMULATED_CCID);
>          goto out2;
>      }
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
> -                       card, QEMU_THREAD_JOINABLE, &error_abort);
> -    qemu_thread_create(&card->apdu_thread_id, "ccid/apdu", handle_apdu_thread,
> -                       card, QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
> +                            card, QEMU_THREAD_JOINABLE, errp)) {
> +        goto out2;
> +    }
> +    if (!qemu_thread_create(&card->apdu_thread_id, "ccid/apdu",
> +                            handle_apdu_thread, card,
> +                            QEMU_THREAD_JOINABLE, errp)) {
> +        goto out2;

You need to stop and join the first thread.
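
To make the point concrete, a small standalone sketch with plain pthreads.
SketchCard, the quit flag and the sketch_card_*() functions are hypothetical;
how the real event thread is told to stop is specific to
ccid-card-emulated.c and is not shown here.  The fail_second flag only
exists to exercise the error path:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical card state with two worker threads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool quit;
    pthread_t event_thread;
    pthread_t apdu_thread;
} SketchCard;

/* Both workers simply wait until quit is set. */
static void *sketch_wait_for_quit(void *opaque)
{
    SketchCard *card = opaque;

    pthread_mutex_lock(&card->lock);
    while (!card->quit) {
        pthread_cond_wait(&card->cond, &card->lock);
    }
    pthread_mutex_unlock(&card->lock);
    return NULL;
}

/* Ask every running worker to leave its wait loop. */
static void sketch_card_request_quit(SketchCard *card)
{
    pthread_mutex_lock(&card->lock);
    card->quit = true;
    pthread_cond_broadcast(&card->cond);
    pthread_mutex_unlock(&card->lock);
}

/* Returns 0 on success, negative errno on failure. */
static int sketch_card_start(SketchCard *card, bool fail_second)
{
    int err;

    card->quit = false;
    pthread_mutex_init(&card->lock, NULL);
    pthread_cond_init(&card->cond, NULL);

    err = pthread_create(&card->event_thread, NULL,
                         sketch_wait_for_quit, card);
    if (err) {
        goto out_destroy;
    }
    /* fail_second is a test hook for this sketch only */
    err = fail_second
        ? EAGAIN
        : pthread_create(&card->apdu_thread, NULL,
                         sketch_wait_for_quit, card);
    if (err) {
        /* The first thread is already running: stop it and join it. */
        sketch_card_request_quit(card);
        pthread_join(card->event_thread, NULL);
        goto out_destroy;
    }
    return 0;

out_destroy:
    pthread_cond_destroy(&card->cond);
    pthread_mutex_destroy(&card->lock);
    return -err;
}

/* Normal shutdown: stop and join both workers. */
static void sketch_card_stop(SketchCard *card)
{
    int rc;

    sketch_card_request_quit(card);
    rc = pthread_join(card->event_thread, NULL);
    assert(rc == 0);
    rc = pthread_join(card->apdu_thread, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_cond_destroy(&card->cond);
    pthread_mutex_destroy(&card->lock);
}
```

Jumping straight to the common cleanup label without the stop-and-join step
would tear down state that the first thread is still using.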

> +    }
>  
>  out2:
>      clean_event_notifier(card);


* Re: [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat Fei Li
@ 2019-01-07 17:50   ` Markus Armbruster
  2019-01-08 16:18     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:50 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Stefan Hajnoczi

Fei Li <fli@suse.com> writes:

> For iothread_complete: utilize the existing errp to propagate the
> error and do the corresponding cleanup to replace the temporary
> &error_abort.
>
> For qemu_signalfd_compat: add a local_err to hold the error, and
> return the corresponding error code to replace the temporary
> &error_abort.

I'd split the patch.

>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  iothread.c      | 17 +++++++++++------
>  util/compatfd.c | 11 ++++++++---
>  2 files changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/iothread.c b/iothread.c
> index 8e8aa01999..7335dacf0b 100644
> --- a/iothread.c
> +++ b/iothread.c
> @@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>                                  &local_error);
>      if (local_error) {
>          error_propagate(errp, local_error);
> -        aio_context_unref(iothread->ctx);
> -        iothread->ctx = NULL;
> -        return;
> +        goto fail;
>      }
>  
>      qemu_mutex_init(&iothread->init_done_lock);
> @@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>       */
>      name = object_get_canonical_path_component(OBJECT(obj));
>      thread_name = g_strdup_printf("IO %s", name);
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
> -                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
> +    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
> +                            iothread, QEMU_THREAD_JOINABLE, errp)) {
> +        g_free(thread_name);
> +        g_free(name);

I suspect you're missing cleanup here:

           qemu_cond_destroy(&iothread->init_done_cond);
           qemu_mutex_destroy(&iothread->init_done_lock);

But I'm not 100% sure, to be honest.  Stefan, can you help?


> +        goto fail;
> +    }
>      g_free(thread_name);
>      g_free(name);
>  

I'd avoid the code duplication like this:

       thread_ok = qemu_thread_create(&iothread->thread, thread_name,
                                      iothread_run, iothread,
                                      QEMU_THREAD_JOINABLE, errp);
       g_free(thread_name);
       g_free(name);
       if (!thread_ok) {
           qemu_cond_destroy(&iothread->init_done_cond);
           qemu_mutex_destroy(&iothread->init_done_lock);
           goto fail;
       }

Matter of taste.

Hmm, iothread.c has no maintainer.  Stefan, you created it, would you be
willing to serve as maintainer?

> @@ -191,6 +192,10 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>                         &iothread->init_done_lock);
>      }
>      qemu_mutex_unlock(&iothread->init_done_lock);
> +    return;
> +fail:
> +    aio_context_unref(iothread->ctx);
> +    iothread->ctx = NULL;
>  }
>  
>  typedef struct {
> diff --git a/util/compatfd.c b/util/compatfd.c
> index c3d8448264..9cb13381e4 100644
> --- a/util/compatfd.c
> +++ b/util/compatfd.c
> @@ -71,6 +71,7 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>      struct sigfd_compat_info *info;
>      QemuThread thread;
>      int fds[2];
> +    Error *local_err = NULL;
>  
>      info = malloc(sizeof(*info));
>      if (info == NULL) {
> @@ -89,9 +90,13 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>      memcpy(&info->mask, mask, sizeof(*mask));
>      info->fd = fds[1];
>  
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
> -                       info, QEMU_THREAD_DETACHED, &error_abort);
> +    if (!qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
> +                            info, QEMU_THREAD_DETACHED, &local_err)) {
> +        close(fds[0]);
> +        close(fds[1]);
> +        free(info);
> +        return -1;

Leaks @local_err.  Pass NULL instead.
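
A toy model of why, using a made-up SketchError type in place of QEMU's
Error (all names below are hypothetical and only mimic the convention that
a NULL errp means "caller does not want the details"):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Made-up stand-in for an error object; NOT QEMU's Error type. */
typedef struct SketchError {
    char *msg;
} SketchError;

/* Counts error objects that were allocated but not yet freed. */
static int sketch_errors_live;

/* Store an error through errp, or do nothing at all if errp is NULL. */
static void sketch_error_set(SketchError **errp, const char *msg)
{
    SketchError *err;

    if (!errp) {
        return;     /* nothing allocated, so nothing can leak */
    }
    err = malloc(sizeof(*err));
    if (!err) {
        abort();
    }
    err->msg = malloc(strlen(msg) + 1);
    if (!err->msg) {
        abort();
    }
    strcpy(err->msg, msg);
    *errp = err;
    sketch_errors_live++;
}

/* Release an error object obtained through sketch_error_set(). */
static void sketch_error_free(SketchError *err)
{
    if (err) {
        free(err->msg);
        free(err);
        sketch_errors_live--;
    }
}

/* Stand-in for a thread creation call that always fails. */
static bool sketch_create_fails(SketchError **errp)
{
    sketch_error_set(errp, "thread creation failed");
    return false;
}
```

A caller that passes &local_err and then returns without freeing or
reporting it leaves one object live; a caller that passes NULL leaves none.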

> +    }
>  
>      return fds[0];
>  }


* Re: [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread Fei Li
@ 2019-01-07 17:54   ` Markus Armbruster
  2019-01-08 16:24     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:54 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Gerd Hoffmann

Fei Li <fli@suse.com> writes:

> Supplement the error handling for vnc_start_worker_thread: add
> an Error parameter for it to propagate the error to its caller to
> handle in case it fails, and make it return a Boolean to indicate
> whether it succeeds.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  ui/vnc-jobs.c | 17 +++++++++++------
>  ui/vnc-jobs.h |  2 +-
>  ui/vnc.c      |  4 +++-
>  3 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/ui/vnc-jobs.c b/ui/vnc-jobs.c
> index 5712f1f501..35a652d1fd 100644
> --- a/ui/vnc-jobs.c
> +++ b/ui/vnc-jobs.c
> @@ -332,16 +332,21 @@ static bool vnc_worker_thread_running(void)
>      return queue; /* Check global queue */
>  }
>  
> -void vnc_start_worker_thread(void)
> +bool vnc_start_worker_thread(Error **errp)
>  {
>      VncJobQueue *q;
>  
> -    if (vnc_worker_thread_running())
> -        return ;
> +    if (vnc_worker_thread_running()) {
> +        goto out;

Why not simply return true?

> +    }
>  
>      q = vnc_queue_init();
> -    /* TODO: let the further caller handle the error instead of abort() here */
> -    qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
> -                       q, QEMU_THREAD_DETACHED, &error_abort);
> +    if (!qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
> +                            q, QEMU_THREAD_DETACHED, errp)) {
> +        vnc_queue_clear(q);
> +        return false;
> +    }
>      queue = q; /* Set global queue */
> +out:
> +    return true;
>  }
> diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
> index 59f66bcc35..14640593db 100644
> --- a/ui/vnc-jobs.h
> +++ b/ui/vnc-jobs.h
> @@ -37,7 +37,7 @@ void vnc_job_push(VncJob *job);
>  void vnc_jobs_join(VncState *vs);
>  
>  void vnc_jobs_consume_buffer(VncState *vs);
> -void vnc_start_worker_thread(void);
> +bool vnc_start_worker_thread(Error **errp);
>  
>  /* Locks */
>  static inline int vnc_trylock_display(VncDisplay *vd)
> diff --git a/ui/vnc.c b/ui/vnc.c
> index 0c1b477425..0ffe9e6a5d 100644
> --- a/ui/vnc.c
> +++ b/ui/vnc.c
> @@ -3236,7 +3236,9 @@ void vnc_display_init(const char *id, Error **errp)
>      vd->connections_limit = 32;
>  
>      qemu_mutex_init(&vd->mutex);
> -    vnc_start_worker_thread();
> +    if (!vnc_start_worker_thread(errp)) {
> +        return;
> +    }
>  
>      vd->dcl.ops = &dcl_ops;
>      register_displaychangelistener(&vd->dcl);


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault Fei Li
@ 2019-01-07 17:55   ` Markus Armbruster
  2019-01-08 16:50     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 17:55 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, Stefan Weil

Fei Li <fli@suse.com> writes:

> To avoid the segmentation fault in qemu_thread_join(), just directly
> return when the QemuThread *thread failed to be created in either
> qemu-thread-posix.c or qemu-thread-win32.c.
>
> Cc: Stefan Weil <sw@weilnetz.de>
> Signed-off-by: Fei Li <fli@suse.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> ---
>  util/qemu-thread-posix.c | 3 +++
>  util/qemu-thread-win32.c | 2 +-
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index 39834b0551..3548935dac 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>      int err;
>      void *ret;
>  
> +    if (!thread->thread) {
> +        return NULL;
> +    }

How can this happen?

>      err = pthread_join(thread->thread, &ret);
>      if (err) {
>          error_exit(err, __func__);
> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
> index 57b1143e97..ca4d5329e3 100644
> --- a/util/qemu-thread-win32.c
> +++ b/util/qemu-thread-win32.c
> @@ -367,7 +367,7 @@ void *qemu_thread_join(QemuThread *thread)
>      HANDLE handle;
>  
>      data = thread->data;
> -    if (data->mode == QEMU_THREAD_DETACHED) {
> +    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {
>          return NULL;
>      }

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages Fei Li
@ 2019-01-07 18:13   ` Markus Armbruster
  2019-01-09 16:13     ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-07 18:13 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214

Fei Li <fli@suse.com> writes:

> Supplement the error handling for touch_all_pages: add an Error
> parameter for it to propagate the error to its caller to do the
> handling in case it fails.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> ---
>  util/oslib-posix.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> index 251e2f1aea..afc1d99093 100644
> --- a/util/oslib-posix.c
> +++ b/util/oslib-posix.c
> @@ -431,15 +431,17 @@ static inline int get_memset_num_threads(int smp_cpus)
>  }
>  
>  static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
> -                            int smp_cpus)
> +                            int smp_cpus, Error **errp)
>  {
>      size_t numpages_per_thread;
>      size_t size_per_thread;
>      char *addr = area;
>      int i = 0;
> +    int started_thread = 0;
>  
>      memset_thread_failed = false;
>      memset_num_threads = get_memset_num_threads(smp_cpus);
> +    started_thread = memset_num_threads;
>      memset_thread = g_new0(MemsetThread, memset_num_threads);
>      numpages_per_thread = (numpages / memset_num_threads);
>      size_per_thread = (hpagesize * numpages_per_thread);
> @@ -448,14 +450,18 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
>          memset_thread[i].numpages = (i == (memset_num_threads - 1)) ?
>                                      numpages : numpages_per_thread;
>          memset_thread[i].hpagesize = hpagesize;
> -        /* TODO: let the callers handle the error instead of abort() here */
> -        qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
> -                           do_touch_pages, &memset_thread[i],
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
> +                                do_touch_pages, &memset_thread[i],
> +                                QEMU_THREAD_JOINABLE, errp)) {
> +            memset_thread_failed = true;
> +            started_thread = i;
> +            goto out;

break rather than goto, please.

> +        }
>          addr += size_per_thread;
>          numpages -= numpages_per_thread;
>      }
> -    for (i = 0; i < memset_num_threads; i++) {
> +out:
> +    for (i = 0; i < started_thread; i++) {
>          qemu_thread_join(&memset_thread[i].pgthread);
>      }

I don't like how @started_thread is computed.  The name suggests it's
the number of threads started so far.  That's the case when you
initialize it to zero.  But then you immediately set it to
memset_num_threads.  It again becomes the case only when you break the
loop on error, or when you complete it successfully.

There's no need for @started_thread, since the number of threads created
is readily available as @i:

       memset_num_threads = i;
       for (i = 0; i < memset_num_threads; i++) {
           qemu_thread_join(&memset_thread[i].pgthread);
       }

Rest of the function:

>      g_free(memset_thread);
       memset_thread = NULL;

       return memset_thread_failed;
   }

If do_touch_pages() sets memset_thread_failed, we return false without
setting an error.  I believe you should

       if (memset_thread_failed) {
           error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
               "pages available to allocate guest RAM");
           return false;
       }
       return true;

here, and ...

> @@ -471,6 +477,7 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
>      struct sigaction act, oldact;
>      size_t hpagesize = qemu_fd_getpagesize(fd);
>      size_t numpages = DIV_ROUND_UP(memory, hpagesize);
> +    Error *local_err = NULL;
>  
>      memset(&act, 0, sizeof(act));
>      act.sa_handler = &sigbus_handler;
> @@ -484,9 +491,9 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
>      }
>  
>      /* touch pages simultaneously */
> -    if (touch_all_pages(area, hpagesize, numpages, smp_cpus)) {
> -        error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
> -            "pages available to allocate guest RAM");
> +    if (touch_all_pages(area, hpagesize, numpages, smp_cpus, &local_err)) {
> +        error_propagate_prepend(errp, local_err, "os_mem_prealloc: Insufficient"
> +            " free host memory pages available to allocate guest RAM: ");
>      }

... not mess with the error message here, i.e.

       touch_all_pages(area, hpagesize, numpages, smp_cpus, errp);

>  
>      ret = sigaction(SIGBUS, &oldact, NULL);
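
To illustrate the loop structure suggested above (break on the first
creation failure, then join exactly the threads counted by the loop
index), here is a minimal standalone sketch.  It uses plain pthreads
rather than QEMU's qemu_thread_create()/qemu_thread_join(), and the
names worker(), start_and_join() and the @fail_at knob are hypothetical,
existing only to simulate a failure.  It is not the patch itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical worker; stands in for do_touch_pages(). */
static void *worker(void *arg)
{
    int *done = arg;

    *done = 1;
    return NULL;
}

/*
 * Start @nthreads workers.  Creation is treated as failed at index
 * @fail_at (pass -1 to simulate no failure).  Only the threads that
 * were actually started are joined; their number is stored in *joined.
 * Returns true if all threads were started, false otherwise.
 */
static bool start_and_join(int nthreads, int fail_at, int *joined)
{
    pthread_t *tid;
    int *done;
    bool failed = false;
    int i, started;

    assert(nthreads > 0);
    tid = calloc(nthreads, sizeof(*tid));
    done = calloc(nthreads, sizeof(*done));
    if (!tid || !done) {
        free(tid);
        free(done);
        *joined = 0;
        return false;
    }

    for (i = 0; i < nthreads; i++) {
        if (i == fail_at ||
            pthread_create(&tid[i], NULL, worker, &done[i]) != 0) {
            failed = true;
            break;              /* leave the loop, no goto needed */
        }
    }

    started = i;                /* @i is the number of threads created */
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
    }

    *joined = started;
    free(done);
    free(tid);
    return !failed;
}
```

A caller would turn a false return into an error for its own caller;
no separate counter variable is needed, since the loop index already
holds the number of threads to join.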

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-07 17:29   ` Markus Armbruster
@ 2019-01-08  6:14     ` Jiri Slaby
  2019-01-08  6:51       ` Peter Xu
  2019-01-13 15:44     ` Fei Li
  1 sibling, 1 reply; 74+ messages in thread
From: Jiri Slaby @ 2019-01-08  6:14 UTC (permalink / raw)
  To: Markus Armbruster, Fei Li; +Cc: qemu-devel, shirley17fei, lifei1214, peterx

On 07. 01. 19, 18:29, Markus Armbruster wrote:
>    static void pci_edu_uninit(PCIDevice *pdev)
>    {
>        EduState *edu = EDU(pdev);
> 
>        qemu_mutex_lock(&edu->thr_mutex);
>        edu->stopping = true;
>        qemu_mutex_unlock(&edu->thr_mutex);
>        qemu_cond_signal(&edu->thr_cond);
>        qemu_thread_join(&edu->thread);
> 
>        qemu_cond_destroy(&edu->thr_cond);
>        qemu_mutex_destroy(&edu->thr_mutex);
> 
>        timer_del(&edu->dma_timer);
>    }
> 
> Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?

I don't know, the MSI support was added in:
commit eabb5782f70b4a10975b24ccd7129929a05ac932
Author: Peter Xu <peterx@redhat.com>
Date:   Wed Sep 28 21:03:39 2016 +0800

    hw/misc/edu: support MSI interrupt

Hence CCing Peter.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-08  6:14     ` Jiri Slaby
@ 2019-01-08  6:51       ` Peter Xu
  2019-01-08  8:43         ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Xu @ 2019-01-08  6:51 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Markus Armbruster, Fei Li, qemu-devel, shirley17fei, lifei1214

On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote:
> On 07. 01. 19, 18:29, Markus Armbruster wrote:
> >    static void pci_edu_uninit(PCIDevice *pdev)
> >    {
> >        EduState *edu = EDU(pdev);
> > 
> >        qemu_mutex_lock(&edu->thr_mutex);
> >        edu->stopping = true;
> >        qemu_mutex_unlock(&edu->thr_mutex);
> >        qemu_cond_signal(&edu->thr_cond);
> >        qemu_thread_join(&edu->thread);
> > 
> >        qemu_cond_destroy(&edu->thr_cond);
> >        qemu_mutex_destroy(&edu->thr_mutex);
> > 
> >        timer_del(&edu->dma_timer);
> >    }
> > 
> > Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?
> 
> I don't know, the MSI support was added in:
> commit eabb5782f70b4a10975b24ccd7129929a05ac932
> Author: Peter Xu <peterx@redhat.com>
> Date:   Wed Sep 28 21:03:39 2016 +0800
> 
>     hw/misc/edu: support MSI interrupt
> 
> Hence CCing Peter.

Hi, Jiri, Markus, Fei,

IMHO msi_uninit() is optional, since it only operates on the config
space of the device to remove the capability or fix up the flags; it
does not destroy any objects, so nothing will be leaked (unlike
msix_uninit, which should be required).  But I do agree that calling
msi_uninit() would be even nicer here.

Would anyone like to post a patch? Or should I?

Thanks,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-08  6:51       ` Peter Xu
@ 2019-01-08  8:43         ` Markus Armbruster
  2019-01-10 13:29           ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-08  8:43 UTC (permalink / raw)
  To: Peter Xu
  Cc: Jiri Slaby, Fei Li, lifei1214, shirley17fei, qemu-devel,
	Michael S. Tsirkin, Marcel Apfelbaum

Peter Xu <peterx@redhat.com> writes:

> On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote:
>> On 07. 01. 19, 18:29, Markus Armbruster wrote:
>> >    static void pci_edu_uninit(PCIDevice *pdev)
>> >    {
>> >        EduState *edu = EDU(pdev);
>> > 
>> >        qemu_mutex_lock(&edu->thr_mutex);
>> >        edu->stopping = true;
>> >        qemu_mutex_unlock(&edu->thr_mutex);
>> >        qemu_cond_signal(&edu->thr_cond);
>> >        qemu_thread_join(&edu->thread);
>> > 
>> >        qemu_cond_destroy(&edu->thr_cond);
>> >        qemu_mutex_destroy(&edu->thr_mutex);
>> > 
>> >        timer_del(&edu->dma_timer);
>> >    }
>> > 
>> > Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?
>> 
>> I don't know, the MSI support was added in:
>> commit eabb5782f70b4a10975b24ccd7129929a05ac932
>> Author: Peter Xu <peterx@redhat.com>
>> Date:   Wed Sep 28 21:03:39 2016 +0800
>> 
>>     hw/misc/edu: support MSI interrupt
>> 
>> Hence CCing Peter.
>
> Hi, Jiri, Markus, Fei,
>
> IMHO msi_uninit() is optional since it only operates on the config
> space of the device to remove the capability or fix up the flags
> without really doing any real destruction of objects so nothing will
> be leaked (unlike msix_uninit, which should be required).

Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless
bug, or a harmful bug?

>                                                            But I do
> agree that calling msi_uninit() could be even nicer here.
>
> Anyone would like to post a patch? Or should I?

Please coordinate fixing this with Fei Li.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2019-01-07 17:18   ` Markus Armbruster
@ 2019-01-08 15:55     ` fei
  2019-01-08 17:07       ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: fei @ 2019-01-08 15:55 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, qemu-devel, shirley17fei, Paolo Bonzini



> On Jan 8, 2019, at 01:18, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> qemu_thread_create() abort()s on error. Not nice. Give it a return
>> value and an Error ** argument, so it can return success/failure.
>> 
>> Considering qemu_thread_create() is quite widely used in qemu, split
>> this into two steps: this patch passes the &error_abort to
>> qemu_thread_create() everywhere, and the next 9 patches will improve
>> on &error_abort for callers who need.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
> 
> The commit message's title promises more than the patch delivers.
> Suggest:
> 
>    qemu_thread: Make qemu_thread_create() take Error ** argument
Ok, thanks for the suggestion. :)
> 
> The rest of the commit message is fine.
> 
>> ---
>> cpus.c                      | 23 +++++++++++++++--------
>> dump.c                      |  3 ++-
>> hw/misc/edu.c               |  4 +++-
>> hw/ppc/spapr_hcall.c        |  4 +++-
>> hw/rdma/rdma_backend.c      |  3 ++-
>> hw/usb/ccid-card-emulated.c |  5 +++--
>> include/qemu/thread.h       |  4 ++--
>> io/task.c                   |  3 ++-
>> iothread.c                  |  3 ++-
>> migration/migration.c       | 11 ++++++++---
>> migration/postcopy-ram.c    |  4 +++-
>> migration/ram.c             | 12 ++++++++----
>> migration/savevm.c          |  3 ++-
>> tests/atomic_add-bench.c    |  3 ++-
>> tests/iothread.c            |  2 +-
>> tests/qht-bench.c           |  3 ++-
>> tests/rcutorture.c          |  3 ++-
>> tests/test-aio.c            |  2 +-
>> tests/test-rcu-list.c       |  3 ++-
>> ui/vnc-jobs.c               |  6 ++++--
>> util/compatfd.c             |  6 ++++--
>> util/oslib-posix.c          |  3 ++-
>> util/qemu-thread-posix.c    | 27 ++++++++++++++++++++-------
>> util/qemu-thread-win32.c    | 16 ++++++++++++----
>> util/rcu.c                  |  3 ++-
>> util/thread-pool.c          |  4 +++-
>> 26 files changed, 112 insertions(+), 51 deletions(-)
>> 
>> diff --git a/cpus.c b/cpus.c
>> index 0ddeeefc14..25df03326b 100644
>> --- a/cpus.c
>> +++ b/cpus.c
>> @@ -1961,15 +1961,17 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>                  cpu->cpu_index);
>> 
>> +            /* TODO: let the callers handle the error instead of abort() here */
>>             qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
>> -                               cpu, QEMU_THREAD_JOINABLE);
>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>> 
>>         } else {
>>             /* share a single thread for all cpus with TCG */
>>             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
>> +            /* TODO: let the callers handle the error instead of abort() here */
>>             qemu_thread_create(cpu->thread, thread_name,
>>                                qemu_tcg_rr_cpu_thread_fn,
>> -                               cpu, QEMU_THREAD_JOINABLE);
>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>> 
>>             single_tcg_halt_cond = cpu->halt_cond;
>>             single_tcg_cpu_thread = cpu->thread;
> 
> You add this TODO comment to 24 out of 37 calls.  Can you give your
> reasons for adding it to some calls, but not to others?
For the calls that have a TODO, I polish them in the following patches; the ones without a TODO simply keep using &error_abort.
> 
> [...]
>> diff --git a/include/qemu/thread.h b/include/qemu/thread.h
>> index 55d83a907c..12291f4ccd 100644
>> --- a/include/qemu/thread.h
>> +++ b/include/qemu/thread.h
>> @@ -152,9 +152,9 @@ void qemu_event_reset(QemuEvent *ev);
>> void qemu_event_wait(QemuEvent *ev);
>> void qemu_event_destroy(QemuEvent *ev);
>> 
>> -void qemu_thread_create(QemuThread *thread, const char *name,
>> +bool qemu_thread_create(QemuThread *thread, const char *name,
>>                         void *(*start_routine)(void *),
>> -                        void *arg, int mode);
>> +                        void *arg, int mode, Error **errp);
>> void *qemu_thread_join(QemuThread *thread);
>> void qemu_thread_get_self(QemuThread *thread);
>> bool qemu_thread_is_self(QemuThread *thread);
> [...]
>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>> index 865e476df5..39834b0551 100644
>> --- a/util/qemu-thread-posix.c
>> +++ b/util/qemu-thread-posix.c
>> @@ -15,6 +15,7 @@
>> #include "qemu/atomic.h"
>> #include "qemu/notify.h"
>> #include "qemu-thread-common.h"
>> +#include "qapi/error.h"
>> 
>> static bool name_threads;
>> 
>> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>>     return r;
>> }
>> 
>> -void qemu_thread_create(QemuThread *thread, const char *name,
>> -                       void *(*start_routine)(void*),
>> -                       void *arg, int mode)
>> +/*
>> + * Return a boolean: true/false to indicate whether it succeeds.
>> + * If fails, propagate the error to Error **errp and set the errno.
>> + */
> 
> Let's write something that can pass as a function contract:
> 
>   /*
>    * Create a new thread with name @name.
>    * The thread executes @start_routine() with argument @arg.
>    * The thread will be created in a detached state if @mode is
>    * QEMU_THREAD_DETACHED, and in a joinable state if it's
>    * QEMU_THREAD_JOINABLE.
>    * On success, return true.
>    * On failure, set @errno, store an error through @errp and return
>    * false.
>    */
Thanks so much for amending! :)
> Personally, I'd return negative errno instead of false, and dispense
> with setting errno.
Emm, I think I replied to this in the last version, but for several reasons I did not wait for your feedback before sending v9. Sorry for that. Let me paste my two considerations here:
“- Actually only one caller needs the errno, namely the above qemu_signalfd_compat().
- As for the return value, I remember there was once an email thread discussing it: returning a bool (and letting the passed errp hold the error message) keeps consistency with glib.
”
So IMO I wonder whether it is really necessary to change the bool (true/false) to an int (0/-errno) just for that sole function, qemu_signalfd_compat(), which needs the errno. Besides, if we return -errno, each caller needs to add a local variable like “ret = qemu_thread_create()” to store the -errno.

Have a nice day, thanks
Fei
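
For reference, the two return conventions under discussion can be put
side by side in a small standalone sketch.  The Error struct and
set_error() below are hypothetical stand-ins, not QEMU's real Error API,
and create_bool()/create_errno() merely simulate a failing thread
creation with a caller-chosen error number.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal stand-in for QEMU's Error; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

/* Store a message through @errp if the caller supplied one. */
static void set_error(Error **errp, int err, const char *what)
{
    if (!errp) {
        return;
    }
    *errp = malloc(sizeof(**errp));
    if (*errp) {
        snprintf((*errp)->msg, sizeof((*errp)->msg), "%s: %s",
                 what, strerror(err));
    }
}

/* Convention 1: return a bool and leave the error number in errno. */
static bool create_bool(int simulated_err, Error **errp)
{
    if (simulated_err) {
        set_error(errp, simulated_err, "create failed");
        errno = simulated_err;  /* set last, so nothing clobbers it */
        return false;
    }
    return true;
}

/* Convention 2: return 0 or a negative errno; errno is not consulted. */
static int create_errno(int simulated_err, Error **errp)
{
    if (simulated_err) {
        set_error(errp, simulated_err, "create failed");
        return -simulated_err;
    }
    return 0;
}
```

With the first convention, most callers only test the bool and the one
caller that needs the number reads errno; with the second, a caller
that needs the number keeps the return value in a local variable.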
> 
>> +bool qemu_thread_create(QemuThread *thread, const char *name,
>> +                        void *(*start_routine)(void *),
>> +                        void *arg, int mode, Error **errp)
>> {
>>     sigset_t set, oldset;
>>     int err;
>> @@ -511,7 +516,9 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>> 
>>     err = pthread_attr_init(&attr);
>>     if (err) {
>> -        error_exit(err, __func__);
>> +        errno = err;
>> +        error_setg_errno(errp, errno, "pthread_attr_init failed");
>> +        return false;
>>     }
>> 
>>     if (mode == QEMU_THREAD_DETACHED) {
>> @@ -529,13 +536,19 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>> 
>>     err = pthread_create(&thread->thread, &attr,
>>                          qemu_thread_start, qemu_thread_args);
>> -
>> -    if (err)
>> -        error_exit(err, __func__);
>> +    if (err) {
>> +        errno = err;
>> +        error_setg_errno(errp, errno, "pthread_create failed");
>> +        pthread_attr_destroy(&attr);
>> +        g_free(qemu_thread_args->name);
>> +        g_free(qemu_thread_args);
>> +        return false;
>> +    }
>> 
>>     pthread_sigmask(SIG_SETMASK, &oldset, NULL);
>> 
>>     pthread_attr_destroy(&attr);
>> +    return true;
>> }
>> 
>> void qemu_thread_get_self(QemuThread *thread)
>> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
>> index 4a363ca675..57b1143e97 100644
>> --- a/util/qemu-thread-win32.c
>> +++ b/util/qemu-thread-win32.c
>> @@ -20,6 +20,7 @@
>> #include "qemu/thread.h"
>> #include "qemu/notify.h"
>> #include "qemu-thread-common.h"
>> +#include "qapi/error.h"
>> #include <process.h>
>> 
>> static bool name_threads;
>> @@ -388,9 +389,9 @@ void *qemu_thread_join(QemuThread *thread)
>>     return ret;
>> }
>> 
>> -void qemu_thread_create(QemuThread *thread, const char *name,
>> -                       void *(*start_routine)(void *),
>> -                       void *arg, int mode)
>> +bool qemu_thread_create(QemuThread *thread, const char *name,
>> +                        void *(*start_routine)(void *),
>> +                        void *arg, int mode, Error **errp)
>> {
>>     HANDLE hThread;
>>     struct QemuThreadData *data;
>> @@ -409,10 +410,17 @@ void qemu_thread_create(QemuThread *thread, const char *name,
>>     hThread = (HANDLE) _beginthreadex(NULL, 0, win32_start_routine,
>>                                       data, 0, &thread->tid);
>>     if (!hThread) {
>> -        error_exit(GetLastError(), __func__);
>> +        if (data->mode != QEMU_THREAD_DETACHED) {
>> +            DeleteCriticalSection(&data->cs);
>> +        }
>> +        error_setg_errno(errp, errno,
>> +                         "failed to create win32_start_routine");
>> +        g_free(data);
>> +        return false;
>>     }
>>     CloseHandle(hThread);
>>     thread->data = data;
>> +    return true;
>> }
>> 
>> void qemu_thread_get_self(QemuThread *thread)
> [...]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup
  2019-01-07 16:50   ` Markus Armbruster
@ 2019-01-08 15:58     ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-08 15:58 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Fei Li, qemu-devel, shirley17fei, Dr . David Alan Gilbert, Juan Quintela



> On Jan 8, 2019, at 00:50, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> Always call migrate_set_error() to set the error state without relying
>> on whether multifd_save_cleanup() succeeds.  As the passed &local_err
>> is never used in multifd_save_cleanup(), remove it. And make the
>> function be: void multifd_save_cleanup(void).
>> 
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> Reviewed-by: Juan Quintela <quintela@redhat.com>
> 
> The commit message is confusing.  Suggest:
> 
>    migration: multifd_save_cleanup() can't fail, simplify
> 
>    multifd_save_cleanup() takes an Error ** argument and returns an
>    error code even though it can't actually fail.  Its callers
>    dutifully check for failure.  Remove the useless argument and return
>    value, and simplify the callers.
Nice, thanks for the clearer comment. :)

Have a nice day 
Fei
> 
> I think multifd_load_cleanup() has exactly the same issue.  Should we
> clean it up, too?  Juan, what do you think?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory
  2019-01-07 17:21   ` Markus Armbruster
@ 2019-01-08 16:00     ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-08 16:00 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Fei Li, qemu-devel, shirley17fei, Marc-André Lureau



> On Jan 8, 2019, at 01:21, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> Utilize the existing errp to propagate the error instead of the
>> temporary &error_abort.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>> dump.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>> 
>> diff --git a/dump.c b/dump.c
>> index c35d6ddd22..ef5ea324fa 100644
>> --- a/dump.c
>> +++ b/dump.c
>> @@ -2020,9 +2020,10 @@ void qmp_dump_guest_memory(bool paging, const char *file,
>>     if (detach_p) {
>>         /* detached dump */
>>         s->detached = true;
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
>> -                           s, QEMU_THREAD_DETACHED, &error_abort);
>> +        if (!qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
>> +                           s, QEMU_THREAD_DETACHED, errp)) {
>> +            /* keep 'if' here in case there is further error handling logic */
>> +        }
> 
> I don't think keeping the conditional "just in case" is worthwhile.
> Plain
> 
>           qemu_thread_create(&s->dump_thread, "dump_thread", dump_thread,
>                              s, QEMU_THREAD_DETACHED, errp);
> 
> should do fine.
Ok, will simplify this in the next version.

Have a nice day, thanks
Fei
> 
>>     } else {
>>         /* sync dump */
>>         dump_process(s, errp);

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2019-01-07 17:50   ` Markus Armbruster
@ 2019-01-08 16:18     ` fei
  2019-01-13 16:16       ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: fei @ 2019-01-08 16:18 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, qemu-devel, shirley17fei, Stefan Hajnoczi



> On Jan 8, 2019, at 01:50, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> For iothread_complete: utilize the existing errp to propagate the
>> error and do the corresponding cleanup to replace the temporary
>> &error_abort.
>> 
>> For qemu_signalfd_compat: add a local_err to hold the error, and
>> return the corresponding error code to replace the temporary
>> &error_abort.
> 
> I'd split the patch.
Ok.
> 
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>> iothread.c      | 17 +++++++++++------
>> util/compatfd.c | 11 ++++++++---
>> 2 files changed, 19 insertions(+), 9 deletions(-)
>> 
>> diff --git a/iothread.c b/iothread.c
>> index 8e8aa01999..7335dacf0b 100644
>> --- a/iothread.c
>> +++ b/iothread.c
>> @@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>                                 &local_error);
>>     if (local_error) {
>>         error_propagate(errp, local_error);
>> -        aio_context_unref(iothread->ctx);
>> -        iothread->ctx = NULL;
>> -        return;
>> +        goto fail;
>>     }
>> 
>>     qemu_mutex_init(&iothread->init_done_lock);
>> @@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>      */
>>     name = object_get_canonical_path_component(OBJECT(obj));
>>     thread_name = g_strdup_printf("IO %s", name);
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>> -                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>> +                            iothread, QEMU_THREAD_JOINABLE, errp)) {
>> +        g_free(thread_name);
>> +        g_free(name);
> 
> I suspect you're missing cleanup here:
> 
>           qemu_cond_destroy(&iothread->init_done_cond);
>           qemu_mutex_destroy(&iothread->init_done_lock);
I remember checking the code: when ucc->complete() fails, a finalize() function does the destroy. But I did not test all the callers, so let’s wait for Stefan’s feedback. :)
> 
> But I'm not 100% sure, to be honest.  Stefan, can you help?
> 
> 
>> +        goto fail;
>> +    }
>>     g_free(thread_name);
>>     g_free(name);
>> 
> 
> I'd avoid the code duplication like this:
> 
>       thread_ok = qemu_thread_create(&iothread->thread, thread_name,
>                                      iothread_run, iothread,
>                                      QEMU_THREAD_JOINABLE, errp);
>       g_free(thread_name);
>       g_free(name);
>       if (!thread_ok) {
>           qemu_cond_destroy(&iothread->init_done_cond);
>           qemu_mutex_destroy(&iothread->init_done_lock);
>           goto fail;
>       }
> 
> Matter of taste.
> 
> Hmm, iothread.c has no maintainer.  Stefan, you created it, would you be
> willing to serve as maintainer?
> 
>> @@ -191,6 +192,10 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>                        &iothread->init_done_lock);
>>     }
>>     qemu_mutex_unlock(&iothread->init_done_lock);
>> +    return;
>> +fail:
>> +    aio_context_unref(iothread->ctx);
>> +    iothread->ctx = NULL;
>> }
>> 
>> typedef struct {
>> diff --git a/util/compatfd.c b/util/compatfd.c
>> index c3d8448264..9cb13381e4 100644
>> --- a/util/compatfd.c
>> +++ b/util/compatfd.c
>> @@ -71,6 +71,7 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>>     struct sigfd_compat_info *info;
>>     QemuThread thread;
>>     int fds[2];
>> +    Error *local_err = NULL;
>> 
>>     info = malloc(sizeof(*info));
>>     if (info == NULL) {
>> @@ -89,9 +90,13 @@ static int qemu_signalfd_compat(const sigset_t *mask)
>>     memcpy(&info->mask, mask, sizeof(*mask));
>>     info->fd = fds[1];
>> 
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
>> -                       info, QEMU_THREAD_DETACHED, &error_abort);
>> +    if (!qemu_thread_create(&thread, "signalfd_compat", sigwait_compat,
>> +                            info, QEMU_THREAD_DETACHED, &local_err)) {
>> +        close(fds[0]);
>> +        close(fds[1]);
>> +        free(info);
>> +        return -1;
> 
> Leaks @local_err.  Pass NULL instead.
Ok, thanks!

Have a nice day 
Fei
> 
>> +    }
>> 
>>     return fds[0];
>> }
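
The point about @local_err can be shown with a small standalone sketch.
Everything here is hypothetical: Error, set_error(), free_error() and
always_fails() are simplified stand-ins and not QEMU's real API.  The
assumption being illustrated is only that an error object is allocated
when the caller passes a non-NULL pointer, so a caller that does not
want the error either frees it or passes NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-in for QEMU's Error; not the real type. */
typedef struct Error {
    const char *msg;
} Error;

static int live_errors;         /* Error objects allocated and not freed */

/* Allocate an error only if the caller asked for one. */
static void set_error(Error **errp, const char *msg)
{
    if (!errp) {
        return;                 /* caller is not interested */
    }
    *errp = malloc(sizeof(**errp));
    assert(*errp);
    (*errp)->msg = msg;
    live_errors++;
}

static void free_error(Error *err)
{
    if (err) {
        free(err);
        live_errors--;
    }
}

/* Hypothetical operation that always fails. */
static bool always_fails(Error **errp)
{
    set_error(errp, "simulated failure");
    return false;
}

/* Collects the error locally, so it has to free it before returning. */
static int caller_with_local_err(void)
{
    Error *local_err = NULL;

    if (!always_fails(&local_err)) {
        free_error(local_err);  /* omitting this would leak the object */
        return -1;
    }
    return 0;
}

/* Ignores the error details: passing NULL leaves nothing to free. */
static int caller_with_null(void)
{
    if (!always_fails(NULL)) {
        return -1;
    }
    return 0;
}
```

Both callers return -1 without leaving an error object behind; the
NULL variant is simply shorter when the message is never used.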

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread
  2019-01-07 17:54   ` Markus Armbruster
@ 2019-01-08 16:24     ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-08 16:24 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, qemu-devel, shirley17fei, Gerd Hoffmann



> On Jan 8, 2019, at 01:54, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> Supplement the error handling for vnc_start_worker_thread: add
>> an Error parameter for it to propagate the error to its caller to
>> handle in case it fails, and make it return a Boolean to indicate
>> whether it succeeds.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>> ui/vnc-jobs.c | 17 +++++++++++------
>> ui/vnc-jobs.h |  2 +-
>> ui/vnc.c      |  4 +++-
>> 3 files changed, 15 insertions(+), 8 deletions(-)
>> 
>> diff --git a/ui/vnc-jobs.c b/ui/vnc-jobs.c
>> index 5712f1f501..35a652d1fd 100644
>> --- a/ui/vnc-jobs.c
>> +++ b/ui/vnc-jobs.c
>> @@ -332,16 +332,21 @@ static bool vnc_worker_thread_running(void)
>>     return queue; /* Check global queue */
>> }
>> 
>> -void vnc_start_worker_thread(void)
>> +bool vnc_start_worker_thread(Error **errp)
>> {
>>     VncJobQueue *q;
>> 
>> -    if (vnc_worker_thread_running())
>> -        return ;
>> +    if (vnc_worker_thread_running()) {
>> +        goto out;
> 
> Why not simply return true?
Sounds right.. Will remove the below “out:” too.

Have a nice day, thanks
Fei

> 
>> +    }
>> 
>>     q = vnc_queue_init();
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
>> -                       q, QEMU_THREAD_DETACHED, &error_abort);
>> +    if (!qemu_thread_create(&q->thread, "vnc_worker", vnc_worker_thread,
>> +                            q, QEMU_THREAD_DETACHED, errp)) {
>> +        vnc_queue_clear(q);
>> +        return false;
>> +    }
>>     queue = q; /* Set global queue */
>> +out:
>> +    return true;
>> }
>> diff --git a/ui/vnc-jobs.h b/ui/vnc-jobs.h
>> index 59f66bcc35..14640593db 100644
>> --- a/ui/vnc-jobs.h
>> +++ b/ui/vnc-jobs.h
>> @@ -37,7 +37,7 @@ void vnc_job_push(VncJob *job);
>> void vnc_jobs_join(VncState *vs);
>> 
>> void vnc_jobs_consume_buffer(VncState *vs);
>> -void vnc_start_worker_thread(void);
>> +bool vnc_start_worker_thread(Error **errp);
>> 
>> /* Locks */
>> static inline int vnc_trylock_display(VncDisplay *vd)
>> diff --git a/ui/vnc.c b/ui/vnc.c
>> index 0c1b477425..0ffe9e6a5d 100644
>> --- a/ui/vnc.c
>> +++ b/ui/vnc.c
>> @@ -3236,7 +3236,9 @@ void vnc_display_init(const char *id, Error **errp)
>>     vd->connections_limit = 32;
>> 
>>     qemu_mutex_init(&vd->mutex);
>> -    vnc_start_worker_thread();
>> +    if (!vnc_start_worker_thread(errp)) {
>> +        return;
>> +    }
>> 
>>     vd->dcl.ops = &dcl_ops;
>>     register_displaychangelistener(&vd->dcl);


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-07 17:55   ` Markus Armbruster
@ 2019-01-08 16:50     ` fei
  2019-01-08 17:29       ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: fei @ 2019-01-08 16:50 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, qemu-devel, shirley17fei, Stefan Weil



> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> To avoid the segmentation fault in qemu_thread_join(), just directly
>> return when the QemuThread *thread failed to be created in either
>> qemu-thread-posix.c or qemu-thread-win32.c.
>> 
>> Cc: Stefan Weil <sw@weilnetz.de>
>> Signed-off-by: Fei Li <fli@suse.com>
>> Reviewed-by: Fam Zheng <famz@redhat.com>
>> ---
>> util/qemu-thread-posix.c | 3 +++
>> util/qemu-thread-win32.c | 2 +-
>> 2 files changed, 4 insertions(+), 1 deletion(-)
>> 
>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>> index 39834b0551..3548935dac 100644
>> --- a/util/qemu-thread-posix.c
>> +++ b/util/qemu-thread-posix.c
>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>     int err;
>>     void *ret;
>> 
>> +    if (!thread->thread) {
>> +        return NULL;
>> +    }
> 
> How can this happen?
I think I answered this earlier; please check the following link to see whether it helps:
http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html

Have a nice day, thanks
Fei
> 
>>     err = pthread_join(thread->thread, &ret);
>>     if (err) {
>>         error_exit(err, __func__);
>> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
>> index 57b1143e97..ca4d5329e3 100644
>> --- a/util/qemu-thread-win32.c
>> +++ b/util/qemu-thread-win32.c
>> @@ -367,7 +367,7 @@ void *qemu_thread_join(QemuThread *thread)
>>     HANDLE handle;
>> 
>>     data = thread->data;
>> -    if (data->mode == QEMU_THREAD_DETACHED) {
>> +    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {
>>         return NULL;
>>     }


* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2019-01-08 15:55     ` fei
@ 2019-01-08 17:07       ` Markus Armbruster
  2019-01-09 13:19         ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-08 17:07 UTC (permalink / raw)
  To: fei; +Cc: Markus Armbruster, Fei Li, Paolo Bonzini, qemu-devel, shirley17fei

fei <lifei1214@126.com> writes:

>> On Jan 8, 2019, at 01:18, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Fei Li <fli@suse.com> writes:
>> 
>>> qemu_thread_create() abort()s on error. Not nice. Give it a return
>>> value and an Error ** argument, so it can return success/failure.
>>> 
>>> Considering qemu_thread_create() is quite widely used in qemu, split
>>> this into two steps: this patch passes the &error_abort to
>>> qemu_thread_create() everywhere, and the next 9 patches will improve
>>> on &error_abort for callers that need it.
>>> 
>>> Cc: Markus Armbruster <armbru@redhat.com>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Fei Li <fli@suse.com>
>> 
>> The commit message's title promises more than the patch delivers.
>> Suggest:
>> 
>>    qemu_thread: Make qemu_thread_create() take Error ** argument
> Ok, thanks for the suggestion. :)
>> 
>> The rest of the commit message is fine.
>> 
>>> ---
>>> cpus.c                      | 23 +++++++++++++++--------
>>> dump.c                      |  3 ++-
>>> hw/misc/edu.c               |  4 +++-
>>> hw/ppc/spapr_hcall.c        |  4 +++-
>>> hw/rdma/rdma_backend.c      |  3 ++-
>>> hw/usb/ccid-card-emulated.c |  5 +++--
>>> include/qemu/thread.h       |  4 ++--
>>> io/task.c                   |  3 ++-
>>> iothread.c                  |  3 ++-
>>> migration/migration.c       | 11 ++++++++---
>>> migration/postcopy-ram.c    |  4 +++-
>>> migration/ram.c             | 12 ++++++++----
>>> migration/savevm.c          |  3 ++-
>>> tests/atomic_add-bench.c    |  3 ++-
>>> tests/iothread.c            |  2 +-
>>> tests/qht-bench.c           |  3 ++-
>>> tests/rcutorture.c          |  3 ++-
>>> tests/test-aio.c            |  2 +-
>>> tests/test-rcu-list.c       |  3 ++-
>>> ui/vnc-jobs.c               |  6 ++++--
>>> util/compatfd.c             |  6 ++++--
>>> util/oslib-posix.c          |  3 ++-
>>> util/qemu-thread-posix.c    | 27 ++++++++++++++++++++-------
>>> util/qemu-thread-win32.c    | 16 ++++++++++++----
>>> util/rcu.c                  |  3 ++-
>>> util/thread-pool.c          |  4 +++-
>>> 26 files changed, 112 insertions(+), 51 deletions(-)
>>> 
>>> diff --git a/cpus.c b/cpus.c
>>> index 0ddeeefc14..25df03326b 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -1961,15 +1961,17 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>                  cpu->cpu_index);
>>> 
>>> +            /* TODO: let the callers handle the error instead of abort() here */
>>>             qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
>>> -                               cpu, QEMU_THREAD_JOINABLE);
>>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>>> 
>>>         } else {
>>>             /* share a single thread for all cpus with TCG */
>>>             snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
>>> +            /* TODO: let the callers handle the error instead of abort() here */
>>>             qemu_thread_create(cpu->thread, thread_name,
>>>                                qemu_tcg_rr_cpu_thread_fn,
>>> -                               cpu, QEMU_THREAD_JOINABLE);
>>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>>> 
>>>             single_tcg_halt_cond = cpu->halt_cond;
>>>             single_tcg_cpu_thread = cpu->thread;
>> 
>> You add this TODO comment to 24 out of 37 calls.  Can you give your
>> reasons for adding it to some calls, but not to others?
> The calls that have a TODO are improved in the following patches; the calls without a TODO simply keep using &error_abort.

Please mention that in the commit message.

>> 
>> [...]
>>> diff --git a/include/qemu/thread.h b/include/qemu/thread.h
>>> index 55d83a907c..12291f4ccd 100644
>>> --- a/include/qemu/thread.h
>>> +++ b/include/qemu/thread.h
>>> @@ -152,9 +152,9 @@ void qemu_event_reset(QemuEvent *ev);
>>> void qemu_event_wait(QemuEvent *ev);
>>> void qemu_event_destroy(QemuEvent *ev);
>>> 
>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>> +bool qemu_thread_create(QemuThread *thread, const char *name,
>>>                         void *(*start_routine)(void *),
>>> -                        void *arg, int mode);
>>> +                        void *arg, int mode, Error **errp);
>>> void *qemu_thread_join(QemuThread *thread);
>>> void qemu_thread_get_self(QemuThread *thread);
>>> bool qemu_thread_is_self(QemuThread *thread);
>> [...]
>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>> index 865e476df5..39834b0551 100644
>>> --- a/util/qemu-thread-posix.c
>>> +++ b/util/qemu-thread-posix.c
>>> @@ -15,6 +15,7 @@
>>> #include "qemu/atomic.h"
>>> #include "qemu/notify.h"
>>> #include "qemu-thread-common.h"
>>> +#include "qapi/error.h"
>>> 
>>> static bool name_threads;
>>> 
>>> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>>>     return r;
>>> }
>>> 
>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>> -                       void *(*start_routine)(void*),
>>> -                       void *arg, int mode)
>>> +/*
>>> + * Return a boolean: true/false to indicate whether it succeeds.
>>> + * If fails, propagate the error to Error **errp and set the errno.
>>> + */
>> 
>> Let's write something that can pass as a function contract:
>> 
>>   /*
>>    * Create a new thread with name @name
>>    * The thread executes @start_routine() with argument @arg.
>>    * The thread will be created in a detached state if @mode is
>>    * QEMU_THREAD_DETACHED, and in a joinable state if it's
>>    * QEMU_THREAD_JOINABLE.
>>    * On success, return true.
>>    * On failure, set @errno, store an error through @errp and return
>>    * false.
>>    */
> Thanks so much for amending! :)
>> Personally, I'd return negative errno instead of false, and dispense
>> with setting errno.
> Hmm, I think I replied to this in the last version, but for several reasons I sent v9 without waiting for your feedback. Sorry about that. Let me paste my two considerations here:
> “- Actually, only one caller needs the errno: qemu_signalfd_compat() above.

Yes.

> - For the return value, I remember an email thread that discussed it: returning a bool (and letting the passed errp hold the error message) keeps us consistent with GLib.

GLib doesn't discourage return types other than boolean.  It only asks
that if you return boolean, then true should mean success and false
should mean failure.  See

    https://developer.gnome.org/glib/stable/glib-Error-Reporting.html

under "Rules for use of GError", item "By convention, if you return a
boolean value".

The discussion you remember was about a convention we used to enforce in
QEMU, namely to avoid returning boolean success, and return void
instead.  That was basically a bad idea.

> So I wonder whether we really need to change the bool (true/false) to an int (0/-errno) just for the sole function that needs the errno, qemu_signalfd_compat(). Besides, if we return -errno, each caller needs a local variable, as in "ret = qemu_thread_create()", to store the -errno.

Well, you either assign the error code to errno just for that caller, or
you return the error code just for that caller.  I'd do the latter
because I consider it slightly simpler.  Compare

 * On success, return true.
 * On failure, set @errno, store an error through @errp and return
 * false.

to

 * On success, return zero.
 * On failure, store an error through @errp and return negative errno.

where the second sentence describes just two instead of three actions.
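
To make the comparison concrete, here is a minimal sketch of both contracts
side by side. It uses plain pthreads rather than QEMU's wrappers, a
`const char **` stands in for `Error **`, and the names `sketch_create_bool`,
`sketch_create_errno` and `sketch_worker` are hypothetical. It is an
illustration under those assumptions, not the code from the patch.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Trivial thread body, used only to have something to run. */
static void *sketch_worker(void *arg)
{
    return arg;
}

/*
 * Contract 1 (hypothetical stand-in for the v9 behaviour):
 * On success, return true.
 * On failure, set errno, store a message through @errp and return false.
 */
static bool sketch_create_bool(pthread_t *thread,
                               void *(*start_routine)(void *),
                               void *arg, const char **errp)
{
    /* pthread_create() returns the error number; it does not set errno. */
    int err = pthread_create(thread, NULL, start_routine, arg);

    if (err) {
        errno = err;                          /* action 1 */
        if (errp) {
            *errp = "thread creation failed"; /* action 2 */
        }
        return false;                         /* action 3 */
    }
    return true;
}

/*
 * Contract 2:
 * On success, return zero.
 * On failure, store a message through @errp and return negative errno.
 */
static int sketch_create_errno(pthread_t *thread,
                               void *(*start_routine)(void *),
                               void *arg, const char **errp)
{
    int err = pthread_create(thread, NULL, start_routine, arg);

    if (err) {
        if (errp) {
            *errp = "thread creation failed"; /* action 1 */
        }
        return -err;                          /* action 2 */
    }
    return 0;
}
```

With the second form, a caller that needs the error code reads it from the
return value, and nobody has to reason about whether errno is still intact.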

[...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-08 16:50     ` fei
@ 2019-01-08 17:29       ` Markus Armbruster
  2019-01-09 14:01         ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-08 17:29 UTC (permalink / raw)
  To: fei; +Cc: Stefan Weil, qemu-devel, shirley17fei

fei <lifei1214@126.com> writes:

>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Fei Li <fli@suse.com> writes:
>> 
>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>> return when the QemuThread *thread failed to be created in either
>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>> 
>>> Cc: Stefan Weil <sw@weilnetz.de>
>>> Signed-off-by: Fei Li <fli@suse.com>
>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>> ---
>>> util/qemu-thread-posix.c | 3 +++
>>> util/qemu-thread-win32.c | 2 +-
>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>> index 39834b0551..3548935dac 100644
>>> --- a/util/qemu-thread-posix.c
>>> +++ b/util/qemu-thread-posix.c
>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>     int err;
>>>     void *ret;
>>> 
>>> +    if (!thread->thread) {
>>> +        return NULL;
>>> +    }
>> 
>> How can this happen?
> I think I answered this earlier; please check the following link to see whether it helps:
> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html

Thanks for the pointer.  Unfortunately, I don't understand your
explanation.  You also wrote there "I will remove this patch in next
version"; looks like you've since changed your mind.

What exactly breaks if we omit this patch?  Assuming something does
break: imagine we did omit this patch, then forgot we ever saw it, and
now you've discovered the breakage.  Write us the bug report, complete
with reproducer.

[...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2019-01-08 17:07       ` Markus Armbruster
@ 2019-01-09 13:19         ` Fei Li
  2019-01-09 14:36           ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-09 13:19 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, Paolo Bonzini, qemu-devel, shirley17fei


On 2019/1/9 at 1:07 AM, Markus Armbruster wrote:
> fei <lifei1214@126.com> writes:
>
>>> On Jan 8, 2019, at 01:18, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Fei Li <fli@suse.com> writes:
>>>
>>>> qemu_thread_create() abort()s on error. Not nice. Give it a return
>>>> value and an Error ** argument, so it can return success/failure.
>>>>
>>>> Considering qemu_thread_create() is quite widely used in qemu, split
>>>> this into two steps: this patch passes the &error_abort to
>>>> qemu_thread_create() everywhere, and the next 9 patches will improve
>>>> on &error_abort for callers that need it.
>>>>
>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Signed-off-by: Fei Li <fli@suse.com>
>>> The commit message's title promises more than the patch delivers.
>>> Suggest:
>>>
>>>     qemu_thread: Make qemu_thread_create() take Error ** argument
>> Ok, thanks for the suggestion. :)
>>> The rest of the commit message is fine.
>>>
>>>> ---
>>>> cpus.c                      | 23 +++++++++++++++--------
>>>> dump.c                      |  3 ++-
>>>> hw/misc/edu.c               |  4 +++-
>>>> hw/ppc/spapr_hcall.c        |  4 +++-
>>>> hw/rdma/rdma_backend.c      |  3 ++-
>>>> hw/usb/ccid-card-emulated.c |  5 +++--
>>>> include/qemu/thread.h       |  4 ++--
>>>> io/task.c                   |  3 ++-
>>>> iothread.c                  |  3 ++-
>>>> migration/migration.c       | 11 ++++++++---
>>>> migration/postcopy-ram.c    |  4 +++-
>>>> migration/ram.c             | 12 ++++++++----
>>>> migration/savevm.c          |  3 ++-
>>>> tests/atomic_add-bench.c    |  3 ++-
>>>> tests/iothread.c            |  2 +-
>>>> tests/qht-bench.c           |  3 ++-
>>>> tests/rcutorture.c          |  3 ++-
>>>> tests/test-aio.c            |  2 +-
>>>> tests/test-rcu-list.c       |  3 ++-
>>>> ui/vnc-jobs.c               |  6 ++++--
>>>> util/compatfd.c             |  6 ++++--
>>>> util/oslib-posix.c          |  3 ++-
>>>> util/qemu-thread-posix.c    | 27 ++++++++++++++++++++-------
>>>> util/qemu-thread-win32.c    | 16 ++++++++++++----
>>>> util/rcu.c                  |  3 ++-
>>>> util/thread-pool.c          |  4 +++-
>>>> 26 files changed, 112 insertions(+), 51 deletions(-)
>>>>
>>>> diff --git a/cpus.c b/cpus.c
>>>> index 0ddeeefc14..25df03326b 100644
>>>> --- a/cpus.c
>>>> +++ b/cpus.c
>>>> @@ -1961,15 +1961,17 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>              snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>>                   cpu->cpu_index);
>>>>
>>>> +            /* TODO: let the callers handle the error instead of abort() here */
>>>>              qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
>>>> -                               cpu, QEMU_THREAD_JOINABLE);
>>>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>>>>
>>>>          } else {
>>>>              /* share a single thread for all cpus with TCG */
>>>>              snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "ALL CPUs/TCG");
>>>> +            /* TODO: let the callers handle the error instead of abort() here */
>>>>              qemu_thread_create(cpu->thread, thread_name,
>>>>                                 qemu_tcg_rr_cpu_thread_fn,
>>>> -                               cpu, QEMU_THREAD_JOINABLE);
>>>> +                               cpu, QEMU_THREAD_JOINABLE, &error_abort);
>>>>
>>>>              single_tcg_halt_cond = cpu->halt_cond;
>>>>              single_tcg_cpu_thread = cpu->thread;
>>> You add this TODO comment to 24 out of 37 calls.  Can you give your
>>> reasons for adding it to some calls, but not to others?
>> The calls that have a TODO are improved in the following patches; the calls without a TODO simply keep using &error_abort.
> Please mention that in the commit message.
ok.
>
>>> [...]
>>>> diff --git a/include/qemu/thread.h b/include/qemu/thread.h
>>>> index 55d83a907c..12291f4ccd 100644
>>>> --- a/include/qemu/thread.h
>>>> +++ b/include/qemu/thread.h
>>>> @@ -152,9 +152,9 @@ void qemu_event_reset(QemuEvent *ev);
>>>> void qemu_event_wait(QemuEvent *ev);
>>>> void qemu_event_destroy(QemuEvent *ev);
>>>>
>>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>>> +bool qemu_thread_create(QemuThread *thread, const char *name,
>>>>                          void *(*start_routine)(void *),
>>>> -                        void *arg, int mode);
>>>> +                        void *arg, int mode, Error **errp);
>>>> void *qemu_thread_join(QemuThread *thread);
>>>> void qemu_thread_get_self(QemuThread *thread);
>>>> bool qemu_thread_is_self(QemuThread *thread);
>>> [...]
>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>> index 865e476df5..39834b0551 100644
>>>> --- a/util/qemu-thread-posix.c
>>>> +++ b/util/qemu-thread-posix.c
>>>> @@ -15,6 +15,7 @@
>>>> #include "qemu/atomic.h"
>>>> #include "qemu/notify.h"
>>>> #include "qemu-thread-common.h"
>>>> +#include "qapi/error.h"
>>>>
>>>> static bool name_threads;
>>>>
>>>> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>>>>      return r;
>>>> }
>>>>
>>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>>> -                       void *(*start_routine)(void*),
>>>> -                       void *arg, int mode)
>>>> +/*
>>>> + * Return a boolean: true/false to indicate whether it succeeds.
>>>> + * If fails, propagate the error to Error **errp and set the errno.
>>>> + */
>>> Let's write something that can pass as a function contract:
>>>
>>>    /*
>>>     * Create a new thread with name @name
>>>     * The thread executes @start_routine() with argument @arg.
>>>     * The thread will be created in a detached state if @mode is
>>>     * QEMU_THREAD_DETACHED, and in a joinable state if it's
>>>     * QEMU_THREAD_JOINABLE.
>>>     * On success, return true.
>>>     * On failure, set @errno, store an error through @errp and return
>>>     * false.
>>>     */
>> Thanks so much for amending! :)
>>> Personally, I'd return negative errno instead of false, and dispense
>>> with setting errno.
>> Hmm, I think I replied to this in the last version, but for several reasons I sent v9 without waiting for your feedback. Sorry about that. Let me paste my two considerations here:
>> “- Actually, only one caller needs the errno: qemu_signalfd_compat() above.
> Yes.
>
>> - For the return value, I remember an email thread that discussed it: returning a bool (and letting the passed errp hold the error message) keeps us consistent with GLib.
> GLib doesn't discourage return types other than boolean.  It only asks
> that if you return boolean, then true should mean success and false
> should mean failure.  See
>
>      https://developer.gnome.org/glib/stable/glib-Error-Reporting.html
>
> under "Rules for use of GError", item "By convention, if you return a
> boolean value".
>
> The discussion you remember was about a convention we used to enforce in
> QEMU, namely to avoid returning boolean success, and return void
> instead.  That was basically a bad idea.
>
>> So I wonder whether we really need to change the bool (true/false) to an int (0/-errno) just for the sole function that needs the errno, qemu_signalfd_compat(). Besides, if we return -errno, each caller needs a local variable, as in "ret = qemu_thread_create()", to store the -errno.
> Well, you either assign the error code to errno just for that caller, or
> you return the error code just for that caller.  I'd do the latter
> because I consider it slightly simpler.  Compare
>
>   * On success, return true.
>   * On failure, set @errno, store an error through @errp and return
>   * false.
>
> to
>
>   * On success, return zero.
>   * On failure, store an error through @errp and return negative errno.
>
> where the second sentence describes just two instead of three actions.
>
> [...]
Ok, describing two actions rather than three is indeed simpler. But I am
still uncertain about one thing: for callers that do not need the errno
value, could we just check whether the return value is negative, without
storing the unused return value? I mean

In the caller:

{...
     if (qemu_thread_create() < 0) {// do some cleanup}
...}

instead of

{    int ret;
...
     ret = qemu_thread_create();
     if (ret < 0) { //do some cleanup }

...}

The first form saves quite a lot of code. :)

Have a nice day, thanks

Fei


* Re: [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize
  2019-01-07 17:31   ` Markus Armbruster
@ 2019-01-09 13:21     ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-09 13:21 UTC (permalink / raw)
  To: Markus Armbruster, Fei Li; +Cc: qemu-devel, shirley17fei, Gerd Hoffmann


On 2019/1/8 at 1:31 AM, Markus Armbruster wrote:
> Fei Li <fli@suse.com> writes:
>
>> Use the existing errp to propagate the error and do the
>> corresponding cleanup to replace the temporary &error_abort.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Gerd Hoffmann <kraxel@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>>   hw/usb/ccid-card-emulated.c | 15 +++++++++------
>>   1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
>> index f8ff7ff4a3..9245b4fcad 100644
>> --- a/hw/usb/ccid-card-emulated.c
>> +++ b/hw/usb/ccid-card-emulated.c
>> @@ -32,7 +32,6 @@
>>   #include "qemu/thread.h"
>>   #include "qemu/main-loop.h"
>>   #include "ccid.h"
>> -#include "qapi/error.h"
>>   
>>   #define DPRINTF(card, lvl, fmt, ...) \
>>   do {\
>> @@ -544,11 +543,15 @@ static void emulated_realize(CCIDCardState *base, Error **errp)
>>           error_setg(errp, "%s: failed to initialize vcard", TYPE_EMULATED_CCID);
>>           goto out2;
>>       }
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
>> -                       card, QEMU_THREAD_JOINABLE, &error_abort);
>> -    qemu_thread_create(&card->apdu_thread_id, "ccid/apdu", handle_apdu_thread,
>> -                       card, QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
>> +                            card, QEMU_THREAD_JOINABLE, errp)) {
>> +        goto out2;
>> +    }
>> +    if (!qemu_thread_create(&card->apdu_thread_id, "ccid/apdu",
>> +                            handle_apdu_thread, card,
>> +                            QEMU_THREAD_JOINABLE, errp)) {
>> +        goto out2;
> You need to stop and join the first thread.

Thanks for pointing this out!
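
A minimal sketch of the cleanup being asked for, assuming plain pthreads and
a hypothetical `quit` flag that tells the event thread to exit. How the real
ccid-card-emulated event thread is stopped is not shown in this thread, so
`sketch_card`, `sketch_realize` and the `fail_second` switch (which simulates
the second creation failing) are all assumptions made for illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_card {
    pthread_t event_thread;
    pthread_t apdu_thread;
    atomic_bool quit;           /* asks the event thread to exit */
};

static void *sketch_event_thread(void *arg)
{
    struct sketch_card *card = arg;

    while (!atomic_load(&card->quit)) {
        sched_yield();          /* stand-in for waiting on real events */
    }
    return NULL;
}

static void *sketch_apdu_thread(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Returns 0 on success, negative errno on failure.
 * On failure, no thread is left running.
 */
static int sketch_realize(struct sketch_card *card, bool fail_second)
{
    int err;

    atomic_init(&card->quit, false);

    err = pthread_create(&card->event_thread, NULL,
                         sketch_event_thread, card);
    if (err) {
        return -err;
    }

    err = fail_second ? EAGAIN
        : pthread_create(&card->apdu_thread, NULL, sketch_apdu_thread, card);
    if (err) {
        /* Undo the first step: stop the event thread, then join it. */
        atomic_store(&card->quit, true);
        pthread_join(card->event_thread, NULL);
        return -err;
    }
    return 0;
}

/* Normal teardown: stop and join both threads. */
static void sketch_unrealize(struct sketch_card *card)
{
    atomic_store(&card->quit, true);
    pthread_join(card->event_thread, NULL);
    pthread_join(card->apdu_thread, NULL);
}
```

The point is only the ordering in the failure path: the first thread has to
be told to stop and then joined before the function reports the error.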

Have a nice day
Fei
>
>> +    }
>>   
>>   out2:
>>       clean_event_notifier(card);


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-08 17:29       ` Markus Armbruster
@ 2019-01-09 14:01         ` Fei Li
  2019-01-09 15:24           ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-09 14:01 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Stefan Weil, qemu-devel, shirley17fei


On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
> fei <lifei1214@126.com> writes:
>
>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Fei Li <fli@suse.com> writes:
>>>
>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>> return when the QemuThread *thread failed to be created in either
>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>
>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>> ---
>>>> util/qemu-thread-posix.c | 3 +++
>>>> util/qemu-thread-win32.c | 2 +-
>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>> index 39834b0551..3548935dac 100644
>>>> --- a/util/qemu-thread-posix.c
>>>> +++ b/util/qemu-thread-posix.c
>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>      int err;
>>>>      void *ret;
>>>>
>>>> +    if (!thread->thread) {
>>>> +        return NULL;
>>>> +    }
>>> How can this happen?
>> I think I answered this earlier; please check the following link to see whether it helps:
>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
> Thanks for the pointer.  Unfortunately, I don't understand your
> explanation.  You also wrote there "I will remove this patch in next
> version"; looks like you've since changed your mind.
Hmm, this is left over from earlier history. The background is that I was
in a hurry to get those five Reviewed-by patches merged, including this
v9 16/16 patch but not the real qemu_thread_create() modification.
However, this patch actually fixes a segmentation fault that only appears
after we modify the qemu_thread_create() related functions, although it
got a Reviewed-by earlier. :) So, to avoid trouble, I wrote the
"remove..." sentence to separate it from those five Reviewed-by patches,
and planned to send only four patches. Later I was told that these five
patches were not urgent enough to have to catch QEMU v3.1, so I merged the
earlier five R-b patches into the later v8 and v9 for a better review.

Sorry for the trouble; I should explain this without involving too much
background.

Back to the point: in our current QEMU code, some cleanup paths use a loop
to join() all of the threads when the caller fails. This is not a problem
until the qemu_thread_create() modification is applied. For example, when
compress_threads_save_setup() fails while trying to create the last
do_data_compress thread, a segmentation fault occurs when join() is
called, because that thread was never actually created (sadly, there is
no existing condition that filters out the thread that failed to be
created).
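
A minimal sketch of that failure mode and of one way to avoid it in the
caller, using plain pthreads. Note that this is not what the patch does: the
patch makes qemu_thread_join() itself return early, while the sketch below
tracks creation per slot. The names `sketch_worker`, `sketch_setup`,
`sketch_cleanup` and the `fail_at` switch (which simulates a creation
failure) are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NTHREADS 4

struct sketch_worker {
    pthread_t thread;
    bool created;       /* set only after pthread_create() succeeded */
};

static void *sketch_worker_fn(void *arg)
{
    return arg;
}

/*
 * Join only the workers that were really created.
 * Returns the number of threads joined.
 */
static int sketch_cleanup(struct sketch_worker *w, int n)
{
    int joined = 0;

    for (int i = 0; i < n; i++) {
        if (!w[i].created) {
            /* Joining this slot would use a pthread_t that was never set. */
            continue;
        }
        pthread_join(w[i].thread, NULL);
        w[i].created = false;
        joined++;
    }
    return joined;
}

/*
 * Create @n workers; @fail_at simulates a failure at that index
 * (pass -1 for no failure). Returns 0, or negative errno after cleanup.
 */
static int sketch_setup(struct sketch_worker *w, int n, int fail_at)
{
    for (int i = 0; i < n; i++) {
        w[i].created = false;
    }
    for (int i = 0; i < n; i++) {
        int err = (i == fail_at) ? EAGAIN
            : pthread_create(&w[i].thread, NULL, sketch_worker_fn, NULL);

        if (err) {
            sketch_cleanup(w, n);   /* loops over all n slots, like the
                                       cleanup paths described above */
            return -err;
        }
        w[i].created = true;
    }
    return 0;
}
```

Without the `created` check, the cleanup loop would call pthread_join() on
the slot whose creation failed, which is the situation described above.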

Hope the above makes it clear. :)

Have a nice day
Fei
>
> What exactly breaks if we omit this patch?  Assuming something does
> break: imagine we did omit this patch, then forgot we ever saw it, and
> now you've discovered the breakage.  Write us the bug report, complete
> with reproducer.
>
> [...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2019-01-09 13:19         ` Fei Li
@ 2019-01-09 14:36           ` Markus Armbruster
  2019-01-09 14:42             ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-09 14:36 UTC (permalink / raw)
  To: Fei Li; +Cc: Paolo Bonzini, qemu-devel, shirley17fei

Fei Li <lifei1214@126.com> writes:

> On 2019/1/9 at 1:07 AM, Markus Armbruster wrote:
>> fei <lifei1214@126.com> writes:
>>
>>>> On Jan 8, 2019, at 01:18, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> Fei Li <fli@suse.com> writes:
[...]
>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>> index 865e476df5..39834b0551 100644
>>>>> --- a/util/qemu-thread-posix.c
>>>>> +++ b/util/qemu-thread-posix.c
>>>>> @@ -15,6 +15,7 @@
>>>>> #include "qemu/atomic.h"
>>>>> #include "qemu/notify.h"
>>>>> #include "qemu-thread-common.h"
>>>>> +#include "qapi/error.h"
>>>>>
>>>>> static bool name_threads;
>>>>>
>>>>> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>>>>>      return r;
>>>>> }
>>>>>
>>>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>>>> -                       void *(*start_routine)(void*),
>>>>> -                       void *arg, int mode)
>>>>> +/*
>>>>> + * Return a boolean: true/false to indicate whether it succeeds.
>>>>> + * If fails, propagate the error to Error **errp and set the errno.
>>>>> + */
>>>> Let's write something that can pass as a function contract:
>>>>
>>>>    /*
>>>>     * Create a new thread with name @name
>>>>     * The thread executes @start_routine() with argument @arg.
>>>>     * The thread will be created in a detached state if @mode is
>>>>     * QEMU_THREAD_DETACHED, and in a joinable state if it's
>>>>     * QEMU_THREAD_JOINABLE.
>>>>     * On success, return true.
>>>>     * On failure, set @errno, store an error through @errp and return
>>>>     * false.
>>>>     */
>>> Thanks so much for amending! :)
>>>> Personally, I'd return negative errno instead of false, and dispense
>>>> with setting errno.
>>> Hmm, I think I replied to this in the last version, but for several reasons I sent v9 without waiting for your feedback. Sorry about that. Let me paste my two considerations here:
>>> “- Actually, only one caller needs the errno: qemu_signalfd_compat() above.
>> Yes.
>>
>>> - For the return value, I remember an email thread that discussed it: returning a bool (and letting the passed errp hold the error message) keeps us consistent with GLib.
>> GLib doesn't discourage return types other than boolean.  It only asks
>> that if you return boolean, then true should mean success and false
>> should mean failure.  See
>>
>>      https://developer.gnome.org/glib/stable/glib-Error-Reporting.html
>>
>> under "Rules for use of GError", item "By convention, if you return a
>> boolean value".
>>
>> The discussion you remember was about a convention we used to enforce in
>> QEMU, namely to avoid returning boolean success, and return void
>> instead.  That was basically a bad idea.
>>
>>> So I wonder whether we really need to change the bool (true/false) to an int (0/-errno) just for the sole function that needs the errno, qemu_signalfd_compat(). Besides, if we return -errno, each caller needs a local variable, as in "ret = qemu_thread_create()", to store the -errno.
>> Well, you either assign the error code to errno just for that caller, or
>> you return the error code just for that caller.  I'd do the latter
>> because I consider it slightly simpler.  Compare
>>
>>   * On success, return true.
>>   * On failure, set @errno, store an error through @errp and return
>>   * false.
>>
>> to
>>
>>   * On success, return zero.
>>   * On failure, store an error through @errp and return negative errno.
>>
>> where the second sentence describes just two instead of three actions.
>>
>> [...]
> Ok, describing two actions rather than three is indeed simpler. But I am
> still uncertain about one thing: for callers that do not need the errno
> value, could we just check whether the return value is negative, without
> storing the unused return value? I mean
>
> In the caller:
>
> {...
>     if (qemu_thread_create() < 0) {// do some cleanup}
> ...}

This is just fine when you handle all errors the same.

> instead of
>
> {    int ret;
> ...
>     ret = qemu_thread_create();
>     if (ret < 0) { //do some cleanup }
>
> ...}

I'd object to this one unless the value of @ret gets used elsewhere.
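
To make the two caller styles concrete, here is a minimal sketch in C.
Everything in it is hypothetical: fake_thread_create() merely stands in
for a qemu_thread_create() that returns zero on success and negative
errno on failure, and both callers are invented for illustration.

```c
#include <assert.h>   /* assert() is handy when exercising the sketch */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in: return 0 on success, negative errno on failure. */
static int fake_thread_create(bool simulate_failure)
{
    return simulate_failure ? -EAGAIN : 0;
}

/* Caller that handles all errors the same: no local variable needed. */
static bool start_worker(bool simulate_failure)
{
    if (fake_thread_create(simulate_failure) < 0) {
        /* cleanup would go here */
        return false;
    }
    return true;
}

/* Caller that needs the error code: keep it in a local and use it. */
static int start_signal_thread(bool simulate_failure)
{
    int ret = fake_thread_create(simulate_failure);

    if (ret < 0) {
        errno = -ret;   /* the value of ret is used, so caching it is fine */
        return -1;
    }
    return 0;
}
```

Only the second caller, which plays the role of qemu_signalfd_compat(),
has a reason to store the return value.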

> As the first one saves quite a lot of code. :)
>
> Have a nice day, thanks
>
> Fei


* Re: [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly
  2019-01-09 14:36           ` Markus Armbruster
@ 2019-01-09 14:42             ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-09 14:42 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Paolo Bonzini, qemu-devel, shirley17fei



> On Jan 9, 2019, at 22:36, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <lifei1214@126.com> writes:
> 
>>> On 2019/1/9 at 1:07 AM, Markus Armbruster wrote:
>>> fei <lifei1214@126.com> writes:
>>> 
>>>>> On Jan 8, 2019, at 01:18, Markus Armbruster <armbru@redhat.com> wrote:
>>>>> 
>>>>> Fei Li <fli@suse.com> writes:
> [...]
>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>> index 865e476df5..39834b0551 100644
>>>>>> --- a/util/qemu-thread-posix.c
>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>> @@ -15,6 +15,7 @@
>>>>>> #include "qemu/atomic.h"
>>>>>> #include "qemu/notify.h"
>>>>>> #include "qemu-thread-common.h"
>>>>>> +#include "qapi/error.h"
>>>>>> 
>>>>>> static bool name_threads;
>>>>>> 
>>>>>> @@ -500,9 +501,13 @@ static void *qemu_thread_start(void *args)
>>>>>>     return r;
>>>>>> }
>>>>>> 
>>>>>> -void qemu_thread_create(QemuThread *thread, const char *name,
>>>>>> -                       void *(*start_routine)(void*),
>>>>>> -                       void *arg, int mode)
>>>>>> +/*
>>>>>> + * Return a boolean: true/false to indicate whether it succeeds.
>>>>>> + * If fails, propagate the error to Error **errp and set the errno.
>>>>>> + */
>>>>> Let's write something that can pass as a function contract:
>>>>> 
>>>>>   /*
>>>>>    * Create a new thread with name @name.
>>>>>    * The thread executes @start_routine() with argument @arg.
>>>>>    * The thread will be created in a detached state if @mode is
>>>>>    * QEMU_THREAD_DETACHED, and in a joinable state if it's
>>>>>    * QEMU_THREAD_JOINABLE.
>>>>>    * On success, return true.
>>>>>    * On failure, set @errno, store an error through @errp and return
>>>>>    * false.
>>>>>    */
>>>> Thanks so much for amending! :)
>>>>> Personally, I'd return negative errno instead of false, and dispense
>>>>> with setting errno.
>>>> Emm, I think I replied to this for the last version, but for several reasons I did not wait for your feedback before sending v9. Sorry for that. I'd like to paste my two considerations here:
>>>> “- Actually only one caller needs the errno, that is the above qemu_signalfd_compat().
>>> Yes.
>>> 
>>>> - For the return value, I remember there was once an email thread discussing it: returning a bool (and letting the passed errp hold the error message) keeps consistency with GLib.
>>> GLib doesn't discourage return types other than boolean.  It only asks
>>> that if you return boolean, then true should mean success and false
>>> should mean failure.  See
>>> 
>>>     https://developer.gnome.org/glib/stable/glib-Error-Reporting.html
>>> 
>>> under "Rules for use of GError", item "By convention, if you return a
>>> boolean value".
>>> 
>>> The discussion you remember was about a convention we used to enforce in
>>> QEMU, namely to avoid returning boolean success, and return void
>>> instead.  That was basically a bad idea.
>>> 
>>>> So I am wondering whether we really need to change the bool (true/false) to int (0/-errno) just for that sole function, qemu_signalfd_compat(), which needs the errno. Besides, if we return -errno, each caller needs to add a local variable like "ret = qemu_thread_create()" to store the -errno.
>>> Well, you either assign the error code to errno just for that caller, or
>>> you return the error code just for that caller.  I'd do the latter
>>> because I consider it slightly simpler.  Compare
>>> 
>>>  * On success, return true.
>>>  * On failure, set @errno, store an error through @errp and return
>>>  * false.
>>> 
>>> to
>>> 
>>>  * On success, return zero.
>>>  * On failure, store an error through @errp and return negative errno.
>>> 
>>> where the second sentence describes just two instead of three actions.
>>> 
>>> [...]
>> Ok, describing two actions rather than three is indeed simpler. But I
>> still have one question: for those callers that do not need the errno
>> value, could we just check whether the return value is negative,
>> without caching the unused return value? I mean
>> 
>> In the caller:
>> 
>> {...
>>     if (qemu_thread_create() < 0) {// do some cleanup}
>> ...}
> 
> This is just fine when you handle all errors the same.
> 
>> instead of
>> 
>> {    int ret;
>> ...
>>     ret = qemu_thread_create();
>>     if (ret < 0) { //do some cleanup }
>> 
>> ...}
> 
> I'd object to this one unless the value of @ret gets used elsewhere.
Ok, thanks for the review :)

Have a nice day
Fei
> 
>> As the first one saves quite a lot of code. :)
>> 
>> Have a nice day, thanks
>> 
>> Fei


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-09 14:01         ` Fei Li
@ 2019-01-09 15:24           ` Markus Armbruster
  2019-01-09 15:57             ` fei
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-09 15:24 UTC (permalink / raw)
  To: Fei Li; +Cc: Stefan Weil, qemu-devel, shirley17fei

Fei Li <lifei1214@126.com> writes:

> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>> fei <lifei1214@126.com> writes:
>>
>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> Fei Li <fli@suse.com> writes:
>>>>
>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>> return when the QemuThread *thread failed to be created in either
>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>
>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>> ---
>>>>> util/qemu-thread-posix.c | 3 +++
>>>>> util/qemu-thread-win32.c | 2 +-
>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>> index 39834b0551..3548935dac 100644
>>>>> --- a/util/qemu-thread-posix.c
>>>>> +++ b/util/qemu-thread-posix.c
>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>      int err;
>>>>>      void *ret;
>>>>>
>>>>> +    if (!thread->thread) {
>>>>> +        return NULL;
>>>>> +    }
>>>> How can this happen?
>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>> Thanks for the pointer.  Unfortunately, I don't understand your
>> explanation.  You also wrote there "I will remove this patch in next
>> version"; looks like you've since changed your mind.
> Emm, this is left over from history. The background is that I was in a
> hurry to get those five Reviewed-by patches merged, including this
> v9 16/16 patch but not the real qemu_thread_create() modification.
> But this patch actually fixes the segmentation fault that appears
> after we modify the qemu_thread_create() related functions, although
> it got a Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the
> "remove..." sentence to separate it from those five Reviewed-by
> patches, and planned to send only four patches. But later I got a
> message that these five patches are not urgent enough to need to catch
> QEMU v3.1, so I merged the earlier five R-b patches into the later
> v8 & v9 to get a better review.
>
> Sorry for the trouble, I need to explain it without involving too much
> background..
>
> Back at the farm: in our current QEMU code, some cleanups use a loop
> to join() the total number of threads if the caller fails. This is not
> a problem until the qemu_thread_create() modification is applied.
> E.g. when compress_threads_save_setup() fails while trying to create
> the last do_data_compress thread, a segmentation fault will occur when
> join() is called (sadly there is no condition that filters out this
> unsuccessfully created thread), as this thread was never actually
> created.
>
> Hope the above makes it clear. :)

Alright, let's have a look at compress_threads_save_setup():

    static int compress_threads_save_setup(void)
    {
        int i, thread_count;

        if (!migrate_use_compression()) {
            return 0;
        }
        thread_count = migrate_compress_threads();
        compress_threads = g_new0(QemuThread, thread_count);
        comp_param = g_new0(CompressParam, thread_count);
        qemu_cond_init(&comp_done_cond);
        qemu_mutex_init(&comp_done_lock);
        for (i = 0; i < thread_count; i++) {
            comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
            if (!comp_param[i].originbuf) {
                goto exit;
            }

            if (deflateInit(&comp_param[i].stream,
                            migrate_compress_level()) != Z_OK) {
                g_free(comp_param[i].originbuf);
                goto exit;
            }

            /* comp_param[i].file is just used as a dummy buffer to save data,
             * set its ops to empty.
             */
            comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
            comp_param[i].done = true;
            comp_param[i].quit = false;
            qemu_mutex_init(&comp_param[i].mutex);
            qemu_cond_init(&comp_param[i].cond);
            qemu_thread_create(compress_threads + i, "compress",
                               do_data_compress, comp_param + i,
                               QEMU_THREAD_JOINABLE);
        }
        return 0;

    exit:
        compress_threads_save_cleanup();
        return -1;
    }

At label exit, we have @i threads, all fully initialized.  That's an
invariant.

compress_threads_save_cleanup() finds the threads to clean up by
checking comp_param[i].file:

    static void compress_threads_save_cleanup(void)
    {
        int i, thread_count;

        if (!migrate_use_compression() || !comp_param) {
            return;
        }

        thread_count = migrate_compress_threads();
        for (i = 0; i < thread_count; i++) {
            /*
             * we use it as a indicator which shows if the thread is
             * properly init'd or not
             */
--->        if (!comp_param[i].file) {
--->            break;
--->        }

            qemu_mutex_lock(&comp_param[i].mutex);
            comp_param[i].quit = true;
            qemu_cond_signal(&comp_param[i].cond);
            qemu_mutex_unlock(&comp_param[i].mutex);

            qemu_thread_join(compress_threads + i);
            qemu_mutex_destroy(&comp_param[i].mutex);
            qemu_cond_destroy(&comp_param[i].cond);
            deflateEnd(&comp_param[i].stream);
            g_free(comp_param[i].originbuf);
            qemu_fclose(comp_param[i].file);
            comp_param[i].file = NULL;
        }
        qemu_mutex_destroy(&comp_done_lock);
        qemu_cond_destroy(&comp_done_cond);
        g_free(compress_threads);
        g_free(comp_param);
        compress_threads = NULL;
        comp_param = NULL;
    }

Due to the invariant, a comp_param[i] with a null .file doesn't need
*any* cleanup.

To maintain the invariant, compress_threads_save_setup() carefully
cleans up any partial initializations itself before a goto exit.  Since
the code is arranged smartly, the only such cleanup is the
g_free(comp_param[i].originbuf) before the second goto exit.

Your PATCH 13 adds a third goto exit, but neglects to clean up partial
initializations.  Breaks the invariant.

I see two sane solutions:

1. compress_threads_save_setup() carefully cleans up partial
   initializations itself.  compress_threads_save_cleanup() copes only
   with fully initialized comp_param[i].  This is how things work before
   your series.

2. compress_threads_save_cleanup() copes with partially initialized
   comp_param[i], i.e. does the right thing for each goto exit in
   compress_threads_save_setup().  compress_threads_save_setup() doesn't
   clean up partial initializations.

Your PATCH 13 together with the fixup PATCH 16 does

3. A confusing mix of the two.

Don't.
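
Solution 1 can be sketched as follows.  This is a minimal, self-contained
illustration, not the actual migration code: Slot, slots_setup(),
slots_cleanup() and the @fail_at parameter are all hypothetical, and the
@ready flag plays the role of comp_param[i].file as the "fully
initialized" marker.  Thread creation is only simulated.

```c
#include <assert.h>   /* assert() is handy when exercising the sketch */
#include <stdbool.h>
#include <stdlib.h>

#define SLOT_COUNT 4

/* Hypothetical per-thread state. */
typedef struct {
    void *buf;
    bool ready;       /* set only once the slot is fully initialized */
} Slot;

static Slot slots[SLOT_COUNT];
static int live_bufs;             /* allocation counter, to check for leaks */

/* Copes only with fully initialized slots. */
static void slots_cleanup(void)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        if (!slots[i].ready) {
            break;
        }
        /* a real version would stop and join the thread here */
        free(slots[i].buf);
        live_bufs--;
        slots[i].buf = NULL;
        slots[i].ready = false;
    }
}

/* @fail_at: index at which the simulated thread creation fails, or -1. */
static int slots_setup(int fail_at)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        slots[i].buf = malloc(64);
        if (!slots[i].buf) {
            goto exit;            /* nothing partial to undo yet */
        }
        live_bufs++;

        if (i == fail_at) {
            /* Undo this slot's partial initialization before bailing out. */
            free(slots[i].buf);
            live_bufs--;
            slots[i].buf = NULL;
            goto exit;
        }
        slots[i].ready = true;
    }
    return 0;

exit:
    slots_cleanup();
    return -1;
}
```

In the real function the marker (.file) is assigned before the thread
is created, so the failure path there would also have to undo the mutex,
the condition variable, the zlib stream and the file before the goto.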

> Have a nice day
> Fei
>>
>> What exactly breaks if we omit this patch?  Assuming something does
>> break: imagine we did omit this patch, then forgot we ever saw it, and
>> now you've discovered the breakage.  Write us the bug report, complete
>> with reproducer.
>>
>> [...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration
  2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration Fei Li
  2019-01-03 12:35   ` Dr. David Alan Gilbert
@ 2019-01-09 15:26   ` Markus Armbruster
  2019-01-09 16:01     ` fei
  1 sibling, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-09 15:26 UTC (permalink / raw)
  To: Fei Li
  Cc: qemu-devel, shirley17fei, lifei1214, Peter Xu, Dr . David Alan Gilbert

Fei Li <fli@suse.com> writes:

> Update qemu_thread_create()'s callers by
> - setting an error on qemu_thread_create() failure for callers that
>   set an error on failure;
> - reporting the error and returning failure for callers that return
>   an error code on failure;
> - reporting the error and setting some state for callers that just
>   report errors and choose not to continue on.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
[...]
> diff --git a/migration/ram.c b/migration/ram.c
> index eed1daf302..1e24a78eaa 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
[...]
> @@ -3625,6 +3637,7 @@ static void compress_threads_load_cleanup(void)
>  static int compress_threads_load_setup(QEMUFile *f)
>  {
>      int i, thread_count;
> +    Error *local_err = NULL;
>  
>      if (!migrate_use_compression()) {
>          return 0;
> @@ -3646,10 +3659,13 @@ static int compress_threads_load_setup(QEMUFile *f)
>          qemu_cond_init(&decomp_param[i].cond);
>          decomp_param[i].done = true;
>          decomp_param[i].quit = false;
> -        /* TODO: let the further caller handle the error instead of abort() */
> -        qemu_thread_create(decompress_threads + i, "decompress",
> -                           do_data_decompress, decomp_param + i,
> -                           QEMU_THREAD_JOINABLE, &error_abort);
> +        if (!qemu_thread_create(decompress_threads + i, "decompress",
> +                                do_data_decompress, decomp_param + i,
> +                                QEMU_THREAD_JOINABLE, &local_err)) {
> +            error_reportf_err(local_err,
> +                              "failed to create do_data_decompress: ");
> +            goto exit;

Broken error handling, see my review of PATCH 16.

> +        }
>      }
>      return 0;
>  exit:
[...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-09 15:24           ` Markus Armbruster
@ 2019-01-09 15:57             ` fei
  2019-01-10  9:20               ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: fei @ 2019-01-09 15:57 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Stefan Weil, qemu-devel, shirley17fei



> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <lifei1214@126.com> writes:
> 
>>> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>>> fei <lifei1214@126.com> writes:
>>> 
>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>> 
>>>>> Fei Li <fli@suse.com> writes:
>>>>> 
>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>> 
>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>> ---
>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>> 
>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>> index 39834b0551..3548935dac 100644
>>>>>> --- a/util/qemu-thread-posix.c
>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>     int err;
>>>>>>     void *ret;
>>>>>> 
>>>>>> +    if (!thread->thread) {
>>>>>> +        return NULL;
>>>>>> +    }
>>>>> How can this happen?
>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>> explanation.  You also wrote there "I will remove this patch in next
>>> version"; looks like you've since changed your mind.
>> Emm, this is left over from history. The background is that I was in a
>> hurry to get those five Reviewed-by patches merged, including this
>> v9 16/16 patch but not the real qemu_thread_create() modification.
>> But this patch actually fixes the segmentation fault that appears
>> after we modify the qemu_thread_create() related functions, although
>> it got a Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the
>> "remove..." sentence to separate it from those five Reviewed-by
>> patches, and planned to send only four patches. But later I got a
>> message that these five patches are not urgent enough to need to catch
>> QEMU v3.1, so I merged the earlier five R-b patches into the later
>> v8 & v9 to get a better review.
>> 
>> Sorry for the trouble, I need to explain it without involving too much
>> background..
>> 
>> Back at the farm: in our current QEMU code, some cleanups use a loop
>> to join() the total number of threads if the caller fails. This is not
>> a problem until the qemu_thread_create() modification is applied.
>> E.g. when compress_threads_save_setup() fails while trying to create
>> the last do_data_compress thread, a segmentation fault will occur when
>> join() is called (sadly there is no condition that filters out this
>> unsuccessfully created thread), as this thread was never actually
>> created.
>> 
>> Hope the above makes it clear. :)
> 
> Alright, let's have a look at compress_threads_save_setup():
> 
>    static int compress_threads_save_setup(void)
>    {
>        int i, thread_count;
> 
>        if (!migrate_use_compression()) {
>            return 0;
>        }
>        thread_count = migrate_compress_threads();
>        compress_threads = g_new0(QemuThread, thread_count);
>        comp_param = g_new0(CompressParam, thread_count);
>        qemu_cond_init(&comp_done_cond);
>        qemu_mutex_init(&comp_done_lock);
>        for (i = 0; i < thread_count; i++) {
>            comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>            if (!comp_param[i].originbuf) {
>                goto exit;
>            }
> 
>            if (deflateInit(&comp_param[i].stream,
>                            migrate_compress_level()) != Z_OK) {
>                g_free(comp_param[i].originbuf);
>                goto exit;
>            }
> 
>            /* comp_param[i].file is just used as a dummy buffer to save data,
>             * set its ops to empty.
>             */
>            comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>            comp_param[i].done = true;
>            comp_param[i].quit = false;
>            qemu_mutex_init(&comp_param[i].mutex);
>            qemu_cond_init(&comp_param[i].cond);
>            qemu_thread_create(compress_threads + i, "compress",
>                               do_data_compress, comp_param + i,
>                               QEMU_THREAD_JOINABLE);
>        }
>        return 0;
> 
>    exit:
>        compress_threads_save_cleanup();
>        return -1;
>    }
> 
> At label exit, we have @i threads, all fully initialized.  That's an
> invariant.
> 
> compress_threads_save_cleanup() finds the threads to clean up by
> checking comp_param[i].file:
> 
>    static void compress_threads_save_cleanup(void)
>    {
>        int i, thread_count;
> 
>        if (!migrate_use_compression() || !comp_param) {
>            return;
>        }
> 
>        thread_count = migrate_compress_threads();
>        for (i = 0; i < thread_count; i++) {
>            /*
>             * we use it as a indicator which shows if the thread is
>             * properly init'd or not
>             */
> --->        if (!comp_param[i].file) {
> --->            break;
> --->        }
> 
>            qemu_mutex_lock(&comp_param[i].mutex);
>            comp_param[i].quit = true;
>            qemu_cond_signal(&comp_param[i].cond);
>            qemu_mutex_unlock(&comp_param[i].mutex);
> 
>            qemu_thread_join(compress_threads + i);
>            qemu_mutex_destroy(&comp_param[i].mutex);
>            qemu_cond_destroy(&comp_param[i].cond);
>            deflateEnd(&comp_param[i].stream);
>            g_free(comp_param[i].originbuf);
>            qemu_fclose(comp_param[i].file);
>            comp_param[i].file = NULL;
>        }
>        qemu_mutex_destroy(&comp_done_lock);
>        qemu_cond_destroy(&comp_done_cond);
>        g_free(compress_threads);
>        g_free(comp_param);
>        compress_threads = NULL;
>        comp_param = NULL;
>    }
> 
> Due to the invariant, a comp_param[i] with a null .file doesn't need
> *any* cleanup.
> 
> To maintain the invariant, compress_threads_save_setup() carefully
> cleans up any partial initializations itself before a goto exit.  Since
> the code is arranged smartly, the only such cleanup is the
> g_free(comp_param[i].originbuf) before the second goto exit.
> 
> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
> initializations.  Breaks the invariant.
> 
> I see two sane solutions:
> 
> 1. compress_threads_save_setup() carefully cleans up partial
>   initializations itself.  compress_threads_save_cleanup() copes only
>   with fully initialized comp_param[i].  This is how things work before
>   your series.
> 
> 2. compress_threads_save_cleanup() copes with partially initialized
>   comp_param[i], i.e. does the right thing for each goto exit in
>   compress_threads_save_setup().  compress_threads_save_setup() doesn't
>   clean up partial initializations.
> 
> Your PATCH 13 together with the fixup PATCH 16 does
> 
> 3. A confusing mix of the two.
> 
> Don't.
Thanks for the detailed analysis! :)
Emm.. Actually I had thought of doing the cleanup in the setup() function for the third 'goto exit' [1], which is a partial initialization.
But because [1] below is too long and does not look neat (I notice that most per-thread cleanups are in the xxx_cleanup() function), I turned to modifying the join() function instead.
Is the long [1] acceptable when the third 'goto exit' is taken, or is there a better way to do the cleanup?

[1]
    qemu_mutex_lock(&comp_param[i].mutex);
    comp_param[i].quit = true;
    qemu_cond_signal(&comp_param[i].cond);
    qemu_mutex_unlock(&comp_param[i].mutex);

    qemu_mutex_destroy(&comp_param[i].mutex);
    qemu_cond_destroy(&comp_param[i].cond);
    deflateEnd(&comp_param[i].stream);
    g_free(comp_param[i].originbuf);
    qemu_fclose(comp_param[i].file);
    comp_param[i].file = NULL;

Have a nice day, thanks
Fei
> 
>> Have a nice day
>> Fei
>>> 
>>> What exactly breaks if we omit this patch?  Assuming something does
>>> break: imagine we did omit this patch, then forgot we ever saw it, and
>>> now you've discovered the breakage.  Write us the bug report, complete
>>> with reproducer.
>>> 
>>> [...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration
  2019-01-09 15:26   ` Markus Armbruster
@ 2019-01-09 16:01     ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-09 16:01 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Fei Li, qemu-devel, shirley17fei, Peter Xu, Dr . David Alan Gilbert



> On Jan 9, 2019, at 23:26, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> Update qemu_thread_create()'s callers by
>> - setting an error on qemu_thread_create() failure for callers that
>>  set an error on failure;
>> - reporting the error and returning failure for callers that return
>>  an error code on failure;
>> - reporting the error and setting some state for callers that just
>>  report errors and choose not to continue on.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Cc: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
> [...]
>> diff --git a/migration/ram.c b/migration/ram.c
>> index eed1daf302..1e24a78eaa 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
> [...]
>> @@ -3625,6 +3637,7 @@ static void compress_threads_load_cleanup(void)
>> static int compress_threads_load_setup(QEMUFile *f)
>> {
>>     int i, thread_count;
>> +    Error *local_err = NULL;
>> 
>>     if (!migrate_use_compression()) {
>>         return 0;
>> @@ -3646,10 +3659,13 @@ static int compress_threads_load_setup(QEMUFile *f)
>>         qemu_cond_init(&decomp_param[i].cond);
>>         decomp_param[i].done = true;
>>         decomp_param[i].quit = false;
>> -        /* TODO: let the further caller handle the error instead of abort() */
>> -        qemu_thread_create(decompress_threads + i, "decompress",
>> -                           do_data_decompress, decomp_param + i,
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(decompress_threads + i, "decompress",
>> +                                do_data_decompress, decomp_param + i,
>> +                                QEMU_THREAD_JOINABLE, &local_err)) {
>> +            error_reportf_err(local_err,
>> +                              "failed to create do_data_decompress: ");
>> +            goto exit;
> 
> Broken error handling, see my review of PATCH 16.
Yep, it seems both compress_threads_save_setup() and compress_threads_load_setup() have this problem.

> 
>> +        }
>>     }
>>     return 0;
>> exit:
> [...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages
  2019-01-07 18:13   ` Markus Armbruster
@ 2019-01-09 16:13     ` fei
  0 siblings, 0 replies; 74+ messages in thread
From: fei @ 2019-01-09 16:13 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, qemu-devel, shirley17fei



> On Jan 8, 2019, at 02:13, Markus Armbruster <armbru@redhat.com> wrote:
> 
> Fei Li <fli@suse.com> writes:
> 
>> Supplement the error handling for touch_all_pages: add an Error
>> parameter for it to propagate the error to its caller to do the
>> handling in case it fails.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>> util/oslib-posix.c | 25 ++++++++++++++++---------
>> 1 file changed, 16 insertions(+), 9 deletions(-)
>> 
>> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
>> index 251e2f1aea..afc1d99093 100644
>> --- a/util/oslib-posix.c
>> +++ b/util/oslib-posix.c
>> @@ -431,15 +431,17 @@ static inline int get_memset_num_threads(int smp_cpus)
>> }
>> 
>> static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
>> -                            int smp_cpus)
>> +                            int smp_cpus, Error **errp)
>> {
>>     size_t numpages_per_thread;
>>     size_t size_per_thread;
>>     char *addr = area;
>>     int i = 0;
>> +    int started_thread = 0;
>> 
>>     memset_thread_failed = false;
>>     memset_num_threads = get_memset_num_threads(smp_cpus);
>> +    started_thread = memset_num_threads;
>>     memset_thread = g_new0(MemsetThread, memset_num_threads);
>>     numpages_per_thread = (numpages / memset_num_threads);
>>     size_per_thread = (hpagesize * numpages_per_thread);
>> @@ -448,14 +450,18 @@ static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
>>         memset_thread[i].numpages = (i == (memset_num_threads - 1)) ?
>>                                     numpages : numpages_per_thread;
>>         memset_thread[i].hpagesize = hpagesize;
>> -        /* TODO: let the callers handle the error instead of abort() here */
>> -        qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
>> -                           do_touch_pages, &memset_thread[i],
>> -                           QEMU_THREAD_JOINABLE, &error_abort);
>> +        if (!qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
>> +                                do_touch_pages, &memset_thread[i],
>> +                                QEMU_THREAD_JOINABLE, errp)) {
>> +            memset_thread_failed = true;
>> +            started_thread = i;
>> +            goto out;
> 
> break rather than goto, please.
Ok
> 
>> +        }
>>         addr += size_per_thread;
>>         numpages -= numpages_per_thread;
>>     }
>> -    for (i = 0; i < memset_num_threads; i++) {
>> +out:
>> +    for (i = 0; i < started_thread; i++) {
>>         qemu_thread_join(&memset_thread[i].pgthread);
>>     }
> 
> I don't like how @started_thread is computed.  The name suggests it's
> the number of threads started so far.  That's the case when you
> initialize it to zero.  But then you immediately set it to
> memset_num_threads.  It again becomes the case only when you break the loop
> on error, or when you complete it successfully.
> 
> There's no need for @started_thread, since the number of threads created
> is readily available as @i:
> 
>       memset_num_threads = i;
Thanks for this wonderful suggestion! This helps a lot! ;)
>       for (i = 0; i < memset_num_threads; i++) {
>           qemu_thread_join(&memset_thread[i].pgthread);
>       }
> 
> Rest of the function:
> 
>>     g_free(memset_thread);
>       memset_thread = NULL;
> 
>       return memset_thread_failed;
>   }
> 
> If do_touch_pages() sets memset_thread_failed, we return false without
> setting an error.  I believe you should
> 
>       if (memset_thread_failed) {
>           error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
>               "pages available to allocate guest RAM");
>           return false;
>       }
>       return true;
> 
Ok
> here, and ...
> 
>> @@ -471,6 +477,7 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
>>     struct sigaction act, oldact;
>>     size_t hpagesize = qemu_fd_getpagesize(fd);
>>     size_t numpages = DIV_ROUND_UP(memory, hpagesize);
>> +    Error *local_err = NULL;
>> 
>>     memset(&act, 0, sizeof(act));
>>     act.sa_handler = &sigbus_handler;
>> @@ -484,9 +491,9 @@ void os_mem_prealloc(int fd, char *area, size_t memory, int smp_cpus,
>>     }
>> 
>>     /* touch pages simultaneously */
>> -    if (touch_all_pages(area, hpagesize, numpages, smp_cpus)) {
>> -        error_setg(errp, "os_mem_prealloc: Insufficient free host memory "
>> -            "pages available to allocate guest RAM");
>> +    if (touch_all_pages(area, hpagesize, numpages, smp_cpus, &local_err)) {
>> +        error_propagate_prepend(errp, local_err, "os_mem_prealloc: Insufficient"
>> +            " free host memory pages available to allocate guest RAM: ");
>>     }
> 
> ... not mess with the error message here, i.e.
> 
>       touch_all_pages(area, hpagesize, numpages, smp_cpus, errp);
Ok, will amend this in the next version. 

Have a nice day, thanks again
Fei
> 
>> 
>>     ret = sigaction(SIGBUS, &oldact, NULL);
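
Note: the suggestion quoted above (take the number of threads actually
created from the loop index, then join exactly that many) can be
sketched in isolation as follows. This is a minimal sketch, not the QEMU
code: fake_thread_create(), fake_thread_join(), fail_at and joined are
hypothetical stand-ins that only count calls, so the pattern can be
checked without real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not QEMU APIs: they only record what happened. */
static int fail_at = -1;   /* index at which fake creation fails, -1 = never */
static int joined;         /* number of fake joins performed */

static bool fake_thread_create(int idx)
{
    return idx != fail_at;
}

static void fake_thread_join(int idx)
{
    (void)idx;
    joined++;
}

/*
 * Create up to @want_threads threads and join exactly the ones that
 * were created.  Returns true on success, false if a creation failed.
 */
static bool touch_all_pages_sketch(int want_threads, int fail_index)
{
    bool ok = true;
    int i, created;

    assert(want_threads >= 0);
    fail_at = fail_index;
    joined = 0;

    for (i = 0; i < want_threads; i++) {
        if (!fake_thread_create(i)) {
            ok = false;
            break;          /* slot i was never created */
        }
    }
    created = i;            /* no separate "started" counter needed */

    for (i = 0; i < created; i++) {
        fake_thread_join(i);
    }
    return ok;
}
```

With four requested threads and a failure at index 2, only indexes 0
and 1 are joined, so no join is ever attempted on a thread that does
not exist.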

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-09 15:57             ` fei
@ 2019-01-10  9:20               ` Markus Armbruster
  2019-01-10 13:24                 ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-10  9:20 UTC (permalink / raw)
  To: fei; +Cc: Markus Armbruster, Stefan Weil, qemu-devel, shirley17fei

fei <lifei1214@126.com> writes:

>> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Fei Li <lifei1214@126.com> writes:
>> 
>>>> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>>>> fei <lifei1214@126.com> writes:
>>>> 
>>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>> 
>>>>>> Fei Li <fli@suse.com> writes:
>>>>>> 
>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>> 
>>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>>> ---
>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>> 
>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>     int err;
>>>>>>>     void *ret;
>>>>>>> 
>>>>>>> +    if (!thread->thread) {
>>>>>>> +        return NULL;
>>>>>>> +    }
>>>>>> How can this happen?
>>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>> explanation.  You also wrote there "I will remove this patch in next
>>>> version"; looks like you've since changed your mind.
>>> Emm, issues left over from history.. The background is that I was in a
>>> hurry to get those five Reviewed-by patches merged, including this v9
>>> 16/16 patch but not the real qemu_thread_create() modification. But this
>>> patch actually fixes a segmentation fault that only appears after we
>>> modify the qemu_thread_create() related functions, although it got its
>>> Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the "remove..."
>>> sentence to separate it from those 5 Reviewed-by patches, and was
>>> planning to send only four patches. But later I got a message that these
>>> five patches are not urgent enough to catch qemu v3.1, so I folded the
>>> earlier 5 R-b patches into the later v8 & v9 to get a better review.
>>> 
>>> Sorry for the trouble, I need to explain it without involving too much
>>> background..
>>> 
>>> Back at the farm: in our current qemu code, some cleanups use a loop to
>>> join() the total number of threads if the caller fails. This is not a
>>> problem until the qemu_thread_create() modification is applied. E.g.
>>> when compress_threads_save_setup() fails while trying to create the last
>>> do_data_compress thread, a segmentation fault will occur when join() is
>>> called on it (sadly there is no condition available to filter out this
>>> unsuccessfully created thread), as this thread was never actually
>>> created.
>>> 
>>> Hope the above makes it clear. :)
>> 
>> Alright, let's have a look at compress_threads_save_setup():
>> 
>>    static int compress_threads_save_setup(void)
>>    {
>>        int i, thread_count;
>> 
>>        if (!migrate_use_compression()) {
>>            return 0;
>>        }
>>        thread_count = migrate_compress_threads();
>>        compress_threads = g_new0(QemuThread, thread_count);
>>        comp_param = g_new0(CompressParam, thread_count);
>>        qemu_cond_init(&comp_done_cond);
>>        qemu_mutex_init(&comp_done_lock);
>>        for (i = 0; i < thread_count; i++) {
>>            comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>            if (!comp_param[i].originbuf) {
>>                goto exit;
>>            }
>> 
>>            if (deflateInit(&comp_param[i].stream,
>>                            migrate_compress_level()) != Z_OK) {
>>                g_free(comp_param[i].originbuf);
>>                goto exit;
>>            }
>> 
>>            /* comp_param[i].file is just used as a dummy buffer to save data,
>>             * set its ops to empty.
>>             */
>>            comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>            comp_param[i].done = true;
>>            comp_param[i].quit = false;
>>            qemu_mutex_init(&comp_param[i].mutex);
>>            qemu_cond_init(&comp_param[i].cond);
>>            qemu_thread_create(compress_threads + i, "compress",
>>                               do_data_compress, comp_param + i,
>>                               QEMU_THREAD_JOINABLE);
>>        }
>>        return 0;
>> 
>>    exit:
>>        compress_threads_save_cleanup();
>>        return -1;
>>    }
>> 
>> At label exit, we have @i threads, all fully initialized.  That's an
>> invariant.
>> 
>> compress_threads_save_cleanup() finds the threads to clean up by
>> checking comp_param[i].file:
>> 
>>    static void compress_threads_save_cleanup(void)
>>    {
>>        int i, thread_count;
>> 
>>        if (!migrate_use_compression() || !comp_param) {
>>            return;
>>        }
>> 
>>        thread_count = migrate_compress_threads();
>>        for (i = 0; i < thread_count; i++) {
>>            /*
>>             * we use it as a indicator which shows if the thread is
>>             * properly init'd or not
>>             */
>> --->        if (!comp_param[i].file) {
>> --->            break;
>> --->        }
>> 
>>            qemu_mutex_lock(&comp_param[i].mutex);
>>            comp_param[i].quit = true;
>>            qemu_cond_signal(&comp_param[i].cond);
>>            qemu_mutex_unlock(&comp_param[i].mutex);
>> 
>>            qemu_thread_join(compress_threads + i);
>>            qemu_mutex_destroy(&comp_param[i].mutex);
>>            qemu_cond_destroy(&comp_param[i].cond);
>>            deflateEnd(&comp_param[i].stream);
>>            g_free(comp_param[i].originbuf);
>>            qemu_fclose(comp_param[i].file);
>>            comp_param[i].file = NULL;
>>        }
>>        qemu_mutex_destroy(&comp_done_lock);
>>        qemu_cond_destroy(&comp_done_cond);
>>        g_free(compress_threads);
>>        g_free(comp_param);
>>        compress_threads = NULL;
>>        comp_param = NULL;
>>    }
>> 
>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>> *any* cleanup.
>> 
>> To maintain the invariant, compress_threads_save_setup() carefully
>> cleans up any partial initializations itself before a goto exit.  Since
>> the code is arranged smartly, the only such cleanup is the
>> g_free(comp_param[i].originbuf) before the second goto exit.
>> 
>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>> initializations.  Breaks the invariant.
>> 
>> I see two sane solutions:
>> 
>> 1. compress_threads_save_setup() carefully cleans up partial
>>   initializations itself.  compress_threads_save_cleanup() copes only
>>   with fully initialized comp_param[i].  This is how things work before
>>   your series.
>> 
>> 2. compress_threads_save_cleanup() copes with partially initialized
>>   comp_param[i], i.e. does the right thing for each goto exit in
>>   compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>   clean up partial initializations.
>> 
>> Your PATCH 13 together with the fixup PATCH 16 does
>> 
>> 3. A confusing mix of the two.
>> 
>> Don't.
> Thanks for the detailed analysis! :)
> Emm.. Actually I did think about doing the cleanup in the setup() function for the third ‘goto exit’ [1], which is a partial initialization.
> But because [1] below is long and does not look neat (I notice that most per-thread cleanups live in the xxx_cleanup() function), I turned to modifying the join() function instead.
> Is the long [1] acceptable for the third ‘goto exit’, or is there a better way to do the cleanup?
>
> [1]
> qemu_mutex_lock(&comp_param[i].mutex);
>            comp_param[i].quit = true;
>            qemu_cond_signal(&comp_param[i].cond);
>            qemu_mutex_unlock(&comp_param[i].mutex);
>
> qemu_mutex_destroy(&comp_param[i].mutex);
>            qemu_cond_destroy(&comp_param[i].cond);
>            deflateEnd(&comp_param[i].stream);
>            g_free(comp_param[i].originbuf);
>            qemu_fclose(comp_param[i].file);
>            comp_param[i].file = NULL;

Have you considered creating the thread earlier, e.g. right after
initializing compression with deflateInit()?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-10  9:20               ` Markus Armbruster
@ 2019-01-10 13:24                 ` Fei Li
  2019-01-10 16:06                   ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-10 13:24 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Stefan Weil, qemu-devel, shirley17fei


On 2019/1/10 at 5:20 PM, Markus Armbruster wrote:
> fei <lifei1214@126.com> writes:
>
>>> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Fei Li <lifei1214@126.com> writes:
>>>
>>>>> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>>>>> fei <lifei1214@126.com> writes:
>>>>>
>>>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>>>
>>>>>>> Fei Li <fli@suse.com> writes:
>>>>>>>
>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>
>>>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>>>> ---
>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>      int err;
>>>>>>>>      void *ret;
>>>>>>>>
>>>>>>>> +    if (!thread->thread) {
>>>>>>>> +        return NULL;
>>>>>>>> +    }
>>>>>>> How can this happen?
>>>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>>> explanation.  You also wrote there "I will remove this patch in next
>>>>> version"; looks like you've since changed your mind.
>>>> Emm, issues left over from history.. The background is that I was in a
>>>> hurry to get those five Reviewed-by patches merged, including this v9
>>>> 16/16 patch but not the real qemu_thread_create() modification. But this
>>>> patch actually fixes a segmentation fault that only appears after we
>>>> modify the qemu_thread_create() related functions, although it got its
>>>> Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the "remove..."
>>>> sentence to separate it from those 5 Reviewed-by patches, and was
>>>> planning to send only four patches. But later I got a message that these
>>>> five patches are not urgent enough to catch qemu v3.1, so I folded the
>>>> earlier 5 R-b patches into the later v8 & v9 to get a better review.
>>>>
>>>> Sorry for the trouble, I need to explain it without involving too much
>>>> background..
>>>>
>>>> Back at the farm: in our current qemu code, some cleanups use a loop to
>>>> join() the total number of threads if the caller fails. This is not a
>>>> problem until the qemu_thread_create() modification is applied. E.g.
>>>> when compress_threads_save_setup() fails while trying to create the last
>>>> do_data_compress thread, a segmentation fault will occur when join() is
>>>> called on it (sadly there is no condition available to filter out this
>>>> unsuccessfully created thread), as this thread was never actually
>>>> created.
>>>>
>>>> Hope the above makes it clear. :)
>>> Alright, let's have a look at compress_threads_save_setup():
>>>
>>>     static int compress_threads_save_setup(void)
>>>     {
>>>         int i, thread_count;
>>>
>>>         if (!migrate_use_compression()) {
>>>             return 0;
>>>         }
>>>         thread_count = migrate_compress_threads();
>>>         compress_threads = g_new0(QemuThread, thread_count);
>>>         comp_param = g_new0(CompressParam, thread_count);
>>>         qemu_cond_init(&comp_done_cond);
>>>         qemu_mutex_init(&comp_done_lock);
>>>         for (i = 0; i < thread_count; i++) {
>>>             comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>             if (!comp_param[i].originbuf) {
>>>                 goto exit;
>>>             }
>>>
>>>             if (deflateInit(&comp_param[i].stream,
>>>                             migrate_compress_level()) != Z_OK) {
>>>                 g_free(comp_param[i].originbuf);
>>>                 goto exit;
>>>             }
>>>
>>>             /* comp_param[i].file is just used as a dummy buffer to save data,
>>>              * set its ops to empty.
>>>              */
>>>             comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>             comp_param[i].done = true;
>>>             comp_param[i].quit = false;
>>>             qemu_mutex_init(&comp_param[i].mutex);
>>>             qemu_cond_init(&comp_param[i].cond);
>>>             qemu_thread_create(compress_threads + i, "compress",
>>>                                do_data_compress, comp_param + i,
>>>                                QEMU_THREAD_JOINABLE);
>>>         }
>>>         return 0;
>>>
>>>     exit:
>>>         compress_threads_save_cleanup();
>>>         return -1;
>>>     }
>>>
>>> At label exit, we have @i threads, all fully initialized.  That's an
>>> invariant.
>>>
>>> compress_threads_save_cleanup() finds the threads to clean up by
>>> checking comp_param[i].file:
>>>
>>>     static void compress_threads_save_cleanup(void)
>>>     {
>>>         int i, thread_count;
>>>
>>>         if (!migrate_use_compression() || !comp_param) {
>>>             return;
>>>         }
>>>
>>>         thread_count = migrate_compress_threads();
>>>         for (i = 0; i < thread_count; i++) {
>>>             /*
>>>              * we use it as a indicator which shows if the thread is
>>>              * properly init'd or not
>>>              */
>>> --->        if (!comp_param[i].file) {
>>> --->            break;
>>> --->        }
>>>
>>>             qemu_mutex_lock(&comp_param[i].mutex);
>>>             comp_param[i].quit = true;
>>>             qemu_cond_signal(&comp_param[i].cond);
>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>
>>>             qemu_thread_join(compress_threads + i);
>>>             qemu_mutex_destroy(&comp_param[i].mutex);
>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>             deflateEnd(&comp_param[i].stream);
>>>             g_free(comp_param[i].originbuf);
>>>             qemu_fclose(comp_param[i].file);
>>>             comp_param[i].file = NULL;
>>>         }
>>>         qemu_mutex_destroy(&comp_done_lock);
>>>         qemu_cond_destroy(&comp_done_cond);
>>>         g_free(compress_threads);
>>>         g_free(comp_param);
>>>         compress_threads = NULL;
>>>         comp_param = NULL;
>>>     }
>>>
>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>> *any* cleanup.
>>>
>>> To maintain the invariant, compress_threads_save_setup() carefully
>>> cleans up any partial initializations itself before a goto exit.  Since
>>> the code is arranged smartly, the only such cleanup is the
>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>
>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>> initializations.  Breaks the invariant.
>>>
>>> I see two sane solutions:
>>>
>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>    initializations itself.  compress_threads_save_cleanup() copes only
>>>    with fully initialized comp_param[i].  This is how things work before
>>>    your series.
>>>
>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>    comp_param[i], i.e. does the right thing for each goto exit in
>>>    compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>>    clean up partial initializations.
>>>
>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>
>>> 3. A confusing mix of the two.
>>>
>>> Don't.
>> Thanks for the detailed analysis! :)
>> Emm.. Actually I did think about doing the cleanup in the setup() function for the third ‘goto exit’ [1], which is a partial initialization.
>> But because [1] below is long and does not look neat (I notice that most per-thread cleanups live in the xxx_cleanup() function), I turned to modifying the join() function instead.
>> Is the long [1] acceptable for the third ‘goto exit’, or is there a better way to do the cleanup?
>>
>> [1]
>> qemu_mutex_lock(&comp_param[i].mutex);
>>             comp_param[i].quit = true;
>>             qemu_cond_signal(&comp_param[i].cond);
>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>
>> qemu_mutex_destroy(&comp_param[i].mutex);
>>             qemu_cond_destroy(&comp_param[i].cond);
>>             deflateEnd(&comp_param[i].stream);
>>             g_free(comp_param[i].originbuf);
>>             qemu_fclose(comp_param[i].file);
>>             comp_param[i].file = NULL;
> Have you considered creating the thread earlier, e.g. right after
> initializing compression with deflateInit()?
I am afraid we cannot do this: members of comp_param[i] such as
file/done/quit/mutex/cond are used by the newly created thread
(do_data_[de]compress, started via qemu_thread_create()), so they must
be initialized before the thread is created.

Thus it seems we have to accept the long [1] above if we do want to
clean up the partial initialization in xxx_setup(). :(

BTW, in case we want to change the xxx_cleanup() function instead: the
only thing I can find to tell whether we should join() the thread is
"(compress_threads+i)->thread".

Have a nice day, thanks
Fei
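
Note: the "clean up the partial initialization in xxx_setup()" option
discussed above can be sketched as below. This is a minimal sketch
under stated assumptions, not the QEMU code: struct param, params[],
fake_thread_create(), param_undo_partial() and initialized_slots() are
hypothetical names, malloc()/free() stand in for the real
resources, and the boolean fields stand in for the mutex/cond and zlib
stream. The lock/quit/signal steps from [1] are omitted on the
assumption that no thread is running for a slot whose creation failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_SLOTS 4

struct param {
    void *originbuf;
    void *file;          /* non-null marks a fully initialized slot */
    bool stream_ready;   /* stands in for deflateInit()/deflateEnd() */
    bool sync_ready;     /* stands in for .mutex and .cond */
};

static struct param params[MAX_SLOTS];
static int create_fail_at = -1;   /* hypothetical failure injection */

static bool fake_thread_create(int idx)
{
    return idx != create_fail_at;
}

/* Undo everything set up for one slot whose thread was never created. */
static void param_undo_partial(struct param *p)
{
    p->sync_ready = false;     /* mutex/cond destroy in real code */
    p->stream_ready = false;   /* deflateEnd() in real code */
    free(p->originbuf);
    p->originbuf = NULL;
    free(p->file);             /* qemu_fclose() in real code */
    p->file = NULL;            /* slot looks untouched to cleanup again */
}

/* Count leading slots that cleanup would treat as fully initialized. */
static int initialized_slots(void)
{
    int n = 0;

    while (n < MAX_SLOTS && params[n].file) {
        n++;
    }
    return n;
}

static int setup_sketch(int n, int fail_at)
{
    assert(n <= MAX_SLOTS);
    create_fail_at = fail_at;
    for (int i = 0; i < MAX_SLOTS; i++) {
        params[i] = (struct param){0};
    }
    for (int i = 0; i < n; i++) {
        struct param *p = &params[i];

        p->originbuf = malloc(64);
        p->file = malloc(1);
        p->stream_ready = true;
        p->sync_ready = true;
        if (!p->originbuf || !p->file || !fake_thread_create(i)) {
            param_undo_partial(p);   /* restore the invariant for slot i */
            return -1;               /* real code: goto exit for slots < i */
        }
    }
    return 0;
}
```

After a failure at index 2, slots 0 and 1 are the only ones a
.file-based cleanup loop would visit, and each of them has a thread to
join.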

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-08  8:43         ` Markus Armbruster
@ 2019-01-10 13:29           ` Fei Li
  2019-01-11  2:49             ` Peter Xu
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-10 13:29 UTC (permalink / raw)
  To: Markus Armbruster, Peter Xu, Michael S. Tsirkin, Marcel Apfelbaum
  Cc: Jiri Slaby, shirley17fei, qemu-devel


On 2019/1/8 at 4:43 PM, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote:
>>> On 07. 01. 19, 18:29, Markus Armbruster wrote:
>>>>     static void pci_edu_uninit(PCIDevice *pdev)
>>>>     {
>>>>         EduState *edu = EDU(pdev);
>>>>
>>>>         qemu_mutex_lock(&edu->thr_mutex);
>>>>         edu->stopping = true;
>>>>         qemu_mutex_unlock(&edu->thr_mutex);
>>>>         qemu_cond_signal(&edu->thr_cond);
>>>>         qemu_thread_join(&edu->thread);
>>>>
>>>>         qemu_cond_destroy(&edu->thr_cond);
>>>>         qemu_mutex_destroy(&edu->thr_mutex);
>>>>
>>>>         timer_del(&edu->dma_timer);
>>>>     }
>>>>
>>>> Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?
>>> I don't know, the MSI support was added in:
>>> commit eabb5782f70b4a10975b24ccd7129929a05ac932
>>> Author: Peter Xu <peterx@redhat.com>
>>> Date:   Wed Sep 28 21:03:39 2016 +0800
>>>
>>>      hw/misc/edu: support MSI interrupt
>>>
>>> Hence CCing Peter.
>> Hi, Jiri, Markus, Fei,
>>
>> IMHO msi_uninit() is optional since it only operates on the config
>> space of the device to remove the capability or fix up the flags
>> without really doing any real destruction of objects so nothing will
>> be leaked (unlike msix_uninit, which should be required).
> Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless
> bug, or a harmful bug?

Kindly ping. :)

If a corresponding change is needed, I'd like to make the update in the
next version.

>>                                                             But I do
>> agree that calling msi_uninit() could be even nicer here.
>>
>> Anyone would like to post a patch? Or should I?
> Please coordinate fixing this with Fei Li.
Have a nice day, thanks all
Fei

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-10 13:24                 ` Fei Li
@ 2019-01-10 16:06                   ` Markus Armbruster
  2019-01-11 14:01                     ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-10 16:06 UTC (permalink / raw)
  To: Fei Li; +Cc: Stefan Weil, qemu-devel, shirley17fei

Fei Li <lifei1214@126.com> writes:

> On 2019/1/10 at 5:20 PM, Markus Armbruster wrote:
>> fei <lifei1214@126.com> writes:
>>
>>>> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> Fei Li <lifei1214@126.com> writes:
>>>>
>>>>>> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>>>>>> fei <lifei1214@126.com> writes:
>>>>>>
>>>>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Fei Li <fli@suse.com> writes:
>>>>>>>>
>>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>>
>>>>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>>>>> ---
>>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>>      int err;
>>>>>>>>>      void *ret;
>>>>>>>>>
>>>>>>>>> +    if (!thread->thread) {
>>>>>>>>> +        return NULL;
>>>>>>>>> +    }
>>>>>>>> How can this happen?
>>>>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>>>> explanation.  You also wrote there "I will remove this patch in next
>>>>>> version"; looks like you've since changed your mind.
>>>>> Emm, issues left over from history.. The background is that I was in a
>>>>> hurry to get those five Reviewed-by patches merged, including this v9
>>>>> 16/16 patch but not the real qemu_thread_create() modification. But this
>>>>> patch actually fixes a segmentation fault that only appears after we
>>>>> modify the qemu_thread_create() related functions, although it got its
>>>>> Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the "remove..."
>>>>> sentence to separate it from those 5 Reviewed-by patches, and was
>>>>> planning to send only four patches. But later I got a message that these
>>>>> five patches are not urgent enough to catch qemu v3.1, so I folded the
>>>>> earlier 5 R-b patches into the later v8 & v9 to get a better review.
>>>>>
>>>>> Sorry for the trouble, I need to explain it without involving too much
>>>>> background..
>>>>>
>>>>> Back at the farm: in our current qemu code, some cleanups use a loop to
>>>>> join() the total number of threads if the caller fails. This is not a
>>>>> problem until the qemu_thread_create() modification is applied. E.g.
>>>>> when compress_threads_save_setup() fails while trying to create the last
>>>>> do_data_compress thread, a segmentation fault will occur when join() is
>>>>> called on it (sadly there is no condition available to filter out this
>>>>> unsuccessfully created thread), as this thread was never actually
>>>>> created.
>>>>>
>>>>> Hope the above makes it clear. :)
>>>> Alright, let's have a look at compress_threads_save_setup():
>>>>
>>>>     static int compress_threads_save_setup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression()) {
>>>>             return 0;
>>>>         }
>>>>         thread_count = migrate_compress_threads();
>>>>         compress_threads = g_new0(QemuThread, thread_count);
>>>>         comp_param = g_new0(CompressParam, thread_count);
>>>>         qemu_cond_init(&comp_done_cond);
>>>>         qemu_mutex_init(&comp_done_lock);
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>>             if (!comp_param[i].originbuf) {
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             if (deflateInit(&comp_param[i].stream,
>>>>                             migrate_compress_level()) != Z_OK) {
>>>>                 g_free(comp_param[i].originbuf);
>>>>                 goto exit;
>>>>             }
>>>>
>>>>             /* comp_param[i].file is just used as a dummy buffer to save data,
>>>>              * set its ops to empty.
>>>>              */
>>>>             comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>>             comp_param[i].done = true;
>>>>             comp_param[i].quit = false;
>>>>             qemu_mutex_init(&comp_param[i].mutex);
>>>>             qemu_cond_init(&comp_param[i].cond);
>>>>             qemu_thread_create(compress_threads + i, "compress",
>>>>                                do_data_compress, comp_param + i,
>>>>                                QEMU_THREAD_JOINABLE);
>>>>         }
>>>>         return 0;
>>>>
>>>>     exit:
>>>>         compress_threads_save_cleanup();
>>>>         return -1;
>>>>     }
>>>>
>>>> At label exit, we have @i threads, all fully initialized.  That's an
>>>> invariant.
>>>>
>>>> compress_threads_save_cleanup() finds the threads to clean up by
>>>> checking comp_param[i].file:
>>>>
>>>>     static void compress_threads_save_cleanup(void)
>>>>     {
>>>>         int i, thread_count;
>>>>
>>>>         if (!migrate_use_compression() || !comp_param) {
>>>>             return;
>>>>         }
>>>>
>>>>         thread_count = migrate_compress_threads();
>>>>         for (i = 0; i < thread_count; i++) {
>>>>             /*
>>>>              * we use it as a indicator which shows if the thread is
>>>>              * properly init'd or not
>>>>              */
>>>> --->        if (!comp_param[i].file) {
>>>> --->            break;
>>>> --->        }
>>>>
>>>>             qemu_mutex_lock(&comp_param[i].mutex);
>>>>             comp_param[i].quit = true;
>>>>             qemu_cond_signal(&comp_param[i].cond);
>>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>>
>>>>             qemu_thread_join(compress_threads + i);
>>>>             qemu_mutex_destroy(&comp_param[i].mutex);
>>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>>             deflateEnd(&comp_param[i].stream);
>>>>             g_free(comp_param[i].originbuf);
>>>>             qemu_fclose(comp_param[i].file);
>>>>             comp_param[i].file = NULL;
>>>>         }
>>>>         qemu_mutex_destroy(&comp_done_lock);
>>>>         qemu_cond_destroy(&comp_done_cond);
>>>>         g_free(compress_threads);
>>>>         g_free(comp_param);
>>>>         compress_threads = NULL;
>>>>         comp_param = NULL;
>>>>     }
>>>>
>>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>>> *any* cleanup.
>>>>
>>>> To maintain the invariant, compress_threads_save_setup() carefully
>>>> cleans up any partial initializations itself before a goto exit.  Since
>>>> the code is arranged smartly, the only such cleanup is the
>>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>>
>>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>>> initializations.  Breaks the invariant.
>>>>
>>>> I see two sane solutions:
>>>>
>>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>>    initializations itself.  compress_threads_save_cleanup() copes only
>>>>    with fully initialized comp_param[i].  This is how things work before
>>>>    your series.
>>>>
>>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>>    comp_param[i], i.e. does the right thing for each goto exit in
>>>>    compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>>>    clean up partial initializations.
>>>>
>>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>>
>>>> 3. A confusing mix of the two.
>>>>
>>>> Don't.
>>> Thanks for the detailed analysis! :)
>>> Emm.. Actually I did think about doing the cleanup in the setup() function for the third ‘goto exit’ [1], which is a partial initialization.
>>> But because [1] below is long and does not look neat (I notice that most per-thread cleanups live in the xxx_cleanup() function), I turned to modifying the join() function instead.
>>> Is the long [1] acceptable for the third ‘goto exit’, or is there a better way to do the cleanup?
>>>
>>> [1]
>>> qemu_mutex_lock(&comp_param[i].mutex);
>>>             comp_param[i].quit = true;
>>>             qemu_cond_signal(&comp_param[i].cond);
>>>             qemu_mutex_unlock(&comp_param[i].mutex);
>>>
>>> qemu_mutex_destroy(&comp_param[i].mutex);
>>>             qemu_cond_destroy(&comp_param[i].cond);
>>>             deflateEnd(&comp_param[i].stream);
>>>             g_free(comp_param[i].originbuf);
>>>             qemu_fclose(comp_param[i].file);
>>>             comp_param[i].file = NULL;
>> Have you considered creating the thread earlier, e.g. right after
>> initializing compression with deflateInit()?
> I am afraid we cannot do this: members of comp_param[i] such as
> file/done/quit/mutex/cond are used by the newly created thread
> (do_data_[de]compress, started via qemu_thread_create()), so they must
> be initialized before the thread is created.

You're right.

> Thus it seems we have to accept the long [1] above if we do want to
> clean up the partial initialization in xxx_setup(). :(
>
> BTW, in case we want to change the xxx_cleanup() function instead: the
> only thing I can find to tell whether we should join() the thread is
> "(compress_threads+i)->thread".

We can try to make compress_threads_save_cleanup() cope with partially
initialized comp_param[i].  Let's have a look at its members:

    bool done;                          // no cleanup
    bool quit;                          // see [2]
    bool zero_page;                     // no cleanup
    QEMUFile *file;                     // qemu_fclose() if non-null
    QemuMutex mutex;                    // see [1]
    QemuCond cond;                      // see [1]
    RAMBlock *block;                    // no cleanup (must be null)
    ram_addr_t offset;                  // no cleanup

    /* internally used fields */
    z_stream stream;                    // see [3]
    uint8_t *originbuf;                 // unconditional g_free()

[1]: we could do something like

    if (comp_param[i].mutex.initialized) {
        qemu_mutex_destroy(&comp_param[i].mutex);
    }
    if (comp_param[i].cond.initialized) {
        qemu_cond_destroy(&comp_param[i].cond);
    }

but that would be unclean.  Instead, I'd initialize these guys first, so
we can clean them up unconditionally.

[2]: This is used to make the thread terminate.  Must be done before we
call qemu_thread_join().  I think it can safely be done always, as long
as .mutex and .cond are initialized.  Trivial if we initialize them
first.

[3]: I can't see a squeaky clean way to detect whether .stream has been
initialized with deflateInit().  Here's a slightly unclean way:
deflateInit() sets .stream.msg to null on success, and to non-null on
failure.  We can make it non-null until we're ready to call
deflateInit(), then have compress_threads_save_cleanup() clean up
.stream when it's null.  If that's too unclean for your or your
reviewers' taste, add a boolean @stream_initialized flag.
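
To make [1]-[3] concrete, here is a minimal standalone sketch.  It is not
QEMU code: DemoParam, demo_setup() and demo_cleanup() are hypothetical
stand-ins, the mutex/cond pair and the zlib stream are modelled as plain
booleans, and failures are simulated.  It assumes the sync primitives are
initialized for all slots up front, so cleanup can destroy them
unconditionally, and it uses the explicit stream_initialized flag rather
than the .stream.msg trick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { DEMO_SLOTS = 4, DEMO_BUF_SIZE = 64 };

/* Hypothetical stand-in for CompressParam; not QEMU's real type. */
typedef struct {
    bool sync_ready;          /* models .mutex and .cond */
    bool stream_initialized;  /* explicit flag, as suggested in [3] */
    bool thread_running;      /* models a successfully created thread */
    unsigned char *originbuf;
} DemoParam;

static DemoParam demo_param[DEMO_SLOTS];
static bool demo_active;      /* plays the role of comp_param != NULL */
static int demo_live;         /* sync pairs + streams + threads in use */

/* Copes with any partially initialized slot. */
static void demo_cleanup(void)
{
    if (!demo_active) {
        return;
    }
    for (int i = 0; i < DEMO_SLOTS; i++) {
        DemoParam *p = &demo_param[i];

        if (p->thread_running) {        /* set quit, signal, join */
            p->thread_running = false;
            demo_live--;
        }
        if (p->stream_initialized) {    /* deflateEnd() would go here */
            p->stream_initialized = false;
            demo_live--;
        }
        free(p->originbuf);             /* unconditional: free(NULL) is fine */
        p->originbuf = NULL;
        assert(p->sync_ready);          /* always true: initialized first */
        p->sync_ready = false;          /* unconditional destroy */
        demo_live--;
    }
    demo_active = false;
}

/*
 * Returns 0 on success, -1 on failure.  A failure is simulated in slot
 * fail_slot: fail_step 1 models deflateInit() failing, fail_step 2 models
 * thread creation failing.  Pass fail_slot = -1 for no failure.
 */
static int demo_setup(int fail_slot, int fail_step)
{
    demo_active = true;
    for (int i = 0; i < DEMO_SLOTS; i++) {
        demo_param[i] = (DemoParam){ .sync_ready = true };
        demo_live++;
    }
    for (int i = 0; i < DEMO_SLOTS; i++) {
        DemoParam *p = &demo_param[i];

        p->originbuf = malloc(DEMO_BUF_SIZE);
        if (!p->originbuf) {
            goto fail;
        }
        if (i == fail_slot && fail_step == 1) {
            goto fail;
        }
        p->stream_initialized = true;
        demo_live++;
        if (i == fail_slot && fail_step == 2) {
            goto fail;
        }
        p->thread_running = true;
        demo_live++;
    }
    return 0;

fail:
    demo_cleanup();
    return -1;
}
```

With this arrangement the setup function has no per-step cleanup of its
own; every "goto fail" relies on demo_cleanup() doing the right thing for
whatever state the slots are in.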

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-10 13:29           ` Fei Li
@ 2019-01-11  2:49             ` Peter Xu
  2019-01-11 13:19               ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Xu @ 2019-01-11  2:49 UTC (permalink / raw)
  To: Fei Li
  Cc: Markus Armbruster, Michael S. Tsirkin, Marcel Apfelbaum,
	Jiri Slaby, shirley17fei, qemu-devel

On Thu, Jan 10, 2019 at 09:29:38PM +0800, Fei Li wrote:
> 
> On 2019/1/8 at 4:43 PM, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote:
> > > > On 07. 01. 19, 18:29, Markus Armbruster wrote:
> > > > >     static void pci_edu_uninit(PCIDevice *pdev)
> > > > >     {
> > > > >         EduState *edu = EDU(pdev);
> > > > > 
> > > > >         qemu_mutex_lock(&edu->thr_mutex);
> > > > >         edu->stopping = true;
> > > > >         qemu_mutex_unlock(&edu->thr_mutex);
> > > > >         qemu_cond_signal(&edu->thr_cond);
> > > > >         qemu_thread_join(&edu->thread);
> > > > > 
> > > > >         qemu_cond_destroy(&edu->thr_cond);
> > > > >         qemu_mutex_destroy(&edu->thr_mutex);
> > > > > 
> > > > >         timer_del(&edu->dma_timer);
> > > > >     }
> > > > > 
> > > > > Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?\
> > > > I don't know, the MSI support was added in:
> > > > commit eabb5782f70b4a10975b24ccd7129929a05ac932
> > > > Author: Peter Xu <peterx@redhat.com>
> > > > Date:   Wed Sep 28 21:03:39 2016 +0800
> > > > 
> > > >      hw/misc/edu: support MSI interrupt
> > > > 
> > > > Hence CCing Peter.
> > > Hi, Jiri, Markus, Fei,
> > > 
> > > IMHO msi_uninit() is optional since it only operates on the config
> > > space of the device to remove the capability or fix up the flags
> > > without really doing any real destruction of objects so nothing will
> > > be leaked (unlike msix_uninit, which should be required).
> > Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless
> > bug, or a harmful bug?
> 
> Kindly ping. :)
> 
> If corresponding change is needed, I'd like to do the update in the next
> version.

Fei,

If you're going to post the edu patch, please post it as a standalone
patch.  More patches make it harder for the series to be accepted
quickly, so it is sometimes good to split patches out, especially if
they are unrelated.

Regards,

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-11  2:49             ` Peter Xu
@ 2019-01-11 13:19               ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-11 13:19 UTC (permalink / raw)
  To: Peter Xu
  Cc: Markus Armbruster, Michael S. Tsirkin, Marcel Apfelbaum,
	Jiri Slaby, shirley17fei, qemu-devel


On 2019/1/11 at 10:49 AM, Peter Xu wrote:
> On Thu, Jan 10, 2019 at 09:29:38PM +0800, Fei Li wrote:
>> On 2019/1/8 at 4:43 PM, Markus Armbruster wrote:
>>> Peter Xu <peterx@redhat.com> writes:
>>>
>>>> On Tue, Jan 08, 2019 at 07:14:11AM +0100, Jiri Slaby wrote:
>>>>> On 07. 01. 19, 18:29, Markus Armbruster wrote:
>>>>>>      static void pci_edu_uninit(PCIDevice *pdev)
>>>>>>      {
>>>>>>          EduState *edu = EDU(pdev);
>>>>>>
>>>>>>          qemu_mutex_lock(&edu->thr_mutex);
>>>>>>          edu->stopping = true;
>>>>>>          qemu_mutex_unlock(&edu->thr_mutex);
>>>>>>          qemu_cond_signal(&edu->thr_cond);
>>>>>>          qemu_thread_join(&edu->thread);
>>>>>>
>>>>>>          qemu_cond_destroy(&edu->thr_cond);
>>>>>>          qemu_mutex_destroy(&edu->thr_mutex);
>>>>>>
>>>>>>          timer_del(&edu->dma_timer);
>>>>>>      }
>>>>>>
>>>>>> Preexisting: pci_edu_uninit() neglects to call msi_uninit().  Jiri?\
>>>>> I don't know, the MSI support was added in:
>>>>> commit eabb5782f70b4a10975b24ccd7129929a05ac932
>>>>> Author: Peter Xu <peterx@redhat.com>
>>>>> Date:   Wed Sep 28 21:03:39 2016 +0800
>>>>>
>>>>>       hw/misc/edu: support MSI interrupt
>>>>>
>>>>> Hence CCing Peter.
>>>> Hi, Jiri, Markus, Fei,
>>>>
>>>> IMHO msi_uninit() is optional since it only operates on the config
>>>> space of the device to remove the capability or fix up the flags
>>>> without really doing any real destruction of objects so nothing will
>>>> be leaked (unlike msix_uninit, which should be required).
>>> Michael, Marcel, is neglecting to call msi_uninit() okay, a harmless
>>> bug, or a harmful bug?
>> Kindly ping. :)
>>
>> If corresponding change is needed, I'd like to do the update in the next
>> version.
> Fei,
>
> If you're going to post the edu patch, please post it as a standalone
> patch.  More patches make it harder for the series to be accepted
> quickly, so it is sometimes good to split patches out, especially if
> they are unrelated.
>
> Regards,
>
Ok, thanks for this helpful suggestion. Will send this patch alone in 
the next version. :)

Have a nice day
Fei

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault
  2019-01-10 16:06                   ` Markus Armbruster
@ 2019-01-11 14:01                     ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-11 14:01 UTC (permalink / raw)
  To: Markus Armbruster, Peter Xu, Dr. David Alan Gilbert, quintela
  Cc: Stefan Weil, qemu-devel


On 2019/1/11 at 12:06 AM, Markus Armbruster wrote:
> Fei Li <lifei1214@126.com> writes:
>
>> On 2019/1/10 at 5:20 PM, Markus Armbruster wrote:
>>> fei <lifei1214@126.com> writes:
>>>
>>>>> On Jan 9, 2019, at 23:24, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>
>>>>> Fei Li <lifei1214@126.com> writes:
>>>>>
>>>>>>> On 2019/1/9 at 1:29 AM, Markus Armbruster wrote:
>>>>>>> fei <lifei1214@126.com> writes:
>>>>>>>
>>>>>>>>> On Jan 8, 2019, at 01:55, Markus Armbruster <armbru@redhat.com> wrote:
>>>>>>>>>
>>>>>>>>> Fei Li <fli@suse.com> writes:
>>>>>>>>>
>>>>>>>>>> To avoid the segmentation fault in qemu_thread_join(), just directly
>>>>>>>>>> return when the QemuThread *thread failed to be created in either
>>>>>>>>>> qemu-thread-posix.c or qemu-thread-win32.c.
>>>>>>>>>>
>>>>>>>>>> Cc: Stefan Weil <sw@weilnetz.de>
>>>>>>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>>>>>>> Reviewed-by: Fam Zheng <famz@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>> util/qemu-thread-posix.c | 3 +++
>>>>>>>>>> util/qemu-thread-win32.c | 2 +-
>>>>>>>>>> 2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>>>>>>>>>> index 39834b0551..3548935dac 100644
>>>>>>>>>> --- a/util/qemu-thread-posix.c
>>>>>>>>>> +++ b/util/qemu-thread-posix.c
>>>>>>>>>> @@ -571,6 +571,9 @@ void *qemu_thread_join(QemuThread *thread)
>>>>>>>>>>       int err;
>>>>>>>>>>       void *ret;
>>>>>>>>>>
>>>>>>>>>> +    if (!thread->thread) {
>>>>>>>>>> +        return NULL;
>>>>>>>>>> +    }
>>>>>>>>> How can this happen?
>>>>>>>> I think I have answered this earlier, please check the following link to see whether it helps:
>>>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2018-11/msg06554.html
>>>>>>> Thanks for the pointer.  Unfortunately, I don't understand your
>>>>>>> explanation.  You also wrote there "I will remove this patch in next
>>>>>>> version"; looks like you've since changed your mind.
>>>>>> Emm, this is an issue left over from history.  The background is
>>>>>> that I was in a hurry to get those five Reviewed-by patches
>>>>>> merged, including this v9 16/16 patch but not the real
>>>>>> qemu_thread_create() modification.  But actually this patch fixes
>>>>>> the segmentation fault that occurs after we modify the
>>>>>> qemu_thread_create() related functions, although it got a
>>>>>> Reviewed-by earlier. :) Thus, to avoid trouble, I wrote the
>>>>>> "remove..." sentence to separate it from those five Reviewed-by
>>>>>> patches, and was planning to send only four patches.  But later I
>>>>>> got a message that these five patches are not urgent enough to
>>>>>> need to catch qemu v3.1, so I merged the earlier five R-b patches
>>>>>> into the later v8 & v9 for a better review.
>>>>>>
>>>>>> Sorry for the trouble, I needed to explain it without involving
>>>>>> too much background.
>>>>>>
>>>>>> Back at the farm: in our current qemu code, some cleanups use a
>>>>>> loop to join() the total number of threads if the caller fails.
>>>>>> This is not a problem until the qemu_thread_create() modification
>>>>>> is applied.  E.g. when compress_threads_save_setup() fails while
>>>>>> trying to create the last do_data_compress thread, a segmentation
>>>>>> fault will occur when join() is called on it, as this thread was
>>>>>> never actually created (sadly there is no condition available to
>>>>>> filter out this unsuccessfully created thread).
>>>>>>
>>>>>> Hope the above makes it clear. :)
>>>>> Alright, let's have a look at compress_threads_save_setup():
>>>>>
>>>>>      static int compress_threads_save_setup(void)
>>>>>      {
>>>>>          int i, thread_count;
>>>>>
>>>>>          if (!migrate_use_compression()) {
>>>>>              return 0;
>>>>>          }
>>>>>          thread_count = migrate_compress_threads();
>>>>>          compress_threads = g_new0(QemuThread, thread_count);
>>>>>          comp_param = g_new0(CompressParam, thread_count);
>>>>>          qemu_cond_init(&comp_done_cond);
>>>>>          qemu_mutex_init(&comp_done_lock);
>>>>>          for (i = 0; i < thread_count; i++) {
>>>>>              comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE);
>>>>>              if (!comp_param[i].originbuf) {
>>>>>                  goto exit;
>>>>>              }
>>>>>
>>>>>              if (deflateInit(&comp_param[i].stream,
>>>>>                              migrate_compress_level()) != Z_OK) {
>>>>>                  g_free(comp_param[i].originbuf);
>>>>>                  goto exit;
>>>>>              }
>>>>>
>>>>>              /* comp_param[i].file is just used as a dummy buffer to save data,
>>>>>               * set its ops to empty.
>>>>>               */
>>>>>              comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
>>>>>              comp_param[i].done = true;
>>>>>              comp_param[i].quit = false;
>>>>>              qemu_mutex_init(&comp_param[i].mutex);
>>>>>              qemu_cond_init(&comp_param[i].cond);
>>>>>              qemu_thread_create(compress_threads + i, "compress",
>>>>>                                 do_data_compress, comp_param + i,
>>>>>                                 QEMU_THREAD_JOINABLE);
>>>>>          }
>>>>>          return 0;
>>>>>
>>>>>      exit:
>>>>>          compress_threads_save_cleanup();
>>>>>          return -1;
>>>>>      }
>>>>>
>>>>> At label exit, we have @i threads, all fully initialized.  That's an
>>>>> invariant.
>>>>>
>>>>> compress_threads_save_cleanup() finds the threads to clean up by
>>>>> checking comp_param[i].file:
>>>>>
>>>>>      static void compress_threads_save_cleanup(void)
>>>>>      {
>>>>>          int i, thread_count;
>>>>>
>>>>>          if (!migrate_use_compression() || !comp_param) {
>>>>>              return;
>>>>>          }
>>>>>
>>>>>          thread_count = migrate_compress_threads();
>>>>>          for (i = 0; i < thread_count; i++) {
>>>>>              /*
>>>>>               * we use it as a indicator which shows if the thread is
>>>>>               * properly init'd or not
>>>>>               */
>>>>> --->        if (!comp_param[i].file) {
>>>>> --->            break;
>>>>> --->        }
>>>>>
>>>>>              qemu_mutex_lock(&comp_param[i].mutex);
>>>>>              comp_param[i].quit = true;
>>>>>              qemu_cond_signal(&comp_param[i].cond);
>>>>>              qemu_mutex_unlock(&comp_param[i].mutex);
>>>>>
>>>>>              qemu_thread_join(compress_threads + i);
>>>>>              qemu_mutex_destroy(&comp_param[i].mutex);
>>>>>              qemu_cond_destroy(&comp_param[i].cond);
>>>>>              deflateEnd(&comp_param[i].stream);
>>>>>              g_free(comp_param[i].originbuf);
>>>>>              qemu_fclose(comp_param[i].file);
>>>>>              comp_param[i].file = NULL;
>>>>>          }
>>>>>          qemu_mutex_destroy(&comp_done_lock);
>>>>>          qemu_cond_destroy(&comp_done_cond);
>>>>>          g_free(compress_threads);
>>>>>          g_free(comp_param);
>>>>>          compress_threads = NULL;
>>>>>          comp_param = NULL;
>>>>>      }
>>>>>
>>>>> Due to the invariant, a comp_param[i] with a null .file doesn't need
>>>>> *any* cleanup.
>>>>>
>>>>> To maintain the invariant, compress_threads_save_setup() carefully
>>>>> cleans up any partial initializations itself before a goto exit.  Since
>>>>> the code is arranged smartly, the only such cleanup is the
>>>>> g_free(comp_param[i].originbuf) before the second goto exit.
>>>>>
>>>>> Your PATCH 13 adds a third goto exit, but neglects to clean up partial
>>>>> initializations.  Breaks the invariant.
>>>>>
>>>>> I see two sane solutions:
>>>>>
>>>>> 1. compress_threads_save_setup() carefully cleans up partial
>>>>>     initializations itself.  compress_threads_save_cleanup() copes only
>>>>>     with fully initialized comp_param[i].  This is how things work before
>>>>>     your series.
>>>>>
>>>>> 2. compress_threads_save_cleanup() copes with partially initialized
>>>>>     comp_param[i], i.e. does the right thing for each goto exit in
>>>>>     compress_threads_save_setup().  compress_threads_save_setup() doesn't
>>>>>     clean up partial initializations.
>>>>>
>>>>> Your PATCH 13 together with the fixup PATCH 16 does
>>>>>
>>>>> 3. A confusing mix of the two.
>>>>>
>>>>> Don't.
>>>> Thanks for the detailed analysis! :)
>>>> Emm.. Actually I did think about doing the cleanup in the setup()
>>>> function for the third 'goto exit' [1], which is a partial
>>>> initialization.  But because [1] below is too long and does not look
>>>> neat (I notice that most cleanups for each thread are in the
>>>> xxx_cleanup() function), I turned to modifying the join() function
>>>> instead.  Is the long [1] acceptable when the third 'goto exit' is
>>>> taken, or is there a better way to do the cleanup?
>>>>
>>>> [1]
>>>>              qemu_mutex_lock(&comp_param[i].mutex);
>>>>              comp_param[i].quit = true;
>>>>              qemu_cond_signal(&comp_param[i].cond);
>>>>              qemu_mutex_unlock(&comp_param[i].mutex);
>>>>
>>>>              qemu_mutex_destroy(&comp_param[i].mutex);
>>>>              qemu_cond_destroy(&comp_param[i].cond);
>>>>              deflateEnd(&comp_param[i].stream);
>>>>              g_free(comp_param[i].originbuf);
>>>>              qemu_fclose(comp_param[i].file);
>>>>              comp_param[i].file = NULL;
>>> Have you considered creating the thread earlier, e.g. right after
>>> initializing compression with deflateInit()?
>> I am afraid we cannot do this, as the members of comp_param[i], like
>> file/done/quit/mutex/cond, will be used later in the newly created
>> thread: do_data_[de]compress via qemu_thread_create().
> You're right.
>
>> Thus it seems we have to accept the long [1] above if we do want to
>> clean up partial initialization in xxx_setup(). :(
>>
>> BTW, in case we want to change the xxx_cleanup() function instead:
>> nothing other than "(compress_threads+i)->thread" can be used to
>> tell whether we should join() the thread.
> We can try to make compress_threads_save_cleanup() cope with partially
> initialized comp_param[i].  Let's have a look at its members:
>
>      bool done;                          // no cleanup
>      bool quit;                          // see [2]
>      bool zero_page;                     // no cleanup
>      QEMUFile *file;                     // qemu_fclose() if non-null
>      QemuMutex mutex;                    // see [1]
>      QemuCond cond;                      // see [1]
>      RAMBlock *block;                    // no cleanup (must be null)
>      ram_addr_t offset;                  // no cleanup
>
>      /* internally used fields */
>      z_stream stream;                    // see [3]
>      uint8_t *originbuf;                 // unconditional g_free()
>
> [1]: we could do something like
>
>      if (comp_param[i].mutex.initialized) {
>          qemu_mutex_destroy(&comp_param[i].mutex);
>      }
>      if (comp_param[i].cond.initialized) {
>          qemu_cond_destroy(&comp_param[i].cond);
>      }
>
> but that would be unclean.  Instead, I'd initialize these guys first, so
> we can clean them up unconditionally.
>
> [2]: This is used to make the thread terminate.  Must be done before we
> call qemu_thread_join().  I think it can safely be done always, as long
> as .mutex and .cond are initialized.  Trivial if we initialize them
> first.
Thanks for the detailed analysis, it helps a lot! I read [1] & [2] above
as "move the three '+' lines below ahead of the initialization of
comp_param[i].file" in xxx_setup():

+        qemu_mutex_init(&comp_param[i].mutex);
+        qemu_cond_init(&comp_param[i].cond);
+        comp_param[i].quit = false;
          /* comp_param[i].file is just used as a dummy buffer to save data,
           * set its ops to empty.
           */
          comp_param[i].file = qemu_fopen_ops(NULL, &empty_ops);
          comp_param[i].done = true;
          if (!qemu_thread_create(compress_threads + i, "compress",
          ...

And make the corresponding change in xxx_cleanup() accordingly.

> [3]: I can't see a squeaky clean way to detect whether .stream has been
> initialized with deflateInit().  Here's a slightly unclean way:
> deflateInit() sets .stream.msg to null on success, and to non-null on
> failure.  We can make it non-null until we're ready to call
> deflateInit(), then have compress_threads_save_cleanup() clean up
> .stream when it's null.  If that's too unclean for your or your
> reviewers' taste, add a boolean @stream_initialized flag.

Emm, I am not sure either. Let's cc the migration maintainers to get
their opinions.

Have a nice day, thanks for the review. :)
Fei

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-07 17:29   ` Markus Armbruster
  2019-01-08  6:14     ` Jiri Slaby
@ 2019-01-13 15:44     ` Fei Li
  2019-01-14 12:36       ` Markus Armbruster
  1 sibling, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-13 15:44 UTC (permalink / raw)
  To: Markus Armbruster, Fei Li; +Cc: qemu-devel, shirley17fei, Jiri Slaby


On 2019/1/8 at 1:29 AM, Markus Armbruster wrote:
> Fei Li <fli@suse.com> writes:
>
>> Utilize the existing errp to propagate the error instead of the
>> temporary &error_abort.
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Jiri Slaby <jslaby@suse.cz>
>> Signed-off-by: Fei Li <fli@suse.com>
>> ---
>>   hw/misc/edu.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/misc/edu.c b/hw/misc/edu.c
>> index 3f4ba7ded3..011fe6e0b7 100644
>> --- a/hw/misc/edu.c
>> +++ b/hw/misc/edu.c
>> @@ -356,9 +356,10 @@ static void pci_edu_realize(PCIDevice *pdev, Error **errp)
>>   
>>       qemu_mutex_init(&edu->thr_mutex);
>>       qemu_cond_init(&edu->thr_cond);
>> -    /* TODO: let the further caller handle the error instead of abort() here */
>> -    qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
>> -                       edu, QEMU_THREAD_JOINABLE, &error_abort);
>> +    if (!qemu_thread_create(&edu->thread, "edu", edu_fact_thread,
>> +                            edu, QEMU_THREAD_JOINABLE, errp)) {
>> +        return;
> You need to clean up everything that got initialized so far.  You might
> want to call qemu_thread_create() earlier so you have less to clean up.

Just to make sure about how to do the cleanup. I notice that in device_set_realized(),
the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails.

         if (dc->realize) {
             dc->realize(dev, &local_err);
         }

         if (local_err != NULL) {
             goto fail;
         }

Is this on purpose? (Maybe because some devices' realize() functions do
their own cleanup on failure? Sorry for being unsure; it is such a common
function that I did not check all of them. :( ) Otherwise, I would prefer
to do the cleanup in a unified manner, e.g. by calling
"dc->unrealize(dev, NULL);", which is pci_qdev_unrealize() for PCI
devices.

Have a nice day
Fei

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2019-01-08 16:18     ` fei
@ 2019-01-13 16:16       ` Fei Li
  2019-01-14 12:53         ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-13 16:16 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Stefan Hajnoczi, qemu-devel, shirley17fei


On 2019/1/9 at 12:18 AM, fei wrote:
>
>> On Jan 8, 2019, at 01:50, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Fei Li <fli@suse.com> writes:
>>
>>> For iothread_complete: utilize the existing errp to propagate the
>>> error and do the corresponding cleanup to replace the temporary
>>> &error_abort.
>>>
>>> For qemu_signalfd_compat: add a local_err to hold the error, and
>>> return the corresponding error code to replace the temporary
>>> &error_abort.
>> I'd split the patch.
> Ok.
>>> Cc: Markus Armbruster <armbru@redhat.com>
>>> Cc: Eric Blake <eblake@redhat.com>
>>> Signed-off-by: Fei Li <fli@suse.com>
>>> ---
>>> iothread.c      | 17 +++++++++++------
>>> util/compatfd.c | 11 ++++++++---
>>> 2 files changed, 19 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/iothread.c b/iothread.c
>>> index 8e8aa01999..7335dacf0b 100644
>>> --- a/iothread.c
>>> +++ b/iothread.c
>>> @@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>                                  &local_error);
>>>      if (local_error) {
>>>          error_propagate(errp, local_error);
>>> -        aio_context_unref(iothread->ctx);
>>> -        iothread->ctx = NULL;
>>> -        return;
>>> +        goto fail;
>>>      }
>>>
>>>      qemu_mutex_init(&iothread->init_done_lock);
>>> @@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>       */
>>>      name = object_get_canonical_path_component(OBJECT(obj));
>>>      thread_name = g_strdup_printf("IO %s", name);
>>> -    /* TODO: let the further caller handle the error instead of abort() here */
>>> -    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>> -                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
>>> +    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>> +                            iothread, QEMU_THREAD_JOINABLE, errp)) {
>>> +        g_free(thread_name);
>>> +        g_free(name);
>> I suspect you're missing cleanup here:
>>
>>            qemu_cond_destroy(&iothread->init_done_cond);
>>            qemu_mutex_destroy(&iothread->init_done_lock);
> I remember I checked the code, when ucc->complete() fails, there’s a finalize() function to do the destroy.

To be specific, the qemu_xxx_destroy() is called by

object_unref() => object_finalize() => object_deinit() => 
type->instance_finalize(obj); (that is, iothread_instance_finalize).

As for iothread_complete(), it is only called in
user_creatable_complete() as ucc->complete().
I checked the code: when user_creatable_complete() fails, all of its
callers call object_unref(), which reaches the qemu_xxx_destroy() calls,
except for one &error_abort case (i.e. desugar_shm()).
> But did not test all the callers, so let’s wait for Stefan’s feedback. :)
But again, I did not test all of the callers. Correct me if I am wrong. :)
>> But I'm not 100% sure, to be honest.  Stefan, can you help?
>>
>>
>>> +        goto fail;
>>> +    }
>>>      g_free(thread_name);
>>>      g_free(name);
>>>
>> I'd avoid the code duplication like this:
>>
>>        thread_ok = qemu_thread_create(&iothread->thread, thread_name,
>>                                       iothread_run, iothread,
>>                                       QEMU_THREAD_JOINABLE, errp);
>>        g_free(thread_name);
>>        g_free(name);
>>        if (!thread_ok) {
>>            qemu_cond_destroy(&iothread->init_done_cond);
>>            qemu_mutex_destroy(&iothread->init_done_lock);
>>            goto fail;
>>        }

Ok, thanks.

Have a nice day
Fei
>> Matter of taste.
>>
>> Hmm, iothread.c has no maintainer.  Stefan, you created it, would you be
>> willing to serve as maintainer?
>>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-13 15:44     ` Fei Li
@ 2019-01-14 12:36       ` Markus Armbruster
  2019-01-14 13:38         ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-14 12:36 UTC (permalink / raw)
  To: Fei Li; +Cc: Fei Li, Jiri Slaby, qemu-devel, shirley17fei

Fei Li <lifei1214@126.com> writes:

> Just to make sure about how to do the cleanup. I notice that in device_set_realized(),
> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails.
>
>         if (dc->realize) {
>             dc->realize(dev, &local_err);
>         }
>
>         if (local_err != NULL) {
>             goto fail;
>         }
>
> Is this on purpose? (Maybe because some devices' realize() functions do
> their own cleanup on failure? Sorry for being unsure; it is such a common
> function that I did not check all of them. :( ) Otherwise, I would prefer
> to do the cleanup in a unified manner, e.g. by calling
> "dc->unrealize(dev, NULL);", which is pci_qdev_unrealize() for PCI
> devices.

Yes, this is on purpose.

When a realize() method fails, it must revert everything it has done so
far.  Results in sane "either succeed completely, or fail and do
nothing" semantics.
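
A minimal standalone sketch of these semantics, assuming a hypothetical
DemoDevice whose resources are modelled as booleans (none of these names
are QEMU's, and the thread creation failure is simulated by a parameter):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device such as EduState. */
typedef struct {
    bool mutex_ready;
    bool cond_ready;
    bool thread_started;
} DemoDevice;

/*
 * Realize-style function: on failure it reverts everything it did, so
 * the caller never needs to call demo_unrealize() after a failure.
 * thread_create_fails simulates qemu_thread_create() returning false.
 */
static bool demo_realize(DemoDevice *dev, bool thread_create_fails)
{
    assert(dev != NULL);

    dev->mutex_ready = true;        /* qemu_mutex_init() */
    dev->cond_ready = true;         /* qemu_cond_init() */

    if (thread_create_fails) {
        goto err_destroy_sync;
    }
    dev->thread_started = true;
    return true;

err_destroy_sync:
    dev->cond_ready = false;        /* qemu_cond_destroy() */
    dev->mutex_ready = false;       /* qemu_mutex_destroy() */
    return false;
}

/* Only valid after a successful demo_realize(). */
static void demo_unrealize(DemoDevice *dev)
{
    assert(dev != NULL && dev->thread_started);

    dev->thread_started = false;    /* stop and join the thread */
    dev->cond_ready = false;
    dev->mutex_ready = false;
}
```

After a failed demo_realize() the device is in the same state as before
the call, which is why the generic code can simply "goto fail" without
invoking the unrealize hook.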

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2019-01-13 16:16       ` Fei Li
@ 2019-01-14 12:53         ` Markus Armbruster
  2019-01-14 13:52           ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-14 12:53 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, Stefan Hajnoczi, shirley17fei

Fei Li <lifei1214@126.com> writes:

> On 2019/1/9 at 12:18 AM, fei wrote:
>>
>>> On Jan 8, 2019, at 01:50, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Fei Li <fli@suse.com> writes:
>>>
>>>> For iothread_complete: utilize the existing errp to propagate the
>>>> error and do the corresponding cleanup to replace the temporary
>>>> &error_abort.
>>>>
>>>> For qemu_signalfd_compat: add a local_err to hold the error, and
>>>> return the corresponding error code to replace the temporary
>>>> &error_abort.
>>> I'd split the patch.
>> Ok.
>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>> Cc: Eric Blake <eblake@redhat.com>
>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>> ---
>>>> iothread.c      | 17 +++++++++++------
>>>> util/compatfd.c | 11 ++++++++---
>>>> 2 files changed, 19 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/iothread.c b/iothread.c
>>>> index 8e8aa01999..7335dacf0b 100644
>>>> --- a/iothread.c
>>>> +++ b/iothread.c
>>>> @@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>>                                  &local_error);
>>>>      if (local_error) {
>>>>          error_propagate(errp, local_error);
>>>> -        aio_context_unref(iothread->ctx);
>>>> -        iothread->ctx = NULL;
>>>> -        return;
>>>> +        goto fail;
>>>>      }
>>>>
>>>>      qemu_mutex_init(&iothread->init_done_lock);
>>>> @@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>>       */
>>>>      name = object_get_canonical_path_component(OBJECT(obj));
>>>>      thread_name = g_strdup_printf("IO %s", name);
>>>> -    /* TODO: let the further caller handle the error instead of abort() here */
>>>> -    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>>> -                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
>>>> +    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>>> +                            iothread, QEMU_THREAD_JOINABLE, errp)) {
>>>> +        g_free(thread_name);
>>>> +        g_free(name);
>>> I suspect you're missing cleanup here:
>>>
>>>            qemu_cond_destroy(&iothread->init_done_cond);
>>>            qemu_mutex_destroy(&iothread->init_done_lock);
>> I remember I checked the code, when ucc->complete() fails, there’s a finalize() function to do the destroy.
>
> To be specific, the qemu_xxx_destroy() is called by
>
> object_unref() => object_finalize() => object_deinit() =>
> type->instance_finalize(obj); (that is, iothread_instance_finalize).
>
> As for iothread_complete(), it is only called in
> user_creatable_complete() as ucc->complete().
> I checked the code: when user_creatable_complete() fails, all of its
> callers call object_unref(), which reaches the qemu_xxx_destroy()
> calls, except for one &error_abort case (i.e. desugar_shm()).

I'm not familiar with iothread.c.  But like anyone capable of reading C,
I can find out stuff.

iothread_instance_finalize() guards its cleanups.  In particular, it
cleans up ->init_done_cond and ->init_done_lock only when ->thread_id !=
-1.

iothread_instance_init() initializes ->thread_id = -1.

iothread_run() sets it to the actual thread ID.

When iothread_complete() succeeds, it has waited for ->thread_id to
become != -1, in the /* Wait for initialization to complete */ loop.

When it fails, ->thread_id is still -1.

Therefore, you cannot rely on iothread_instance_finalize() for cleaning
up ->init_done_lock and ->init_done_cond on iothread_complete() failure.

I'm pretty sure you could've figured this out yourself instead of
relying on me.

[...]
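
The reasoning above can be sketched standalone as follows.  This is a
hypothetical model, not iothread.c: DemoIOThread and the demo_*()
functions are invented names, the lock/cond pair is a single boolean, and
the cleanup_on_failure parameter exists only to contrast the two
behaviours.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IOThread. */
typedef struct {
    int thread_id;     /* -1 until the thread has started running */
    bool sync_ready;   /* models init_done_lock and init_done_cond */
} DemoIOThread;

static void demo_instance_init(DemoIOThread *t)
{
    assert(t != NULL);
    t->thread_id = -1;
    t->sync_ready = false;
}

/*
 * Models the complete() step.  thread_create_fails simulates thread
 * creation failing; cleanup_on_failure selects whether the failure path
 * destroys the primitives itself.
 */
static bool demo_complete(DemoIOThread *t, bool thread_create_fails,
                          bool cleanup_on_failure)
{
    assert(t != NULL);
    t->sync_ready = true;          /* mutex and cond initialized */
    if (thread_create_fails) {
        if (cleanup_on_failure) {
            t->sync_ready = false; /* destroy them right here */
        }
        return false;              /* thread_id is still -1 */
    }
    t->thread_id = 1234;           /* in reality set by the new thread */
    return true;
}

/* Guarded finalize: skips the primitives when no thread ever ran. */
static void demo_instance_finalize(DemoIOThread *t)
{
    assert(t != NULL);
    if (t->thread_id == -1) {
        return;
    }
    t->sync_ready = false;
    t->thread_id = -1;
}
```

In this model, a failed demo_complete() without its own cleanup leaves
sync_ready set even after demo_instance_finalize() has run, which is the
leak described above.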

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-14 12:36       ` Markus Armbruster
@ 2019-01-14 13:38         ` Fei Li
  2019-01-15 12:55           ` Markus Armbruster
  0 siblings, 1 reply; 74+ messages in thread
From: Fei Li @ 2019-01-14 13:38 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fei Li, Jiri Slaby, qemu-devel, shirley17fei


On 2019/1/14 at 8:36 PM, Markus Armbruster wrote:
> Fei Li <lifei1214@126.com> writes:
>
>> Just to make sure about how to do the cleanup. I notice that in device_set_realized(),
>> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails.
Sorry that I am still uncertain.. I guess the code below I pasted was 
misleading,
actually I want to stress the *dc->unrealize() is not called when 
dc->realize() fails*
and the incomplete below "goto fail" does not include the dc->unrealize(),
but instead the dc->unrealize() is included in later child_realize_fail: 
& post_realize_fail:.


IMHO, when dc->realize() fails, dc->unrealize() should either be
called in the common function device_set_realized() in a unified way,
that is
         if (local_err != NULL) {
+          if (dc->unrealize) {
+              dc->unrealize(dev, local_err);
+          }
             goto fail;
         }

or each device should do the unrealize() locally, earlier, when its
dc->realize() fails.
But I checked several dc->realize() functions, and they do not call
unrealize() on failure. Besides, doing the unrealize() locally may make
the code verbose. Thus I think the above way is the right way to do the
cleanup when realize() fails.
>>
>>          if (dc->realize) {
>>              dc->realize(dev, &local_err);
>>          }
>>
>>          if (local_err != NULL) {
>>              goto fail;
>>          }
>>
>> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup
>> when fails? Sorry for the unsure, it is such a common function that I did not
>> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices.
> Yes, this is on purpose.
>
> When a realize() method fails, it must revert everything it has done so
> far.  Results in sane "either succeed completely, or fail and do
> nothing" semantics.

Have a nice day, thanks

Fei


* Re: [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat
  2019-01-14 12:53         ` Markus Armbruster
@ 2019-01-14 13:52           ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-14 13:52 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-devel, Stefan Hajnoczi, shirley17fei


On 2019/1/14 at 8:53 PM, Markus Armbruster wrote:
> Fei Li <lifei1214@126.com> writes:
>
>> On 2019/1/9 at 12:18 AM, fei wrote:
>>>> On 2019/1/8 at 01:50, Markus Armbruster <armbru@redhat.com> wrote:
>>>>
>>>> Fei Li <fli@suse.com> writes:
>>>>
>>>>> For iothread_complete: utilize the existed errp to propagate the
>>>>> error and do the corresponding cleanup to replace the temporary
>>>>> &error_abort.
>>>>>
>>>>> For qemu_signalfd_compat: add a local_err to hold the error, and
>>>>> return the corresponding error code to replace the temporary
>>>>> &error_abort.
>>>> I'd split the patch.
>>> Ok.
>>>>> Cc: Markus Armbruster <armbru@redhat.com>
>>>>> Cc: Eric Blake <eblake@redhat.com>
>>>>> Signed-off-by: Fei Li <fli@suse.com>
>>>>> ---
>>>>> iothread.c      | 17 +++++++++++------
>>>>> util/compatfd.c | 11 ++++++++---
>>>>> 2 files changed, 19 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/iothread.c b/iothread.c
>>>>> index 8e8aa01999..7335dacf0b 100644
>>>>> --- a/iothread.c
>>>>> +++ b/iothread.c
>>>>> @@ -164,9 +164,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>>>                                   &local_error);
>>>>>       if (local_error) {
>>>>>           error_propagate(errp, local_error);
>>>>> -        aio_context_unref(iothread->ctx);
>>>>> -        iothread->ctx = NULL;
>>>>> -        return;
>>>>> +        goto fail;
>>>>>       }
>>>>>
>>>>>       qemu_mutex_init(&iothread->init_done_lock);
>>>>> @@ -178,9 +176,12 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
>>>>>        */
>>>>>       name = object_get_canonical_path_component(OBJECT(obj));
>>>>>       thread_name = g_strdup_printf("IO %s", name);
>>>>> -    /* TODO: let the further caller handle the error instead of abort() here */
>>>>> -    qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>>>> -                       iothread, QEMU_THREAD_JOINABLE, &error_abort);
>>>>> +    if (!qemu_thread_create(&iothread->thread, thread_name, iothread_run,
>>>>> +                            iothread, QEMU_THREAD_JOINABLE, errp)) {
>>>>> +        g_free(thread_name);
>>>>> +        g_free(name);
>>>> I suspect you're missing cleanup here:
>>>>
>>>>             qemu_cond_destroy(&iothread->init_done_cond);
>>>>             qemu_mutex_destroy(&iothread->init_done_lock);
>>> I remember I checked the code, when ucc->complete() fails, there’s a finalize() function to do the destroy.
>> To be specific, the qemu_xxx_destroy() is called by
>>
>> object_unref() => object_finalize() => object_deinit() =>
>> type->instance_finalize(obj); (that is, iothread_instance_finalize).
>>
>> For the iothread_complete(), it is only called in
>> user_creatable_complete() as ucc->complete().
>> I checked the code, when callers of user_creatable_complete() fails,
>> all of them will call
>> object_unref() to call the qemu_xxx_destroy(), except one &error_abort
>> case (i.e. desugar_shm()).
> I'm not familiar with iothread.c.  But like anyone capable of reading C,
> I can find out stuff.
>
> iothread_instance_finalize() guards its cleanups.  In particular, it
> cleans up ->init_done_cond and init_done_lock only when ->thread_id !=
> -1.
Ah, sorry, I overlooked the "if (iothread->thread_id != -1)" check.
I'm embarrassed, and sorry for the trouble. You are right, and I
will add the qemu_xxx_destroy() calls in the next version. ;)


Have a nice day, thanks so much!
Fei
>
> iothread_instance_init() initializes ->thread_id = -1.
>
> iothread_run() sets it to the actual thread ID.
>
> When iothread_instance_complete() succeeds, it has waited for
> ->thread_id to become != -1, in the /* Wait for initialization to
> complete */ loop.
>
> When it fails, ->thread_id is still -1.
>
> Therefore, you cannot rely on iothread_instance_finalize() for cleaning
> up ->init_done_lock and ->init_done_cond on iothread_instance_complete()
> failure.
>
> I'm pretty sure you could've figured this out yourself instead of
> relying on me.
>
> [...]


* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-14 13:38         ` Fei Li
@ 2019-01-15 12:55           ` Markus Armbruster
  2019-01-16  4:43             ` Fei Li
  0 siblings, 1 reply; 74+ messages in thread
From: Markus Armbruster @ 2019-01-15 12:55 UTC (permalink / raw)
  To: Fei Li; +Cc: Jiri Slaby, qemu-devel, shirley17fei

Fei Li <lifei1214@126.com> writes:

> On 2019/1/14 at 8:36 PM, Markus Armbruster wrote:
>> Fei Li <lifei1214@126.com> writes:
>>
>>> Just to make sure about how to do the cleanup. I notice that in device_set_realized(),
>>> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails.
> Sorry that I am still uncertain.. I guess the code below I pasted was
> misleading,
> actually I want to stress the *dc->unrealize() is not called when
> dc->realize() fails*
> and the incomplete below "goto fail" does not include the dc->unrealize(),
> but instead the dc->unrealize() is included in later
> child_realize_fail: & post_realize_fail:.
>
>
> Emm, IMHO, I think when dc->realize() fails, the dc->unrealize() is
> either should be
> called in the common function: device_set_realized() in a unified way,
> that is
>
>         if (local_err != NULL) {
> +          if (dc->unrealize) {
> +              dc->unrealize(dev, local_err);
> +          }
>             goto fail;
>         }
>
> or do the unrealize() locally for each device earlier when
> dc->realize() fails.
>
> But I checked several dc->realize() function, they did not call unrealize()
> when fails. Besides, it may mean verbose code if unrealize() locally.
> Thus I think the above way is the right way to do the cleanup when
> realize() fails.

The realize() method is specified to either succeed completely or fail
completely, i.e. fail and do nothing.  The "either succeed completely or
fail completely" aspect of the specification is sane and perfectly
ordinary.

How a concrete implementation accomplishes "fail completely" is up to
the implementation.

An implementation may choose to structure its FOO_realize() and
FOO_unrealize() in a way that lets FOO_realize() call FOO_unrealize() to
clean up on failure.

An implementation may also choose to clean up differently.

This freedom of choice is by design.

Changing the specification now would involve auditing and updating all
realize() and unrealize() methods.  Not going to happen without an
extremely compelling reason.
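
As an illustration of "either succeed completely, or fail and do
nothing", here is a minimal stand-alone sketch.  It is not taken from
any real device: FooState, foo_realize(), foo_unrealize() and the
fail_at knob are hypothetical, and the three "resources" are
placeholders.  It shows one of the possible implementation choices,
namely unwinding in reverse order with goto labels, so that a failed
realize leaves nothing behind for the caller to clean up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device state with three resources acquired in order. */
typedef struct {
    char *buf;
    bool timer_armed;
    bool thread_running;
} FooState;

/*
 * fail_at is a test knob for this sketch: 0 means no failure,
 * 1..3 makes the corresponding step fail.
 */
static bool foo_realize(FooState *s, int fail_at)
{
    /* Step 1: allocate a buffer. */
    s->buf = (fail_at == 1) ? NULL : malloc(64);
    if (!s->buf) {
        goto fail;
    }

    /* Step 2: arm a timer. */
    if (fail_at == 2) {
        goto free_buf;
    }
    s->timer_armed = true;

    /* Step 3: start a worker thread. */
    if (fail_at == 3) {
        goto disarm_timer;
    }
    s->thread_running = true;
    return true;

    /* Unwind in reverse order; each label undoes the steps before it. */
disarm_timer:
    s->timer_armed = false;
free_buf:
    free(s->buf);
    s->buf = NULL;
fail:
    return false;
}

/* Only ever called after a successful foo_realize(). */
static void foo_unrealize(FooState *s)
{
    s->thread_running = false;
    s->timer_armed = false;
    free(s->buf);
    s->buf = NULL;
}
```

With this structure the generic caller never needs to call
foo_unrealize() after a failed foo_realize().  An implementation could
just as well write foo_unrealize() to tolerate partially initialized
state and call it from the failure path.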

>>>
>>>          if (dc->realize) {
>>>              dc->realize(dev, &local_err);
>>>          }
>>>
>>>          if (local_err != NULL) {
>>>              goto fail;
>>>          }
>>>
>>> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup
>>> when fails? Sorry for the unsure, it is such a common function that I did not
>>> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices.
>> Yes, this is on purpose.
>>
>> When a realize() method fails, it must revert everything it has done so
>> far.  Results in sane "either succeed completely, or fail and do
>> nothing" semantics.
>
> Have a nice day, thanks
>
> Fei


* Re: [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize
  2019-01-15 12:55           ` Markus Armbruster
@ 2019-01-16  4:43             ` Fei Li
  0 siblings, 0 replies; 74+ messages in thread
From: Fei Li @ 2019-01-16  4:43 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Jiri Slaby, qemu-devel, shirley17fei


On 2019/1/15 at 8:55 PM, Markus Armbruster wrote:
> Fei Li <lifei1214@126.com> writes:
>
>> On 2019/1/14 at 8:36 PM, Markus Armbruster wrote:
>>> Fei Li <lifei1214@126.com> writes:
>>>
>>>> Just to make sure about how to do the cleanup. I notice that in device_set_realized(),
>>>> the current code does not call "dc->unrealize(dev, NULL);" when dc->realize() fails.
>> Sorry that I am still uncertain.. I guess the code below I pasted was
>> misleading,
>> actually I want to stress the *dc->unrealize() is not called when
>> dc->realize() fails*
>> and the incomplete below "goto fail" does not include the dc->unrealize(),
>> but instead the dc->unrealize() is included in later
>> child_realize_fail: & post_realize_fail:.
>>
>>
>> Emm, IMHO, I think when dc->realize() fails, the dc->unrealize() is
>> either should be
>> called in the common function: device_set_realized() in a unified way,
>> that is
>>
>>          if (local_err != NULL) {
>> +          if (dc->unrealize) {
>> +              dc->unrealize(dev, local_err);
>> +          }
>>              goto fail;
>>          }
>>
>> or do the unrealize() locally for each device earlier when
>> dc->realize() fails.
>>
>> But I checked several dc->realize() function, they did not call unrealize()
>> when fails. Besides, it may mean verbose code if unrealize() locally.
>> Thus I think the above way is the right way to do the cleanup when
>> realize() fails.
> The realize() method is specified to either succeed completely or fail
> completely, i.e. fail and do nothing.  The "either succeed completely or
> fail completely" aspect of the specification is sane and perfectly
> ordinary.
>
> How a concrete implementation accomplishes "fail completely" is up to
> the implementation.
>
> An implementation may choose to structure its FOO_realize() and
> FOO_unrealize() in a way that lets FOO_realize() call FOO_unrealize() to
> clean up on failure.
>
> An implementation may also choose to clean up differently.
>
> This freedom of choice is by design.
>
> Changing the specification now would involve auditing and updating all
> realize() and unrealize() methods.  Not going to happen without an
> extremely compelling reason.

Ok, now I see. Thanks for the detailed explanation. :)

Will add the cleanup in the next version.

Have a nice day
Fei
>>>>           if (dc->realize) {
>>>>               dc->realize(dev, &local_err);
>>>>           }
>>>>
>>>>           if (local_err != NULL) {
>>>>               goto fail;
>>>>           }
>>>>
>>>> Is this on purpose? (Maybe due to some devices' realize() do their own cleanup
>>>> when fails? Sorry for the unsure, it is such a common function that I did not
>>>> check all. :( ) Or else, I prefer to do the cleanup in a unified manner, e.g. call "dc->unrealize(dev, NULL);" which is the pci_qdev_unrealize() for pci devices.
>>> Yes, this is on purpose.
>>>
>>> When a realize() method fails, it must revert everything it has done so
>>> far.  Results in sane "either succeed completely, or fail and do
>>> nothing" semantics.
>> Have a nice day, thanks
>>
>> Fei



Thread overview: 74+ messages
2018-12-25 14:04 [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 01/16] Fix segmentation fault when qemu_signal_init fails Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 02/16] migration: fix the multifd code when receiving less channels Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 03/16] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
2019-01-07 16:50   ` Markus Armbruster
2019-01-08 15:58     ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 04/16] migration: add more error handling for postcopy_ram_enable_notify Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 05/16] migration: unify error handling for process_incoming_migration_co Fei Li
2019-01-03 11:25   ` Dr. David Alan Gilbert
2019-01-03 13:27     ` Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 06/16] qemu_thread: Make qemu_thread_create() handle errors properly Fei Li
2019-01-07 17:18   ` Markus Armbruster
2019-01-08 15:55     ` fei
2019-01-08 17:07       ` Markus Armbruster
2019-01-09 13:19         ` Fei Li
2019-01-09 14:36           ` Markus Armbruster
2019-01-09 14:42             ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 07/16] qemu_thread: supplement error handling for qemu_X_start_vcpu Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 08/16] qemu_thread: supplement error handling for qmp_dump_guest_memory Fei Li
2019-01-07 17:21   ` Markus Armbruster
2019-01-08 16:00     ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 09/16] qemu_thread: supplement error handling for pci_edu_realize Fei Li
2019-01-07 17:29   ` Markus Armbruster
2019-01-08  6:14     ` Jiri Slaby
2019-01-08  6:51       ` Peter Xu
2019-01-08  8:43         ` Markus Armbruster
2019-01-10 13:29           ` Fei Li
2019-01-11  2:49             ` Peter Xu
2019-01-11 13:19               ` Fei Li
2019-01-13 15:44     ` Fei Li
2019-01-14 12:36       ` Markus Armbruster
2019-01-14 13:38         ` Fei Li
2019-01-15 12:55           ` Markus Armbruster
2019-01-16  4:43             ` Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 10/16] qemu_thread: supplement error handling for h_resize_hpt_prepare Fei Li
2019-01-02  2:36   ` David Gibson
2019-01-02  6:44     ` 李菲
2019-01-03  3:43       ` David Gibson
2019-01-03 13:41         ` Fei Li
2019-01-04  5:21           ` David Gibson
2019-01-04  6:20             ` Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 11/16] qemu_thread: supplement error handling for emulated_realize Fei Li
2019-01-07 17:31   ` Markus Armbruster
2019-01-09 13:21     ` Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 12/16] qemu_thread: supplement error handling for iothread_complete/qemu_signalfd_compat Fei Li
2019-01-07 17:50   ` Markus Armbruster
2019-01-08 16:18     ` fei
2019-01-13 16:16       ` Fei Li
2019-01-14 12:53         ` Markus Armbruster
2019-01-14 13:52           ` Fei Li
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 13/16] qemu_thread: supplement error handling for migration Fei Li
2019-01-03 12:35   ` Dr. David Alan Gilbert
2019-01-03 12:47     ` Fei Li
2019-01-09 15:26   ` Markus Armbruster
2019-01-09 16:01     ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 14/16] qemu_thread: supplement error handling for vnc_start_worker_thread Fei Li
2019-01-07 17:54   ` Markus Armbruster
2019-01-08 16:24     ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 15/16] qemu_thread: supplement error handling for touch_all_pages Fei Li
2019-01-07 18:13   ` Markus Armbruster
2019-01-09 16:13     ` fei
2018-12-25 14:04 ` [Qemu-devel] [PATCH for-4.0 v9 16/16] qemu_thread_join: fix segmentation fault Fei Li
2019-01-07 17:55   ` Markus Armbruster
2019-01-08 16:50     ` fei
2019-01-08 17:29       ` Markus Armbruster
2019-01-09 14:01         ` Fei Li
2019-01-09 15:24           ` Markus Armbruster
2019-01-09 15:57             ` fei
2019-01-10  9:20               ` Markus Armbruster
2019-01-10 13:24                 ` Fei Li
2019-01-10 16:06                   ` Markus Armbruster
2019-01-11 14:01                     ` Fei Li
2019-01-02 13:46 ` [Qemu-devel] [PATCH for-4.0 v9 00/16] qemu_thread_create: propagate the error to callers to handle no-reply
2019-01-07 12:44   ` Fei Li
