All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues
@ 2018-11-28 10:33 Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails Fei Li
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel

These five patches are almost get the Reviewed-by and extracted from
previous "[PATCH RFC v7 0/9] qemu_thread_create: propagate errors to
callers to check."  The mentioned patch series have waited on one
multifd issue for a while and still needs a further discussion.

Thus separate(send) these five almost-done patches and hope they can
be merged for the next tag. Thanks for the review. :)


Fei Li (5):
  Fix segmentation fault when qemu_signal_init fails
  qemu_thread_join: fix segmentation fault
  migration: fix the multifd code when receiving less channels
  migration: remove unused &local_err parameter in multifd_save_cleanup
  migration: add more error handling for postcopy_ram_enable_notify

 migration/channel.c      | 11 ++++++-----
 migration/migration.c    | 14 ++++++++------
 migration/migration.h    |  2 +-
 migration/postcopy-ram.c |  1 +
 migration/ram.c          | 18 ++++++++++--------
 migration/ram.h          |  4 ++--
 migration/savevm.c       |  1 +
 util/main-loop.c         |  8 ++++----
 util/qemu-thread-posix.c |  3 +++
 util/qemu-thread-win32.c |  2 +-
 10 files changed, 37 insertions(+), 27 deletions(-)

-- 
2.13.7

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails
  2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
@ 2018-11-28 10:33 ` Fei Li
  2018-11-28 12:53   ` Markus Armbruster
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault Fei Li
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Markus Armbruster, Fam Zheng

When qemu_signal_init() fails in qemu_init_main_loop(), we return
without setting an error.  Its callers crash then when they try to
report the error with error_report_err().

To avoid such segmentation fault, add a new Error parameter to make
the call trace to propagate the err to the final caller.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 util/main-loop.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/util/main-loop.c b/util/main-loop.c
index affe0403c5..443cb4cfe8 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -71,7 +71,7 @@ static void sigfd_handler(void *opaque)
     }
 }
 
-static int qemu_signal_init(void)
+static int qemu_signal_init(Error **errp)
 {
     int sigfd;
     sigset_t set;
@@ -96,7 +96,7 @@ static int qemu_signal_init(void)
     sigdelset(&set, SIG_IPI);
     sigfd = qemu_signalfd(&set);
     if (sigfd == -1) {
-        fprintf(stderr, "failed to create signalfd\n");
+        error_setg_errno(errp, errno, "failed to create signalfd");
         return -errno;
     }
 
@@ -109,7 +109,7 @@ static int qemu_signal_init(void)
 
 #else /* _WIN32 */
 
-static int qemu_signal_init(void)
+static int qemu_signal_init(Error **errp)
 {
     return 0;
 }
@@ -148,7 +148,7 @@ int qemu_init_main_loop(Error **errp)
 
     init_clocks(qemu_timer_notify_cb);
 
-    ret = qemu_signal_init();
+    ret = qemu_signal_init(errp);
     if (ret) {
         return ret;
     }
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault
  2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails Fei Li
@ 2018-11-28 10:33 ` Fei Li
  2018-11-29 14:32   ` Philippe Mathieu-Daudé
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels Fei Li
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel

To avoid the segmentation fault in qemu_thread_join(), just directly
return when the QemuThread *thread failed to be created in either
qemu-thread-posix.c or qemu-thread-win32.c.

Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
---
 util/qemu-thread-posix.c | 3 +++
 util/qemu-thread-win32.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 865e476df5..b9ab5a4711 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -558,6 +558,9 @@ void *qemu_thread_join(QemuThread *thread)
     int err;
     void *ret;
 
+    if (!thread->thread) {
+        return NULL;
+    }
     err = pthread_join(thread->thread, &ret);
     if (err) {
         error_exit(err, __func__);
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 4a363ca675..1a27e1cf6f 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -366,7 +366,7 @@ void *qemu_thread_join(QemuThread *thread)
     HANDLE handle;
 
     data = thread->data;
-    if (data->mode == QEMU_THREAD_DETACHED) {
+    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {
         return NULL;
     }
 
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels
  2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault Fei Li
@ 2018-11-28 10:33 ` Fei Li
  2018-11-28 12:17   ` Peter Xu
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 4/5] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 5/5] migration: add more error handling for postcopy_ram_enable_notify Fei Li
  4 siblings, 1 reply; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr . David Alan Gilbert, Peter Xu

In our current code, when multifd is used during migration, if there
is an error before the destination receives all new channels, the
source keeps running, however the destination does not exit but keeps
waiting until the source is killed deliberately.

Fix this by dumping the specific error and let users decide whether
to quit from the destination side when failing to receive packet via
some channel.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 migration/channel.c   | 11 ++++++-----
 migration/migration.c |  9 +++++++--
 migration/migration.h |  2 +-
 migration/ram.c       |  7 ++++++-
 migration/ram.h       |  2 +-
 5 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 33e0e9b82f..20e4c8e2dc 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -30,6 +30,7 @@
 void migration_channel_process_incoming(QIOChannel *ioc)
 {
     MigrationState *s = migrate_get_current();
+    Error *local_err = NULL;
 
     trace_migration_set_incoming_channel(
         ioc, object_get_typename(OBJECT(ioc)));
@@ -38,13 +39,13 @@ void migration_channel_process_incoming(QIOChannel *ioc)
         *s->parameters.tls_creds &&
         !object_dynamic_cast(OBJECT(ioc),
                              TYPE_QIO_CHANNEL_TLS)) {
-        Error *local_err = NULL;
         migration_tls_channel_process_incoming(s, ioc, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-        }
     } else {
-        migration_ioc_process_incoming(ioc);
+        migration_ioc_process_incoming(ioc, &local_err);
+    }
+
+    if (local_err) {
+        error_report_err(local_err);
     }
 }
 
diff --git a/migration/migration.c b/migration/migration.c
index 49ffb9997a..72106bddf0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -541,7 +541,7 @@ void migration_fd_process_incoming(QEMUFile *f)
     migration_incoming_process();
 }
 
-void migration_ioc_process_incoming(QIOChannel *ioc)
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     bool start_migration;
@@ -563,9 +563,14 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
          */
         start_migration = !migrate_use_multifd();
     } else {
+        Error *local_err = NULL;
         /* Multiple connections */
         assert(migrate_use_multifd());
-        start_migration = multifd_recv_new_channel(ioc);
+        start_migration = multifd_recv_new_channel(ioc, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            return;
+        }
     }
 
     if (start_migration) {
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d8b6..02b7304610 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -229,7 +229,7 @@ struct MigrationState
 void migrate_set_state(int *state, int old_state, int new_state);
 
 void migration_fd_process_incoming(QEMUFile *f);
-void migration_ioc_process_incoming(QIOChannel *ioc);
+void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
 void migration_incoming_process(void);
 
 bool  migration_has_all_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec4d8..e13b9349d0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1323,7 +1323,7 @@ bool multifd_recv_all_channels_created(void)
 }
 
 /* Return true if multifd is ready for the migration, otherwise false */
-bool multifd_recv_new_channel(QIOChannel *ioc)
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp)
 {
     MultiFDRecvParams *p;
     Error *local_err = NULL;
@@ -1331,6 +1331,10 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
 
     id = multifd_recv_initial_packet(ioc, &local_err);
     if (id < 0) {
+        error_propagate_prepend(errp, local_err,
+                                "failed to receive packet"
+                                " via multifd channel %d: ",
+                                multifd_recv_state->count);
         multifd_recv_terminate_threads(local_err);
         return false;
     }
@@ -1340,6 +1344,7 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
         error_setg(&local_err, "multifd: received id '%d' already setup'",
                    id);
         multifd_recv_terminate_threads(local_err);
+        error_propagate(errp, local_err);
         return false;
     }
     p->c = ioc;
diff --git a/migration/ram.h b/migration/ram.h
index 83ff1bc11a..046d3074be 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -47,7 +47,7 @@ int multifd_save_cleanup(Error **errp);
 int multifd_load_setup(void);
 int multifd_load_cleanup(Error **errp);
 bool multifd_recv_all_channels_created(void);
-bool multifd_recv_new_channel(QIOChannel *ioc);
+bool multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 
 uint64_t ram_pagesize_summary(void);
 int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len);
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH RFC 4/5] migration: remove unused &local_err parameter in multifd_save_cleanup
  2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
                   ` (2 preceding siblings ...)
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels Fei Li
@ 2018-11-28 10:33 ` Fei Li
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 5/5] migration: add more error handling for postcopy_ram_enable_notify Fei Li
  4 siblings, 0 replies; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel

Always call migrate_set_error() to set the error state without relying
on whether multifd_save_cleanup() succeeds.  As the passed &local_err
is never used in multifd_save_cleanup(), remove it. And make the
function be: void multifd_save_cleanup(void).

Signed-off-by: Fei Li <fli@suse.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c |  5 +----
 migration/ram.c       | 11 ++++-------
 migration/ram.h       |  2 +-
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 72106bddf0..0537fc0c26 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1386,7 +1386,6 @@ static void migrate_fd_cleanup(void *opaque)
     qemu_savevm_state_cleanup();
 
     if (s->to_dst_file) {
-        Error *local_err = NULL;
         QEMUFile *tmp;
 
         trace_migrate_fd_cleanup();
@@ -1397,9 +1396,7 @@ static void migrate_fd_cleanup(void *opaque)
         }
         qemu_mutex_lock_iothread();
 
-        if (multifd_save_cleanup(&local_err) != 0) {
-            error_report_err(local_err);
-        }
+        multifd_save_cleanup();
         qemu_mutex_lock(&s->qemu_file_lock);
         tmp = s->to_dst_file;
         s->to_dst_file = NULL;
diff --git a/migration/ram.c b/migration/ram.c
index e13b9349d0..c44cb6f56d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -917,13 +917,12 @@ static void multifd_send_terminate_threads(Error *err)
     }
 }
 
-int multifd_save_cleanup(Error **errp)
+void multifd_save_cleanup(void)
 {
     int i;
-    int ret = 0;
 
     if (!migrate_use_multifd()) {
-        return 0;
+        return;
     }
     multifd_send_terminate_threads(NULL);
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -953,7 +952,6 @@ int multifd_save_cleanup(Error **errp)
     multifd_send_state->pages = NULL;
     g_free(multifd_send_state);
     multifd_send_state = NULL;
-    return ret;
 }
 
 static void multifd_send_sync_main(void)
@@ -1071,9 +1069,8 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
     Error *local_err = NULL;
 
     if (qio_task_propagate_error(task, &local_err)) {
-        if (multifd_save_cleanup(&local_err) != 0) {
-            migrate_set_error(migrate_get_current(), local_err);
-        }
+        migrate_set_error(migrate_get_current(), local_err);
+        multifd_save_cleanup();
     } else {
         p->c = QIO_CHANNEL(sioc);
         qio_channel_set_delay(p->c, false);
diff --git a/migration/ram.h b/migration/ram.h
index 046d3074be..936177b3e9 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -43,7 +43,7 @@ uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
 
 int multifd_save_setup(void);
-int multifd_save_cleanup(Error **errp);
+void multifd_save_cleanup(void);
 int multifd_load_setup(void);
 int multifd_load_cleanup(Error **errp);
 bool multifd_recv_all_channels_created(void);
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH RFC 5/5] migration: add more error handling for postcopy_ram_enable_notify
  2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
                   ` (3 preceding siblings ...)
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 4/5] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
@ 2018-11-28 10:33 ` Fei Li
  4 siblings, 0 replies; 13+ messages in thread
From: Fei Li @ 2018-11-28 10:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr . David Alan Gilbert

Call postcopy_ram_incoming_cleanup() to do the cleanup when
postcopy_ram_enable_notify fails. Besides, report the error
message when qemu_ram_foreach_migratable_block() fails.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Fei Li <fli@suse.com>
---
 migration/postcopy-ram.c | 1 +
 migration/savevm.c       | 1 +
 2 files changed, 2 insertions(+)

diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index e5c02a32c5..fa09dba534 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -1117,6 +1117,7 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
 
     /* Mark so that we get notified of accesses to unwritten areas */
     if (qemu_ram_foreach_migratable_block(ram_block_enable_notify, mis)) {
+        error_report("ram_block_enable_notify failed");
         return -1;
     }
 
diff --git a/migration/savevm.c b/migration/savevm.c
index ef707b8c43..0d76232931 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1728,6 +1728,7 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
      */
     if (migrate_postcopy_ram()) {
         if (postcopy_ram_enable_notify(mis)) {
+            postcopy_ram_incoming_cleanup(mis);
             return -1;
         }
     }
-- 
2.13.7

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels Fei Li
@ 2018-11-28 12:17   ` Peter Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2018-11-28 12:17 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, Dr . David Alan Gilbert

On Wed, Nov 28, 2018 at 06:33:06PM +0800, Fei Li wrote:
> In our current code, when multifd is used during migration, if there
> is an error before the destination receives all new channels, the
> source keeps running, however the destination does not exit but keeps
> waiting until the source is killed deliberately.
> 
> Fix this by dumping the specific error and let users decide whether
> to quit from the destination side when failing to receive packet via
> some channel.
> 
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails Fei Li
@ 2018-11-28 12:53   ` Markus Armbruster
  2018-11-29  6:16     ` Fei Li
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2018-11-28 12:53 UTC (permalink / raw)
  To: Fei Li; +Cc: qemu-devel, Fam Zheng

Fei Li <fli@suse.com> writes:

> When qemu_signal_init() fails in qemu_init_main_loop(), we return
> without setting an error.  Its callers crash then when they try to
> report the error with error_report_err().

Yes, that's a bug.  Broken in 2f78e491d7b, v2.2.0.  Has escaped notice
since qemu_signalfd() is quite unlikely to fail.  Could go into 3.1 as a
bug fix, but I think punting it to the next release is just fine.

> To avoid such segmentation fault, add a new Error parameter to make
> the call trace to propagate the err to the final caller.
>

Let's add:

  Fixes: 2f78e491d7b46542158ce0b8132ee4e05bc0ade4

> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Fam Zheng <famz@redhat.com>
> Signed-off-by: Fei Li <fli@suse.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> ---
>  util/main-loop.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/util/main-loop.c b/util/main-loop.c
> index affe0403c5..443cb4cfe8 100644
> --- a/util/main-loop.c
> +++ b/util/main-loop.c
> @@ -71,7 +71,7 @@ static void sigfd_handler(void *opaque)
>      }
>  }
>  
> -static int qemu_signal_init(void)
> +static int qemu_signal_init(Error **errp)
>  {
>      int sigfd;
>      sigset_t set;
> @@ -96,7 +96,7 @@ static int qemu_signal_init(void)
>      sigdelset(&set, SIG_IPI);
>      sigfd = qemu_signalfd(&set);
>      if (sigfd == -1) {
> -        fprintf(stderr, "failed to create signalfd\n");
> +        error_setg_errno(errp, errno, "failed to create signalfd");
>          return -errno;
>      }
>  
> @@ -109,7 +109,7 @@ static int qemu_signal_init(void)
>  
>  #else /* _WIN32 */
>  
> -static int qemu_signal_init(void)
> +static int qemu_signal_init(Error **errp)
>  {
>      return 0;
>  }
> @@ -148,7 +148,7 @@ int qemu_init_main_loop(Error **errp)
>  
>      init_clocks(qemu_timer_notify_cb);
>  
> -    ret = qemu_signal_init();
> +    ret = qemu_signal_init(errp);
>      if (ret) {
>          return ret;
>      }

Reviewed-by: Markus Armbruster <armbru@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails
  2018-11-28 12:53   ` Markus Armbruster
@ 2018-11-29  6:16     ` Fei Li
  2018-11-29  8:35       ` Markus Armbruster
  0 siblings, 1 reply; 13+ messages in thread
From: Fei Li @ 2018-11-29  6:16 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fam Zheng, qemu-devel



On 11/28/2018 08:53 PM, Markus Armbruster wrote:
> Fei Li <fli@suse.com> writes:
>
>> When qemu_signal_init() fails in qemu_init_main_loop(), we return
>> without setting an error.  Its callers crash then when they try to
>> report the error with error_report_err().
> Yes, that's a bug.  Broken in 2f78e491d7b, v2.2.0.  Has escaped notice
> since qemu_signalfd() is quite unlikely to fail.  Could go into 3.1 as a
> bug fix, but I think punting it to the next release is just fine.
Thanks. :) BTW, should I send the next version only includes patch 1/5
and 2/5 separately so that you can merge? (I guess Dave will help to
merge the other three migration related patches)
>
>> To avoid such segmentation fault, add a new Error parameter to make
>> the call trace to propagate the err to the final caller.
>>
> Let's add:
>
>    Fixes: 2f78e491d7b46542158ce0b8132ee4e05bc0ade4
ok.
>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Fam Zheng <famz@redhat.com>
>> Signed-off-by: Fei Li <fli@suse.com>
>> Reviewed-by: Fam Zheng <famz@redhat.com>
>> ---
>>   util/main-loop.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/util/main-loop.c b/util/main-loop.c
>> index affe0403c5..443cb4cfe8 100644
>> --- a/util/main-loop.c
>> +++ b/util/main-loop.c
>> @@ -71,7 +71,7 @@ static void sigfd_handler(void *opaque)
>>       }
>>   }
>>   
>> -static int qemu_signal_init(void)
>> +static int qemu_signal_init(Error **errp)
>>   {
>>       int sigfd;
>>       sigset_t set;
>> @@ -96,7 +96,7 @@ static int qemu_signal_init(void)
>>       sigdelset(&set, SIG_IPI);
>>       sigfd = qemu_signalfd(&set);
>>       if (sigfd == -1) {
>> -        fprintf(stderr, "failed to create signalfd\n");
>> +        error_setg_errno(errp, errno, "failed to create signalfd");
>>           return -errno;
>>       }
>>   
>> @@ -109,7 +109,7 @@ static int qemu_signal_init(void)
>>   
>>   #else /* _WIN32 */
>>   
>> -static int qemu_signal_init(void)
>> +static int qemu_signal_init(Error **errp)
>>   {
>>       return 0;
>>   }
>> @@ -148,7 +148,7 @@ int qemu_init_main_loop(Error **errp)
>>   
>>       init_clocks(qemu_timer_notify_cb);
>>   
>> -    ret = qemu_signal_init();
>> +    ret = qemu_signal_init(errp);
>>       if (ret) {
>>           return ret;
>>       }
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
Thanks again for the review.

Have a nice day
Fei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails
  2018-11-29  6:16     ` Fei Li
@ 2018-11-29  8:35       ` Markus Armbruster
  2018-11-29  9:15         ` Fei Li
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2018-11-29  8:35 UTC (permalink / raw)
  To: Fei Li; +Cc: Fam Zheng, qemu-devel

Fei Li <fli@suse.com> writes:

> On 11/28/2018 08:53 PM, Markus Armbruster wrote:
>> Fei Li <fli@suse.com> writes:
>>
>>> When qemu_signal_init() fails in qemu_init_main_loop(), we return
>>> without setting an error.  Its callers crash then when they try to
>>> report the error with error_report_err().
>> Yes, that's a bug.  Broken in 2f78e491d7b, v2.2.0.  Has escaped notice
>> since qemu_signalfd() is quite unlikely to fail.  Could go into 3.1 as a
>> bug fix, but I think punting it to the next release is just fine.
> Thanks. :) BTW, should I send the next version only includes patch 1/5
> and 2/5 separately so that you can merge? (I guess Dave will help to
> merge the other three migration related patches)

I can pick patches out of a series for merging, and I trust Dave can,
too.  But keeping unrelated fixes separate is a good idea.  I can see
three groups: PATCH 1 (main loop), PATCH 2 (thread abstraction), PATCH
3-5 (migration).

[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails
  2018-11-29  8:35       ` Markus Armbruster
@ 2018-11-29  9:15         ` Fei Li
  0 siblings, 0 replies; 13+ messages in thread
From: Fei Li @ 2018-11-29  9:15 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Fam Zheng, qemu-devel



On 11/29/2018 04:35 PM, Markus Armbruster wrote:
> Fei Li <fli@suse.com> writes:
>
>> On 11/28/2018 08:53 PM, Markus Armbruster wrote:
>>> Fei Li <fli@suse.com> writes:
>>>
>>>> When qemu_signal_init() fails in qemu_init_main_loop(), we return
>>>> without setting an error.  Its callers crash then when they try to
>>>> report the error with error_report_err().
>>> Yes, that's a bug.  Broken in 2f78e491d7b, v2.2.0.  Has escaped notice
>>> since qemu_signalfd() is quite unlikely to fail.  Could go into 3.1 as a
>>> bug fix, but I think punting it to the next release is just fine.
>> Thanks. :) BTW, should I send the next version only includes patch 1/5
>> and 2/5 separately so that you can merge? (I guess Dave will help to
>> merge the other three migration related patches)
> I can pick patches out of a series for merging, and I trust Dave can,
> too.  But keeping unrelated fixes separate is a good idea.  I can see
> three groups: PATCH 1 (main loop), PATCH 2 (thread abstraction), PATCH
> 3-5 (migration).
>
> [...]
Ok, thanks for the detail explanation. I once thought the first two patches
are belong to one group, that's why I tried to separate them.
But considering too many groups they are belong to and maintainers are
willing to pick, I'd like to keep sending them within one patch series in
the next version. :)

Have a nice day, thanks a lot
Fei

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault
  2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault Fei Li
@ 2018-11-29 14:32   ` Philippe Mathieu-Daudé
  2018-11-30 11:38     ` Fei Li
  0 siblings, 1 reply; 13+ messages in thread
From: Philippe Mathieu-Daudé @ 2018-11-29 14:32 UTC (permalink / raw)
  To: Fei Li, qemu-devel

Hi Fei,

On 28/11/18 11:33, Fei Li wrote:
> To avoid the segmentation fault in qemu_thread_join(), just directly
> return when the QemuThread *thread failed to be created in either
> qemu-thread-posix.c or qemu-thread-win32.c.
> 
> Signed-off-by: Fei Li <fli@suse.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> ---
>  util/qemu-thread-posix.c | 3 +++
>  util/qemu-thread-win32.c | 2 +-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index 865e476df5..b9ab5a4711 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -558,6 +558,9 @@ void *qemu_thread_join(QemuThread *thread)
>      int err;
>      void *ret;
>  
> +    if (!thread->thread) {
> +        return NULL;
> +    }
>      err = pthread_join(thread->thread, &ret);
>      if (err) {
>          error_exit(err, __func__);
> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
> index 4a363ca675..1a27e1cf6f 100644
> --- a/util/qemu-thread-win32.c
> +++ b/util/qemu-thread-win32.c
> @@ -366,7 +366,7 @@ void *qemu_thread_join(QemuThread *thread)
>      HANDLE handle;
>  
>      data = thread->data;
> -    if (data->mode == QEMU_THREAD_DETACHED) {
> +    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {

Isn't it a way to hide a problem from the caller?

qemu_thread_create() always initializes thread->data.

If the thread doesn't exist you shouldn't try to join() it.

If you try to fix a bug, I'd use here:

       assert(thread->data);

Regards,

Phil.

>          return NULL;
>      }
>  
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault
  2018-11-29 14:32   ` Philippe Mathieu-Daudé
@ 2018-11-30 11:38     ` Fei Li
  0 siblings, 0 replies; 13+ messages in thread
From: Fei Li @ 2018-11-30 11:38 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel



On 11/29/2018 10:32 PM, Philippe Mathieu-Daudé wrote:
> Hi Fei,
>
> On 28/11/18 11:33, Fei Li wrote:
>> To avoid the segmentation fault in qemu_thread_join(), just directly
>> return when the QemuThread *thread failed to be created in either
>> qemu-thread-posix.c or qemu-thread-win32.c.
>>
>> Signed-off-by: Fei Li <fli@suse.com>
>> Reviewed-by: Fam Zheng <famz@redhat.com>
>> ---
>>   util/qemu-thread-posix.c | 3 +++
>>   util/qemu-thread-win32.c | 2 +-
>>   2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
>> index 865e476df5..b9ab5a4711 100644
>> --- a/util/qemu-thread-posix.c
>> +++ b/util/qemu-thread-posix.c
>> @@ -558,6 +558,9 @@ void *qemu_thread_join(QemuThread *thread)
>>       int err;
>>       void *ret;
>>   
>> +    if (!thread->thread) {
>> +        return NULL;
>> +    }
>>       err = pthread_join(thread->thread, &ret);
>>       if (err) {
>>           error_exit(err, __func__);
>> diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
>> index 4a363ca675..1a27e1cf6f 100644
>> --- a/util/qemu-thread-win32.c
>> +++ b/util/qemu-thread-win32.c
>> @@ -366,7 +366,7 @@ void *qemu_thread_join(QemuThread *thread)
>>       HANDLE handle;
>>   
>>       data = thread->data;
>> -    if (data->mode == QEMU_THREAD_DETACHED) {
>> +    if (data == NULL || data->mode == QEMU_THREAD_DETACHED) {
> Isn't it a way to hide a problem from the caller?
>
> qemu_thread_create() always initializes thread->data.
>
> If the thread doesn't exist you shouldn't try to join() it.
>
> If you try to fix a bug, I'd use here:
>
>         assert(thread->data);
>
> Regards,
>
> Phil.
Emm, this patch is actually should not be abstracted separately...
It should be along with the "[PATCH RFC v7 9/9] qemu_thread_create:
propagate errors to callers to check."

For the 9/9 patch, when qemu_thread_create() fails, we would like
the callers to handle the error instead of directly abort in order to robust
our qemu code (e.g. do not let qemu crash when fails to hot-plug a cpu).
And only within that patch, we need to check whether the thread has been
created: if not just return NULL, which means no handling for that thread.
This is because for many cleanups, we just use a loop to join() all the
thread_count. e.g. when compress_threads_save_setup() fails while
trying to create the second/third do_data_compress thread. (Supposing
there are more than three compress threads).

I will remove this patch in next version. Thanks for pointing this out. :)

Have a nice day
Fei
>
>>           return NULL;
>>       }
>>   
>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-11-30 11:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-28 10:33 [Qemu-devel] [PATCH RFC 0/5] fix some segmentation faults and migration issues Fei Li
2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 1/5] Fix segmentation fault when qemu_signal_init fails Fei Li
2018-11-28 12:53   ` Markus Armbruster
2018-11-29  6:16     ` Fei Li
2018-11-29  8:35       ` Markus Armbruster
2018-11-29  9:15         ` Fei Li
2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 2/5] qemu_thread_join: fix segmentation fault Fei Li
2018-11-29 14:32   ` Philippe Mathieu-Daudé
2018-11-30 11:38     ` Fei Li
2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 3/5] migration: fix the multifd code when receiving less channels Fei Li
2018-11-28 12:17   ` Peter Xu
2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 4/5] migration: remove unused &local_err parameter in multifd_save_cleanup Fei Li
2018-11-28 10:33 ` [Qemu-devel] [PATCH RFC 5/5] migration: add more error handling for postcopy_ram_enable_notify Fei Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.