* [Qemu-devel] [PATCH] migration: fix migration shutdown
@ 2019-04-03 8:38 Yury Kotov
2019-04-03 19:05 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 5+ messages in thread
From: Yury Kotov @ 2019-04-03 8:38 UTC (permalink / raw)
To: Dr. David Alan Gilbert, Juan Quintela
Cc: Alex Bennée, Evgeny Yakovlev, qemu-devel
Fix a heap-use-after-free found by clang's ASan.
Control flow of the use-after-free:
main_thread:
* Got SIGTERM and completes main loop
* Calls migration_shutdown
- migrate_fd_cancel (so, migration_thread begins to complete)
- object_unref(OBJECT(current_migration));
migration_thread:
* migration_iteration_finish -> schedule cleanup bh
* object_unref(OBJECT(s)); (Now, current_migration is freed)
* exits
main_thread:
* Calls vm_shutdown -> drain bdrvs -> main loop
-> cleanup_bh -> use after free
If you want to reproduce, these couple of sleeps will help:
vl.c:4613:
migration_shutdown();
+ sleep(2);
migration.c:3269:
+ sleep(1);
trace_migration_thread_after_loop();
migration_iteration_finish(s);
Original output:
qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>)
=================================================================
==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210
at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188
READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0)
#0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23
#1 0x5555594fde0a in aio_bh_call util/async.c:90:5
#2 0x5555594fe522 in aio_bh_poll util/async.c:118:13
#3 0x555559524783 in aio_poll util/aio-posix.c:725:17
#4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5
#5 0x5555573bddf6 in virtio_blk_data_plane_stop
hw/block/dataplane/virtio-blk.c:282:5
#6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9
#7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5
#8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9
#9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9
#10 0x555557c36713 in vm_state_notify vl.c:1605:9
#11 0x55555716ef53 in do_vm_stop cpus.c:1074:9
#12 0x55555716eeff in vm_shutdown cpus.c:1092:12
#13 0x555557c4283e in main vl.c:4617:5
#14 0x7fffdfdb482f in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
#15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118)
0x61900001d210 is located 144 bytes inside of 952-byte region
[0x61900001d180,0x61900001d538)
freed by thread T6 (live_migration) here:
#0 0x555556f76782 in __interceptor_free
/tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
#1 0x555558d5fa94 in object_finalize qom/object.c:618:9
#2 0x555558d57651 in object_unref qom/object.c:1068:9
#3 0x555558a55588 in migration_thread migration/migration.c:3272:5
#4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9
#5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
previously allocated by thread T0 (qemu-vm-0) here:
#0 0x555556f76b03 in __interceptor_malloc
/tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3
#1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8)
#2 0x555558d58031 in object_new qom/object.c:640:12
#3 0x555558a31f21 in migration_object_init migration/migration.c:139:25
#4 0x555557c41398 in main vl.c:4320:5
#5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
Thread T6 (live_migration) created by T0 (qemu-vm-0) here:
#0 0x555556f5f0dd in pthread_create
/tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
#1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11
#2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5
#3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5
#4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5
#5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9
#6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5
#7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5
#8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11
#9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11
#10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9
#11 0x5555594fde0a in aio_bh_call util/async.c:90:5
#12 0x5555594fe522 in aio_bh_poll util/async.c:118:13
#13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5
#14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5
#15 0x7ffff6ede196 in g_main_context_dispatch
(/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196)
SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23
in migrate_fd_cleanup
Shadow bytes around the buggy address:
0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==31958==ABORTING
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
migration/migration.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 609e0df5d0..54d82bb88b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
int *current_active_state,
int new_state);
static void migrate_fd_cancel(MigrationState *s);
+static void migrate_join_thread(MigrationState *s);
+static void migrate_fd_cleanup(void *opaque);
void migration_object_init(void)
{
@@ -176,6 +178,17 @@ void migration_shutdown(void)
* stop the migration using this structure
*/
migrate_fd_cancel(current_migration);
+ /*
+ * migration_thread schedules cleanup_bh, but migration_shutdown is called
+ * at the end of main, so cleanup_bh may never run because the main loop
+ * has already completed. So, wait for migration_thread to finish, cancel
+ * the bh and call the cleanup ourselves.
+ */
+ migrate_join_thread(current_migration);
+ if (current_migration->cleanup_bh) {
+ qemu_bh_cancel(current_migration->cleanup_bh);
+ migrate_fd_cleanup(current_migration);
+ }
object_unref(OBJECT(current_migration));
}
@@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
}
}
+static void migrate_join_thread(MigrationState *s)
+{
+ qemu_mutex_unlock_iothread();
+ if (s->migration_thread_running) {
+ qemu_thread_join(&s->thread);
+ s->migration_thread_running = false;
+ }
+ qemu_mutex_lock_iothread();
+}
+
static void migrate_fd_cleanup(void *opaque)
{
MigrationState *s = opaque;
@@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
QEMUFile *tmp;
trace_migrate_fd_cleanup();
- qemu_mutex_unlock_iothread();
- if (s->migration_thread_running) {
- qemu_thread_join(&s->thread);
- s->migration_thread_running = false;
- }
- qemu_mutex_lock_iothread();
+ migrate_join_thread(s);
multifd_save_cleanup();
qemu_mutex_lock(&s->qemu_file_lock);
--
2.21.0
* Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
2019-04-03 8:38 [Qemu-devel] [PATCH] migration: fix migration shutdown Yury Kotov
@ 2019-04-03 19:05 ` Dr. David Alan Gilbert
2019-04-04 8:12 ` Yury Kotov
From: Dr. David Alan Gilbert @ 2019-04-03 19:05 UTC (permalink / raw)
To: Yury Kotov; +Cc: Juan Quintela, Alex Bennée, Evgeny Yakovlev, qemu-devel
* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> It fixes heap-use-after-free which was found by clang's ASAN.
>
> Control flow of this use-after-free:
> main_thread:
> * Got SIGTERM and completes main loop
> * Calls migration_shutdown
> - migrate_fd_cancel (so, migration_thread begins to complete)
> - object_unref(OBJECT(current_migration));
>
> migration_thread:
> * migration_iteration_finish -> schedule cleanup bh
> * object_unref(OBJECT(s)); (Now, current_migration is freed)
> * exits
>
> main_thread:
> * Calls vm_shutdown -> drain bdrvs -> main loop
> -> cleanup_bh -> use after free
>
> If you want to reproduce, these couple of sleeps will help:
> vl.c:4613:
> migration_shutdown();
> + sleep(2);
> migration.c:3269:
> + sleep(1);
> trace_migration_thread_after_loop();
> migration_iteration_finish(s);
Fun; I hadn't realised you could get more main loop after exiting
the main loop.
Is this fallout from my 892ae715b6b or is this just another case it
didn't catch?
I guess it moved the shutdown early, so it probably is fallout, and
so probably does need to go to 4.0 ?
> Original output:
> [...]
>
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> ---
> migration/migration.c | 30 ++++++++++++++++++++++++------
> 1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 609e0df5d0..54d82bb88b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
> int *current_active_state,
> int new_state);
> static void migrate_fd_cancel(MigrationState *s);
> +static void migrate_join_thread(MigrationState *s);
> +static void migrate_fd_cleanup(void *opaque);
>
> void migration_object_init(void)
> {
> @@ -176,6 +178,17 @@ void migration_shutdown(void)
> * stop the migration using this structure
> */
> migrate_fd_cancel(current_migration);
> + /*
> + * migration_thread schedules cleanup_bh, but migration_shutdown is called
> + * from the end of main, so cleanu_up may not be called because main-loop is
> + * complete. So, wait for the complition of migration_thread, cancel bh and
> + * call cleanup ourselves.
> + */
(Some typos)
> + migrate_join_thread(current_migration);
I'll admit to being a bit nervous that something in here could be
hanging and thus we could hang here rather than die as SIGTERM is
expecting us to.
> + if (current_migration->cleanup_bh) {
> + qemu_bh_cancel(current_migration->cleanup_bh);
Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
> + migrate_fd_cleanup(current_migration);
> + }
> object_unref(OBJECT(current_migration));
> }
>
> @@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
> }
> }
>
> +static void migrate_join_thread(MigrationState *s)
> +{
> + qemu_mutex_unlock_iothread();
> + if (s->migration_thread_running) {
> + qemu_thread_join(&s->thread);
> + s->migration_thread_running = false;
> + }
> + qemu_mutex_lock_iothread();
> +}
> +
> static void migrate_fd_cleanup(void *opaque)
> {
> MigrationState *s = opaque;
> @@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
> QEMUFile *tmp;
>
> trace_migrate_fd_cleanup();
> - qemu_mutex_unlock_iothread();
> - if (s->migration_thread_running) {
> - qemu_thread_join(&s->thread);
> - s->migration_thread_running = false;
> - }
> - qemu_mutex_lock_iothread();
> + migrate_join_thread(s);
>
> multifd_save_cleanup();
> qemu_mutex_lock(&s->qemu_file_lock);
Dave
> --
> 2.21.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
2019-04-03 19:05 ` Dr. David Alan Gilbert
@ 2019-04-04 8:12 ` Yury Kotov
2019-04-05 9:07 ` Dr. David Alan Gilbert
From: Yury Kotov @ 2019-04-04 8:12 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Alex Bennée, qemu-devel, Evgeny Yakovlev, Juan Quintela
03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> It fixes heap-use-after-free which was found by clang's ASAN.
>>
>> Control flow of this use-after-free:
>> main_thread:
>> * Got SIGTERM and completes main loop
>> * Calls migration_shutdown
>> - migrate_fd_cancel (so, migration_thread begins to complete)
>> - object_unref(OBJECT(current_migration));
>>
>> migration_thread:
>> * migration_iteration_finish -> schedule cleanup bh
>> * object_unref(OBJECT(s)); (Now, current_migration is freed)
>> * exits
>>
>> main_thread:
>> * Calls vm_shutdown -> drain bdrvs -> main loop
>> -> cleanup_bh -> use after free
>>
>> If you want to reproduce, these couple of sleeps will help:
>> vl.c:4613:
>> migration_shutdown();
>> + sleep(2);
>> migration.c:3269:
>> + sleep(1);
>> trace_migration_thread_after_loop();
>> migration_iteration_finish(s);
>
> Fun; I hadn't realised you could get more main loop after exiting
> the main loop.
>
> Is this fallout from my 892ae715b6b or is this just another case it
> didn't catch?
> I guess it moved the shutdown early, so it probably is fallout, and
> so probably does need to go to 4.0 ?
>
I think this is closer to another case than fallout, because without
892ae715b6b, reproducing the UAF at exit was much easier.
Another way to fix it would be to remove the unref from migration_shutdown
(but keep migrate_fd_cancel) and restore migration_object_finalize.
>> Original output:
>> [...]
>>
>> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>> ---
>> migration/migration.c | 30 ++++++++++++++++++++++++------
>> 1 file changed, 24 insertions(+), 6 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 609e0df5d0..54d82bb88b 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
>> int *current_active_state,
>> int new_state);
>> static void migrate_fd_cancel(MigrationState *s);
>> +static void migrate_join_thread(MigrationState *s);
>> +static void migrate_fd_cleanup(void *opaque);
>>
>> void migration_object_init(void)
>> {
>> @@ -176,6 +178,17 @@ void migration_shutdown(void)
>> * stop the migration using this structure
>> */
>> migrate_fd_cancel(current_migration);
>> + /*
>> + * migration_thread schedules cleanup_bh, but migration_shutdown is called
>> + * from the end of main, so cleanu_up may not be called because main-loop is
>> + * complete. So, wait for the complition of migration_thread, cancel bh and
>> + * call cleanup ourselves.
>> + */
>
> (Some typos)
>
Ok :)
>> + migrate_join_thread(current_migration);
>
> I'll admit to being a bit nervous that something in here could be
> hanging and thus we could hang here rather than die as SIGTERM is
> expecting us to.
>
I think it's the only way to be sure we haven't any other races with
migration_thread. And this join is cheaper than vm_shutdown, IMHO.
>> + if (current_migration->cleanup_bh) {
>> + qemu_bh_cancel(current_migration->cleanup_bh);
>
> Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
>
Agree, it's odd.
>> + migrate_fd_cleanup(current_migration);
>> + }
>> object_unref(OBJECT(current_migration));
>> }
>>
>> @@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
>> }
>> }
>>
>> +static void migrate_join_thread(MigrationState *s)
>> +{
>> + qemu_mutex_unlock_iothread();
>> + if (s->migration_thread_running) {
>> + qemu_thread_join(&s->thread);
>> + s->migration_thread_running = false;
>> + }
>> + qemu_mutex_lock_iothread();
>> +}
>> +
>> static void migrate_fd_cleanup(void *opaque)
>> {
>> MigrationState *s = opaque;
>> @@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
>> QEMUFile *tmp;
>>
>> trace_migrate_fd_cleanup();
>> - qemu_mutex_unlock_iothread();
>> - if (s->migration_thread_running) {
>> - qemu_thread_join(&s->thread);
>> - s->migration_thread_running = false;
>> - }
>> - qemu_mutex_lock_iothread();
>> + migrate_join_thread(s);
>>
>> multifd_save_cleanup();
>> qemu_mutex_lock(&s->qemu_file_lock);
>
> Dave
>
>> --
>> 2.21.0
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
2019-04-04 8:12 ` Yury Kotov
@ 2019-04-05 9:07 ` Dr. David Alan Gilbert
2019-04-08 11:36 ` Yury Kotov
From: Dr. David Alan Gilbert @ 2019-04-05 9:07 UTC (permalink / raw)
To: Yury Kotov; +Cc: Alex Bennée, qemu-devel, Evgeny Yakovlev, Juan Quintela
* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> 03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >> It fixes heap-use-after-free which was found by clang's ASAN.
> >>
> >> Control flow of this use-after-free:
> >> main_thread:
> >> * Got SIGTERM and completes main loop
> >> * Calls migration_shutdown
> >> - migrate_fd_cancel (so, migration_thread begins to complete)
> >> - object_unref(OBJECT(current_migration));
> >>
> >> migration_thread:
> >> * migration_iteration_finish -> schedule cleanup bh
> >> * object_unref(OBJECT(s)); (Now, current_migration is freed)
> >> * exits
> >>
> >> main_thread:
> >> * Calls vm_shutdown -> drain bdrvs -> main loop
> >> -> cleanup_bh -> use after free
> >>
> >> If you want to reproduce, these couple of sleeps will help:
> >> vl.c:4613:
> >> migration_shutdown();
> >> + sleep(2);
> >> migration.c:3269:
> >> + sleep(1);
> >> trace_migration_thread_after_loop();
> >> migration_iteration_finish(s);
> >
> > Fun; I hadn't realised you could get more main loop after exiting
> > the main loop.
> >
> > Is this fallout from my 892ae715b6b or is this just another case it
> > didn't catch?
> > I guess it moved the shutdown early, so it probably is fallout, and
> > so probably does need to go to 4.0 ?
> >
> I think this is closer to another case than fallout. Because without
> 892ae715b6b, reproducing of UAF at exit was much easier.
> Another way to fix it - remove unref from migration_shutdown
> (but keep migrate_fd_cancel) and restore migration_object_finalize.
OK, so in that case I'd rather leave this to 4.1 than rush it into the
4.0 rc's - I'd rather only take regressions at this point.
Dave
> >> Original output:
> >> [...]
> >> Addressable: 00
> >> Partially addressable: 01 02 03 04 05 06 07
> >> Heap left redzone: fa
> >> Freed heap region: fd
> >> Stack left redzone: f1
> >> Stack mid redzone: f2
> >> Stack right redzone: f3
> >> Stack after return: f5
> >> Stack use after scope: f8
> >> Global redzone: f9
> >> Global init order: f6
> >> Poisoned by user: f7
> >> Container overflow: fc
> >> Array cookie: ac
> >> Intra object redzone: bb
> >> ASan internal: fe
> >> Left alloca redzone: ca
> >> Right alloca redzone: cb
> >> Shadow gap: cc
> >> ==31958==ABORTING
> >>
> >> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> >> ---
> >> migration/migration.c | 30 ++++++++++++++++++++++++------
> >> 1 file changed, 24 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 609e0df5d0..54d82bb88b 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
> >> int *current_active_state,
> >> int new_state);
> >> static void migrate_fd_cancel(MigrationState *s);
> >> +static void migrate_join_thread(MigrationState *s);
> >> +static void migrate_fd_cleanup(void *opaque);
> >>
> >> void migration_object_init(void)
> >> {
> >> @@ -176,6 +178,17 @@ void migration_shutdown(void)
> >> * stop the migration using this structure
> >> */
> >> migrate_fd_cancel(current_migration);
> >> + /*
> >> + * migration_thread schedules cleanup_bh, but migration_shutdown is called
> >> + * from the end of main, so cleanu_up may not be called because main-loop is
> >> + * complete. So, wait for the complition of migration_thread, cancel bh and
> >> + * call cleanup ourselves.
> >> + */
> >
> > (Some typos)
> >
> Ok :)
>
> >> + migrate_join_thread(current_migration);
> >
> > I'll admit to being a bit nervous that something in here could be
> > hanging and thus we could hang here rather than die as SIGTERM is
> > expecting us to.
> >
> I think it's the only way to be sure we don't have other races with migration_thread.
> And this joining is cheaper than vm_shutdown IMHO.
>
> >> + if (current_migration->cleanup_bh) {
> >> + qemu_bh_cancel(current_migration->cleanup_bh);
> >
> > Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
> >
> Agree, it's odd.
>
> >> + migrate_fd_cleanup(current_migration);
> >> + }
> >> object_unref(OBJECT(current_migration));
> >> }
> >>
> >> @@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
> >> }
> >> }
> >>
> >> +static void migrate_join_thread(MigrationState *s)
> >> +{
> >> + qemu_mutex_unlock_iothread();
> >> + if (s->migration_thread_running) {
> >> + qemu_thread_join(&s->thread);
> >> + s->migration_thread_running = false;
> >> + }
> >> + qemu_mutex_lock_iothread();
> >> +}
> >> +
> >> static void migrate_fd_cleanup(void *opaque)
> >> {
> >> MigrationState *s = opaque;
> >> @@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
> >> QEMUFile *tmp;
> >>
> >> trace_migrate_fd_cleanup();
> >> - qemu_mutex_unlock_iothread();
> >> - if (s->migration_thread_running) {
> >> - qemu_thread_join(&s->thread);
> >> - s->migration_thread_running = false;
> >> - }
> >> - qemu_mutex_lock_iothread();
> >> + migrate_join_thread(s);
> >>
> >> multifd_save_cleanup();
> >> qemu_mutex_lock(&s->qemu_file_lock);
> >
> > Dave
> >
> >> --
> >> 2.21.0
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
* Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
2019-04-05 9:07 ` Dr. David Alan Gilbert
@ 2019-04-08 11:36 ` Yury Kotov
0 siblings, 0 replies; 5+ messages in thread
From: Yury Kotov @ 2019-04-08 11:36 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Alex Bennée, qemu-devel, Evgeny Yakovlev, Juan Quintela
Hi,
I've sent another patch to fix this UAF:
"migration: Fix use-after-free during process exit"
It's simpler and fixes only the regression.
Regards,
Yury
05.04.2019, 12:07, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> 03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
>> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> >> It fixes heap-use-after-free which was found by clang's ASAN.
>> >>
>> >> Control flow of this use-after-free:
>> >> main_thread:
>> >> * Got SIGTERM and completes main loop
>> >> * Calls migration_shutdown
>> >> - migrate_fd_cancel (so, migration_thread begins to complete)
>> >> - object_unref(OBJECT(current_migration));
>> >>
>> >> migration_thread:
>> >> * migration_iteration_finish -> schedule cleanup bh
>> >> * object_unref(OBJECT(s)); (Now, current_migration is freed)
>> >> * exits
>> >>
>> >> main_thread:
>> >> * Calls vm_shutdown -> drain bdrvs -> main loop
>> >> -> cleanup_bh -> use after free
>> >>
>> >> If you want to reproduce, these couple of sleeps will help:
>> >> vl.c:4613:
>> >> migration_shutdown();
>> >> + sleep(2);
>> >> migration.c:3269:
>> >> + sleep(1);
>> >> trace_migration_thread_after_loop();
>> >> migration_iteration_finish(s);
>> >
>> > Fun; I hadn't realised you could get more main loop after exiting
>> > the main loop.
>> >
>> > Is this fallout from my 892ae715b6b or is this just another case it
>> > didn't catch?
>> > I guess it moved the shutdown early, so it probably is fallout, and
>> > so probably does need to go to 4.0 ?
>> >
>> I think this is closer to another case than fallout, because without
>> 892ae715b6b, reproducing the UAF at exit was much easier.
>> Another way to fix it would be to remove the unref from migration_shutdown
>> (but keep migrate_fd_cancel) and restore migration_object_finalize.
>
> OK, so in that case I'd rather leave this to 4.1 rather than rush it in
> 4.0rc's - I'd rather only take regressions at this point.
>
> Dave
2019-04-03 8:38 [Qemu-devel] [PATCH] migration: fix migration shutdown Yury Kotov
2019-04-03 19:05 ` Dr. David Alan Gilbert
2019-04-04 8:12 ` Yury Kotov
2019-04-05 9:07 ` Dr. David Alan Gilbert
2019-04-08 11:36 ` Yury Kotov