From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: Stefan Reiter <s.reiter@proxmox.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"jsnow@redhat.com" <jsnow@redhat.com>,
	Dietmar Maurer <dietmar@proxmox.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: backup transaction with io-thread core dumps
Date: Wed, 25 Mar 2020 15:23:02 +0300
Message-ID: <76edbead-7ccf-833b-a3f2-15dff7f0748c@virtuozzo.com>
In-Reply-To: <e1a10c51-cf7f-0724-f61a-f6a82ebd9da2@proxmox.com>

25.03.2020 14:40, Stefan Reiter wrote:
> On 24/03/2020 17:49, Dietmar Maurer wrote:
>> A more serious issue is that I also get a hang inside the poll loop
>> when the VM is under load. For example, running "stress -d 5" inside
>> the VM (Debian Buster).
>>
>> Then running a simple drive-backup like:
>>
>> { "execute": "drive-backup", "arguments": { "device": "drive-scsi0", "sync": "full", "target": "backup-scsi0.raw" } }
>>
>> At the end of the backup, VM hangs, gdb says:
>>
>> (gdb) bt
>> #0  0x00007ffff5cb3916 in __GI_ppoll (fds=0x7fff63d35c40, nfds=2, timeout=<optimized out>,
>>      timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
>> #1  0x0000555555c5f2c9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
>>      at /usr/include/x86_64-linux-gnu/bits/poll2.h:77
>> #2  0x0000555555c5f2c9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1)
>>      at util/qemu-timer.c:335
>> #3  0x0000555555c61ab1 in fdmon_poll_wait (ctx=0x7fffe8905e80, ready_list=0x7fffffffdc68, timeout=-1)
>>      at util/fdmon-poll.c:79
>> #4  0x0000555555c61097 in aio_poll (ctx=0x7fffe8905e80, blocking=blocking@entry=true) at util/aio-posix.c:589
>> #5  0x0000555555bc2243 in bdrv_do_drained_begin
>>      (poll=<optimized out>, ignore_bds_parents=false, parent=0x0, recursive=false, bs=0x7fff671c11c0) at block/io.c:429
>> #6  0x0000555555bc2243 in bdrv_do_drained_begin
>>      (bs=0x7fff671c11c0, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>)
>>      at block/io.c:395
>> #7  0x0000555555bdd132 in bdrv_backup_top_drop (bs=0x7fff671c11c0) at block/backup-top.c:273
>> #8  0x0000555555bd883c in backup_clean (job=0x7fffe5609a00) at block/backup.c:114
>> #9  0x0000555555b6d46d in job_clean (job=0x7fffe5609a00) at job.c:657
>> #10 0x0000555555b6d46d in job_finalize_single (job=0x7fffe5609a00) at job.c:673
>> #11 0x0000555555b6d46d in job_finalize_single (job=0x7fffe5609a00) at job.c:661
>> #12 0x0000555555b6dedc in job_txn_apply (txn=<optimized out>, fn=0x555555b6d420 <job_finalize_single>) at job.c:145
>> #13 0x0000555555b6dedc in job_do_finalize (job=0x7fffe5609a00) at job.c:782
>> #14 0x0000555555b6e131 in job_completed_txn_success (job=0x7fffe5609a00) at job.c:832
>> #15 0x0000555555b6e131 in job_completed (job=0x7fffe5609a00) at job.c:845
>> #16 0x0000555555b6e131 in job_completed (job=0x7fffe5609a00) at job.c:836
>> #17 0x0000555555b6e190 in job_exit (opaque=0x7fffe5609a00) at job.c:864
>> #18 0x0000555555c5d855 in aio_bh_call (bh=0x7fffe721a3c0) at util/async.c:164
>> #19 0x0000555555c5d855 in aio_bh_poll (ctx=ctx@entry=0x7fffe8905e80) at util/async.c:164
>> #20 0x0000555555c60ede in aio_dispatch (ctx=0x7fffe8905e80) at util/aio-posix.c:380
>> #21 0x0000555555c5d73e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
>>      at util/async.c:298
>> #22 0x00007ffff7c7ef2e in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
>> #23 0x0000555555c60148 in glib_pollfds_poll () at util/main-loop.c:219
>> #24 0x0000555555c60148 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
>> #25 0x0000555555c60148 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
>> #26 0x00005555558fc579 in qemu_main_loop () at /home/dietmar/pve5-devel/mirror_qemu/softmmu/vl.c:1665
>> #27 0x0000555555800c2e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>>      at /home/dietmar/pve5-devel/mirror_qemu/softmmu/main.c:49
>>
>> The VM uses a similar hardware setup with io-threads as the previous
>> example.
>>
>> Is somebody able to reproduce that? Any ideas what's wrong here?
>>
> 
> Can't speak for the original issue, but this one I can reproduce exactly as you say: main thread hangs at the end of a backup job of a drive that uses IO threads, when said drive is under load. Single disk, raw-file backed.
> 
> I've bisected the issue to commit 00e30f05de1d ("block/backup: use backup-top instead of write notifiers") - CC'ing Vladimir, since he wrote that one. Before that patch, backup works fine.
> 
> 
> For completeness, here's the trace I get on my machine (latest master) when everything's stuck, Thread 1 is the main thread, 3 the IO thread for the drive:
> 
> Thread 1 (Thread 0x7ffff5d20680 (LWP 958009)):
> #0    0x00007ffff6cf2916 in __GI_ppoll (fds=0x7ffff4a0c2c0, nfds=0x2, timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1    0x0000555555dba2dd in qemu_poll_ns (fds=0x7ffff4a0c2c0, nfds=0x2, timeout=0xffffffffffffffff) at util/qemu-timer.c:335
> #2    0x0000555555dbdb1d in fdmon_poll_wait (ctx=0x7ffff49e3380, ready_list=0x7fffffffdab0, timeout=0xffffffffffffffff) at util/fdmon-poll.c:79
> #3    0x0000555555dbd48d in aio_poll (ctx=0x7ffff49e3380, blocking=0x1) at util/aio-posix.c:589
> #4    0x0000555555cf5a83 in bdrv_do_drained_begin (bs=0x7fffd1577180, recursive=0x0, parent=0x0, ignore_bds_parents=0x0, poll=0x1) at block/io.c:429

So we are waiting for some request to complete, since we want to drain... And we are in the BDRV_POLL_WHILE loop, which should release the aio context before the aio_poll() call. So it's strange that thread 3 hangs.
Are we looping in BDRV_POLL_WHILE, or are we really hung here?
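
To make the point concrete, here is roughly what BDRV_POLL_WHILE boils down to in this situation -- a simplified sketch of AIO_WAIT_WHILE (include/block/aio-wait.h) from memory, not the literal macro text, and the _SKETCH name is mine:

#define BDRV_POLL_WHILE_SKETCH(bs, cond)                               \
    do {                                                               \
        AioContext *ctx_ = bdrv_get_aio_context(bs);                   \
        if (in_aio_context_home_thread(ctx_)) {                        \
            /* an iothread draining its own node: just poll it */      \
            while ((cond)) {                                           \
                aio_poll(ctx_, true);                                  \
            }                                                          \
        } else {                                                       \
            /* the main loop draining an iothread's node: drop the     \
             * iothread's AioContext around each poll so the iothread  \
             * itself can make progress */                             \
            while ((cond)) {                                           \
                aio_context_release(ctx_);                             \
                aio_poll(qemu_get_aio_context(), true);                \
                aio_context_acquire(ctx_);                             \
            }                                                          \
        }                                                              \
    } while (0)

If that release really happens here, thread 3 should be able to take the context; that is why I ask.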

You may print bs->tracked_requests to see which requests we are waiting for. Requests have co pointers to their coroutines, which may be examined with:

gdb> source scripts/qemu-gdb.py
gdb> qemu coroutine <coroutine-pointer>
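
For example, something along these lines (untested on this particular core; it assumes debug info is loaded, a frame where bs is in scope -- e.g. frame 4 above -- and the BdrvTrackedRequest field names from include/block/block_int.h):

gdb> frame 4
gdb> p bs->tracked_requests
gdb> p *bs->tracked_requests.lh_first
gdb> p bs->tracked_requests.lh_first->co
gdb> qemu coroutine <value of that co pointer>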


> #5    0x0000555555cf5aef in bdrv_drained_begin (bs=0x7fffd1577180) at block/io.c:435
> #6    0x0000555555d1b135 in bdrv_backup_top_drop (bs=0x7fffd1577180) at block/backup-top.c:273
> #7    0x0000555555d14734 in backup_clean (job=0x7ffff49fba00) at block/backup.c:132
> #8    0x0000555555c81be5 in job_clean (job=0x7ffff49fba00) at job.c:656
> #9    0x0000555555c81c60 in job_finalize_single (job=0x7ffff49fba00) at job.c:672
> #10   0x0000555555c80830 in job_txn_apply (txn=0x7fffd1576a80, fn=0x555555c81be8 <job_finalize_single>) at job.c:145
> #11   0x0000555555c82017 in job_do_finalize (job=0x7ffff49fba00) at job.c:781
> #12   0x0000555555c821b4 in job_completed_txn_success (job=0x7ffff49fba00) at job.c:831
> #13   0x0000555555c82254 in job_completed (job=0x7ffff49fba00) at job.c:844
> #14   0x0000555555c822a3 in job_exit (opaque=0x7ffff49fba00) at job.c:863
> #15   0x0000555555db78a0 in aio_bh_call (bh=0x7ffff3c0d240) at util/async.c:136
> #16   0x0000555555db79aa in aio_bh_poll (ctx=0x7ffff49e3380) at util/async.c:164
> #17   0x0000555555dbcdae in aio_dispatch (ctx=0x7ffff49e3380) at util/aio-posix.c:380
> #18   0x0000555555db7dbc in aio_ctx_dispatch (source=0x7ffff49e3380, callback=0x0, user_data=0x0) at util/async.c:298
> #19   0x00007ffff7c8ff2e in g_main_context_dispatch + 0x2ae () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #20   0x0000555555dbb360 in glib_pollfds_poll () at util/main-loop.c:219
> #21   0x0000555555dbb3da in os_host_main_loop_wait (timeout=0x3b9aca00) at util/main-loop.c:242
> #22   0x0000555555dbb4df in main_loop_wait (nonblocking=0x0) at util/main-loop.c:518
> #23   0x0000555555947478 in qemu_main_loop () at /root/qemu/softmmu/vl.c:1665
> #24   0x0000555555d58ddf in main (argc=0x15, argv=0x7fffffffe058, envp=0x7fffffffe108) at /root/qemu/softmmu/main.c:49
> 
> Thread 3 (Thread 0x7ffff497e700 (LWP 958039)):

Print the mutex contents to see who holds it.
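
With glibc you can usually read the holder straight out of the pthread mutex -- a guess at useful commands, not verified on this core (the __owner field is the kernel TID of the holding thread, to be compared against "info threads"; __count is also interesting, since aio_context_acquire() takes a recursive mutex):

gdb> thread 3
gdb> frame 2
gdb> p *mutex
gdb> p mutex->lock.__data.__owner
gdb> p mutex->lock.__data.__count
gdb> info threads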

> #0    0x00007ffff6dd629c in __lll_lock_wait + 0x1c () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
> #1    0x00007ffff6dcf7d1 in __GI___pthread_mutex_lock (mutex=0x7ffff49e36e0) at ../nptl/pthread_mutex_lock.c:115
> #2    0x0000555555dc0ad7 in qemu_mutex_lock_impl (mutex=0x7ffff49e36e0, file=0x555555f66501 "util/async.c", line=0x254) at util/qemu-thread-posix.c:78
> #3    0x0000555555db872e in aio_context_acquire (ctx=0x7ffff49e3680) at util/async.c:596
> #4    0x0000555555db905e in thread_pool_completion_bh (opaque=0x7ffff3c0c000) at util/thread-pool.c:167
> #5    0x0000555555db78a0 in aio_bh_call (bh=0x7ffff49d7f20) at util/async.c:136
> #6    0x0000555555db79aa in aio_bh_poll (ctx=0x7ffff49e3680) at util/async.c:164
> #7    0x0000555555dbd673 in aio_poll (ctx=0x7ffff49e3680, blocking=0x1) at util/aio-posix.c:638
> #8    0x00005555559d7611 in iothread_run (opaque=0x7ffff49b1c80) at iothread.c:75
> #9    0x0000555555dc1650 in qemu_thread_start (args=0x7ffff58ea480) at util/qemu-thread-posix.c:519
> #10   0x00007ffff6dccfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
> #11   0x00007ffff6cfd4cf in clone + 0x3f () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> 
>>> On March 24, 2020 2:47 PM Max Reitz <mreitz@redhat.com> wrote:
>>>
>>> Hi Dietmar,
>>>
>>> I assume this is with master and has popped up only recently?
>>>
>>> Maybe it has something to do with the recent mutex patches by Stefan, so
>>> I’m Cc-ing him.
>>>
>>> Max
>>>
>>> On 24.03.20 14:33, Dietmar Maurer wrote:
>>>> spoke too soon - the error is still there, sigh
>>>>> This is fixed with this patch:
>>>>>
>>>>> https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07249.html
>>>>>
>>>>> thanks!
>>>>>
>>>>>> On March 24, 2020 12:13 PM Dietmar Maurer <dietmar@proxmox.com> wrote:
>>>>>>
>>>>>> I get a core dump with backup transactions when using io-threads.
>>>>>>
>>>>>> To reproduce, create and start a VM with:
>>>>>>
>>>>>> # qemu-img create disk1.raw 100M
>>>>>> # qemu-img create disk2.raw 100M
>>>>>> # ./x86_64-softmmu/qemu-system-x86_64 -chardev 'socket,id=qmp,path=/var/run/qemu-test.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -pidfile /var/run/qemu-server/108.pid  -m 512 -object 'iothread,id=iothread-virtioscsi0' -object 'iothread,id=iothread-virtioscsi1'  -device 'virtio-scsi-pci,id=virtioscsi0,iothread=iothread-virtioscsi0' -drive 'file=disk1.raw,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0' -device 'virtio-scsi-pci,id=virtioscsi1,iothread=iothread-virtioscsi1' -drive 'file=disk2.raw,if=none,id=drive-scsi1,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=virtioscsi1.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi1,id=scsi1'
>>>>>>
>>>>>> Then open socat to the qmp socket
>>>>>> # socat /var/run/qemu-test.qmp -
>>>>>>
>>>>>> And run the following qmp command:
>>>>>>
>>>>>> { "execute": "qmp_capabilities", "arguments": {} }
>>>>>> { "execute": "transaction", "arguments":  { "actions": [{ "type": "drive-backup", "data": { "device": "drive-scsi0", "sync": "full", "target": "backup-sysi0.raw" }}, { "type": "drive-backup", "data": { "device": "drive-scsi1", "sync": "full", "target": "backup-scsi1.raw" }}], "properties": { "completion-mode": "grouped" } } }
>>>>>>
>>>>>> The VM will core dump:
>>>>>>
>>>>>> qemu: qemu_mutex_unlock_impl: Operation not permitted
>>>>>> Aborted (core dumped)
>>>>>> (gdb) bt
>>>>>> #0  0x00007f099d5037bb in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>>>>>> #1  0x00007f099d4ee535 in __GI_abort () at abort.c:79
>>>>>> #2  0x000055c04e39525e in error_exit (err=<optimized out>, msg=msg@entry=0x55c04e5122e0 <__func__.16544> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
>>>>>> #3  0x000055c04e395813 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x7f09903154e0, file=file@entry=0x55c04e51129f "util/async.c", line=line@entry=601)
>>>>>>      at util/qemu-thread-posix.c:108
>>>>>> #4  0x000055c04e38f8e5 in aio_context_release (ctx=ctx@entry=0x7f0990315480) at util/async.c:601
>>>>>> #5  0x000055c04e299073 in bdrv_set_aio_context_ignore (bs=0x7f0929a76500, new_context=new_context@entry=0x7f0990315000, ignore=ignore@entry=0x7ffe08fa7400)
>>>>>>      at block.c:6238
>>>>>> #6  0x000055c04e2990cc in bdrv_set_aio_context_ignore (bs=bs@entry=0x7f092af47900, new_context=new_context@entry=0x7f0990315000, ignore=ignore@entry=0x7ffe08fa7400)
>>>>>>      at block.c:6211
>>>>>> #7  0x000055c04e299443 in bdrv_child_try_set_aio_context (bs=bs@entry=0x7f092af47900, ctx=0x7f0990315000, ignore_child=ignore_child@entry=0x0, errp=errp@entry=0x0)
>>>>>>      at block.c:6324
>>>>>> #8  0x000055c04e299576 in bdrv_try_set_aio_context (errp=0x0, ctx=<optimized out>, bs=0x7f092af47900) at block.c:6333
>>>>>> #9  0x000055c04e299576 in bdrv_replace_child (child=child@entry=0x7f09902ef5e0, new_bs=new_bs@entry=0x0) at block.c:2551
>>>>>> #10 0x000055c04e2995ff in bdrv_detach_child (child=0x7f09902ef5e0) at block.c:2666
>>>>>> #11 0x000055c04e299ec9 in bdrv_root_unref_child (child=<optimized out>) at block.c:2677
>>>>>> #12 0x000055c04e29f3fe in block_job_remove_all_bdrv (job=job@entry=0x7f0927c18800) at blockjob.c:191
>>>>>> #13 0x000055c04e29f429 in block_job_free (job=0x7f0927c18800) at blockjob.c:88
>>>>>> #14 0x000055c04e2a0909 in job_unref (job=0x7f0927c18800) at job.c:359
>>>>>> #15 0x000055c04e2a0909 in job_unref (job=0x7f0927c18800) at job.c:351
>>>>>> #16 0x000055c04e2a0b68 in job_conclude (job=0x7f0927c18800) at job.c:620
>>>>>> #17 0x000055c04e2a0b68 in job_finalize_single (job=0x7f0927c18800) at job.c:688
>>>>>> #18 0x000055c04e2a0b68 in job_finalize_single (job=0x7f0927c18800) at job.c:660
>>>>>> #19 0x000055c04e2a14fc in job_txn_apply (txn=<optimized out>, fn=0x55c04e2a0a50 <job_finalize_single>) at job.c:145
>>>>>> #20 0x000055c04e2a14fc in job_do_finalize (job=0x7f0927c1c200) at job.c:781
>>>>>> #21 0x000055c04e2a1751 in job_completed_txn_success (job=0x7f0927c1c200) at job.c:831
>>>>>> #22 0x000055c04e2a1751 in job_completed (job=0x7f0927c1c200) at job.c:844
>>>>>> #23 0x000055c04e2a1751 in job_completed (job=0x7f0927c1c200) at job.c:835
>>>>>> #24 0x000055c04e2a17b0 in job_exit (opaque=0x7f0927c1c200) at job.c:863
>>>>>> #25 0x000055c04e38ee75 in aio_bh_call (bh=0x7f098ec52000) at util/async.c:164
>>>>>> #26 0x000055c04e38ee75 in aio_bh_poll (ctx=ctx@entry=0x7f0990315000) at util/async.c:164
>>>>>> #27 0x000055c04e3924fe in aio_dispatch (ctx=0x7f0990315000) at util/aio-posix.c:380
>>>>>> #28 0x000055c04e38ed5e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:298
>>>>>> #29 0x00007f099f020f2e in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
>>>>>> #30 0x000055c04e391768 in glib_pollfds_poll () at util/main-loop.c:219
>>>>>> #31 0x000055c04e391768 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
>>>>>> #32 0x000055c04e391768 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
>>>>>> #33 0x000055c04e032329 in qemu_main_loop () at /home/dietmar/pve5-devel/mirror_qemu/softmmu/vl.c:1665
>>>>>> #34 0x000055c04df36a8e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/dietmar/pve5-devel/mirror_qemu/softmmu/main.c:49
> 


-- 
Best regards,
Vladimir


Thread overview: 23+ messages
2020-03-24 11:13 backup transaction with io-thread core dumps Dietmar Maurer
2020-03-24 13:30 ` Dietmar Maurer
2020-03-24 13:33   ` Dietmar Maurer
2020-03-24 13:44     ` John Snow
2020-03-24 14:00       ` Dietmar Maurer
2020-03-24 13:47     ` Max Reitz
2020-03-24 14:02       ` Dietmar Maurer
2020-03-24 16:49       ` Dietmar Maurer
2020-03-25 11:40         ` Stefan Reiter
2020-03-25 12:23           ` Vladimir Sementsov-Ogievskiy [this message]
2020-03-25 15:50             ` [PATCH] backup: don't acquire aio_context in backup_clean Stefan Reiter
2020-03-26  5:54               ` Vladimir Sementsov-Ogievskiy
2020-03-26  9:43                 ` Stefan Reiter
2020-03-26 12:46                   ` Vladimir Sementsov-Ogievskiy
2020-03-26 11:53                 ` Sergio Lopez
2020-03-25  8:13       ` backup transaction with io-thread core dumps Sergio Lopez
2020-03-25 11:46         ` Sergio Lopez
2020-03-25 12:29           ` Dietmar Maurer
2020-03-25 12:39             ` Sergio Lopez
2020-03-25 15:40               ` Dietmar Maurer
2020-03-26  7:50                 ` Sergio Lopez
2020-03-26  8:14                   ` Dietmar Maurer
2020-03-26  9:23                     ` Dietmar Maurer
