QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: Dietmar Maurer <dietmar@proxmox.com>
To: Stefan Reiter <s.reiter@proxmox.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, slp@redhat.com,
	mreitz@redhat.com, stefanha@redhat.com, jsnow@redhat.com
Subject: Re: [PATCH v2 0/3] Fix some AIO context locking in jobs
Date: Fri, 27 Mar 2020 11:00:48 +0100 (CET)
Message-ID: <626397594.17.1585303248535@webmail.proxmox.com> (raw)
In-Reply-To: <20200326155628.859862-1-s.reiter@proxmox.com>

> I *think* the second patch also fixes the hangs on backup abort that I and
> Dietmar noticed in v1, but I'm not sure, they we're somewhat intermittent
> before too.

After more test, I am 100% sure the bug (or another one) is still there. 
Here is how to trigger:

1. use latest qemu sources from githup
2. apply those 3 patches from Stefan
2. create a VM with virtio-scsis-single drive using io-thread
3. inside VM install Debian buster
4. inside VM, run "stress -d 5"

Then run a series of backups, aborting them after a few seconds:

# start loop 

qmp: { "execute": "drive-backup", "arguments": { "device": "drive-scsi0", "sync": "full", "target": "backup-scsi0.raw" } }

sleep 3 second

qmp: { "execute": "'block-job-cancel", "arguments": { "device": "drive-scsi0" } }

# end loop

After several iterations (mostly < 50) the VM freezes (this time somewhere 
inside drive_backup_prepare):

(gdb) bt
#0  0x00007f61ea09e916 in __GI_ppoll (fds=0x7f6158130c40, nfds=2, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0)
    at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x000055f708401c79 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/poll2.h:77
#2  0x000055f708401c79 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=-1) at util/qemu-timer.c:335
#3  0x000055f708404461 in fdmon_poll_wait (ctx=0x7f61dcd05e80, ready_list=0x7ffc4e7fbde8, timeout=-1) at util/fdmon-poll.c:79
#4  0x000055f708403a47 in aio_poll (ctx=0x7f61dcd05e80, blocking=blocking@entry=true) at util/aio-posix.c:589
#5  0x000055f708364c03 in bdrv_do_drained_begin (poll=<optimized out>, ignore_bds_parents=false, parent=0x0, recursive=false, bs=0x7f61dcd4c500)
    at block/io.c:429
#6  0x000055f708364c03 in bdrv_do_drained_begin
    (bs=0x7f61dcd4c500, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:395
#7  0x000055f7081016f9 in drive_backup_prepare (common=0x7f61d9a0c280, errp=0x7ffc4e7fbf28) at blockdev.c:1755
#8  0x000055f708103e6a in qmp_transaction (dev_list=dev_list@entry=0x7ffc4e7fbfa0, has_props=has_props@entry=false, props=0x7f61d9a304e8, 
    props@entry=0x0, errp=errp@entry=0x7ffc4e7fbfd8) at blockdev.c:2401
#9  0x000055f708105322 in blockdev_do_action (errp=0x7ffc4e7fbfd8, action=0x7ffc4e7fbf90) at blockdev.c:1054
#10 0x000055f708105322 in qmp_drive_backup (backup=backup@entry=0x7ffc4e7fbfe0, errp=errp@entry=0x7ffc4e7fbfd8) at blockdev.c:3129
#11 0x000055f7082c0101 in qmp_marshal_drive_backup (args=<optimized out>, ret=<optimized out>, errp=0x7ffc4e7fc0b8)
    at qapi/qapi-commands-block-core.c:555
#12 0x000055f7083b7338 in qmp_dispatch (cmds=0x55f708904000 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
    at qapi/qmp-dispatch.c:155
#13 0x000055f7082a1bd1 in monitor_qmp_dispatch (mon=0x7f61dcd15d80, req=<optimized out>) at monitor/qmp.c:145
#14 0x000055f7082a23ba in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
#15 0x000055f708400205 in aio_bh_call (bh=0x7f61dd28f960) at util/async.c:164
#16 0x000055f708400205 in aio_bh_poll (ctx=ctx@entry=0x7f61dd33ef80) at util/async.c:164
#17 0x000055f70840388e in aio_dispatch (ctx=0x7f61dd33ef80) at util/aio-posix.c:380
#18 0x000055f7084000ee in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:298
#19 0x00007f61ec069f2e in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#20 0x000055f708402af8 in glib_pollfds_poll () at util/main-loop.c:219
#21 0x000055f708402af8 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#22 0x000055f708402af8 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
#23 0x000055f70809e589 in qemu_main_loop () at /home/dietmar/pve5-devel/mirror_qemu/softmmu/vl.c:1665
#24 0x000055f707fa2c3e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at /home/dietmar/pve5-devel/mirror_qemu/softmmu/main.c:49

      parent reply index

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 15:56 Stefan Reiter
2020-03-26 15:56 ` [PATCH v2 1/3] backup: don't acquire aio_context in backup_clean Stefan Reiter
2020-03-27  6:00   ` Vladimir Sementsov-Ogievskiy
2020-03-26 15:56 ` [PATCH v2 2/3] job: take each job's lock individually in job_txn_apply Stefan Reiter
2020-03-26 15:56 ` [PATCH v2 3/3] replication: acquire aio context before calling job_cancel_sync Stefan Reiter
2020-03-26 17:11 ` [PATCH v2 0/3] Fix some AIO context locking in jobs no-reply
2020-03-26 17:18 ` no-reply
2020-03-27  6:07 ` Dietmar Maurer
2020-03-27  6:13   ` Dietmar Maurer
2020-03-27  7:21   ` Dietmar Maurer
2020-03-27 10:00 ` Dietmar Maurer [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=626397594.17.1585303248535@webmail.proxmox.com \
    --to=dietmar@proxmox.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=s.reiter@proxmox.com \
    --cc=slp@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git