QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org
Cc: kwolf@redhat.com, wencongyang2@huawei.com,
	xiechanglong.d@gmail.com, qemu-devel@nongnu.org,
	armbru@redhat.com, den@openvz.org, jsnow@redhat.com
Subject: Re: [PATCH v4 00/23] backup performance: block_status + async
Date: Wed, 20 Jan 2021 17:00:30 +0100
Message-ID: <c1800da5-0c5e-3588-dc9f-bee127f8f780@redhat.com> (raw)
In-Reply-To: <8522c1f5-9476-3596-abf0-7f2d83f8844c@redhat.com>

On 20.01.21 16:53, Max Reitz wrote:
> On 20.01.21 15:44, Max Reitz wrote:
>> On 20.01.21 15:34, Max Reitz wrote:
> 
> [...]
> 
>>>  From a glance, it looks to me like two coroutines are created 
>>> simultaneously in two threads, and so one thread sets up a special 
>>> SIGUSR2 action, then another reverts SIGUSR2 to the default, and then 
>>> the first one kills itself with SIGUSR2.
>>>
>>> Not sure what this has to do with backup, though it is interesting 
>>> that backup_loop() runs in two threads.  So perhaps some AioContext 
>>> problem.
>>
>> Oh, 256 runs two backups concurrently.  So it isn’t that interesting, 
>> but perhaps part of the problem still.  (I have no idea, still looking.)
> 
> So this is what I found out:
> 
> coroutine-sigaltstack, when creating a new coroutine, sets up a signal 
> handler for SIGUSR2, then kills itself with SIGUSR2, then uses the 
> signal handler context (with a sigaltstack) for the new coroutine, and 
> then (the signal handler returns after a sigsetjmp()) the old SIGUSR2 
> behavior is restored.
> 
> What I fail to understand is how this is thread-safe.  Setting up signal 
> handlers is a process-wide action.  When one thread changes what SIGUSR2 
> does, this will affect all threads immediately, so when two threads run 
> coroutine-sigaltstack’s qemu_coroutine_new() concurrently, and one 
> thread reverts to the default action before the other has SIGUSR2’ed 
> itself, that later SIGUSR2 will kill the whole process.
> 
> (I suppose it gets even more interesting when one thread has set up the 
> sigaltstack, then the other sets up its own sigaltstack, and then both 
> kill themselves with SIGUSR2, so both coroutines get the same stack...)
> 
> I have no idea why this has never been hit before, but it makes sense 
> why block-copy backup makes it apparent: It creates 64+x coroutines in a 
> very short time span, and 256 makes it do so in two threads concurrently 
> (thanks to launching two backups in two AioContexts in a transaction).
> 
> So...  Looks to me like a bug in coroutine-sigaltstack.  Not sure what 
> to do now, though.  I don’t think we can use block-copy for backup 
> before that backend is fixed.  (And that is assuming that it’s indeed 
> coroutine-sigaltstack’s fault.)
> 
> I’ll try to add some locking, see what it does, and send a mail 
> concerning coroutine-sigaltstack to qemu-devel.

Yeah, adding a mutex around the section where SIGUSR2 handling and the 
sigaltstack are non-default makes the problem go away.

Max



  reply index

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-16 21:46 Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 01/23] qapi: backup: add perf.use-copy-range parameter Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 02/23] block/block-copy: More explicit call_state Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 03/23] block/block-copy: implement block_copy_async Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 04/23] block/block-copy: add max_chunk and max_workers parameters Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 05/23] block/block-copy: add list of all call-states Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 06/23] block/block-copy: add ratelimit to block-copy Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 07/23] block/block-copy: add block_copy_cancel Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 08/23] blockjob: add set_speed to BlockJobDriver Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 09/23] job: call job_enter from job_pause Vladimir Sementsov-Ogievskiy
2021-01-18 13:45   ` Max Reitz
2021-01-18 14:20     ` Max Reitz
2021-04-07 11:19   ` Max Reitz
2021-04-07 11:38     ` Vladimir Sementsov-Ogievskiy
2021-04-21  8:31       ` Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 10/23] qapi: backup: add max-chunk and max-workers to x-perf struct Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 11/23] iotests: 56: prepare for backup over block-copy Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 12/23] iotests: 185: " Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 13/23] iotests: 219: " Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 14/23] iotests: 257: " Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 15/23] block/block-copy: make progress_bytes_callback optional Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 16/23] block/backup: drop extra gotos from backup_run() Vladimir Sementsov-Ogievskiy
2021-01-16 21:46 ` [PATCH v4 17/23] backup: move to block-copy Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 18/23] qapi: backup: disable copy_range by default Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 19/23] block/block-copy: drop unused block_copy_set_progress_callback() Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 20/23] block/block-copy: drop unused argument of block_copy() Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 21/23] simplebench/bench_block_job: use correct shebang line with python3 Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 22/23] simplebench: bench_block_job: add cmd_options argument Vladimir Sementsov-Ogievskiy
2021-01-16 21:47 ` [PATCH v4 23/23] simplebench: add bench-backup.py Vladimir Sementsov-Ogievskiy
2021-01-18 14:01   ` Max Reitz
2021-01-18 15:07 ` [PATCH v4 00/23] backup performance: block_status + async Max Reitz
2021-01-18 17:04   ` Vladimir Sementsov-Ogievskiy
2021-01-19 18:40 ` Max Reitz
2021-01-19 19:22   ` Vladimir Sementsov-Ogievskiy
2021-01-19 19:29     ` Vladimir Sementsov-Ogievskiy
2021-01-20 10:39     ` Max Reitz
2021-01-20 13:50       ` Max Reitz
2021-01-20 14:34         ` Max Reitz
2021-01-20 14:44           ` Max Reitz
2021-01-20 15:53             ` Max Reitz
2021-01-20 16:00               ` Max Reitz [this message]
2021-01-20 16:04               ` Daniel P. Berrangé
2021-01-20 16:40                 ` Max Reitz
2021-01-20 10:20 ` [PATCH v4 11.5/23] iotests/129: Limit backup's max-chunk/max-workers Max Reitz
2021-01-20 11:24   ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c1800da5-0c5e-3588-dc9f-bee127f8f780@redhat.com \
    --to=mreitz@redhat.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    --cc=wencongyang2@huawei.com \
    --cc=xiechanglong.d@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git
	git clone --mirror https://lore.kernel.org/qemu-devel/2 qemu-devel/git/2.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git