From: Jens Axboe <axboe@kernel.dk>
To: Bob Liu <bob.liu@oracle.com>
Cc: linux-block@vger.kernel.org, hare@suse.com, ming.lei@redhat.com,
    bvanassche@acm.org, hch@lst.de, martin.petersen@oracle.com,
    jinpuwang@gmail.com, rpenyaev@suse.de,
    Roman Pen <roman.penyaev@profitbricks.com>
Subject: Re: [RESEND PATCH v4] blk-mq: fix hang caused by freeze/unfreeze sequence
Date: Tue, 21 May 2019 06:49:09 -0600
Message-ID: <bc6ad7f7-0971-3635-9ebf-2c47d0abd712@kernel.dk>
In-Reply-To: <20190521032555.31993-1-bob.liu@oracle.com>

On 5/20/19 9:25 PM, Bob Liu wrote:
> The following describes a hang in blk_mq_freeze_queue_wait(). The hang
> happens on an attempt to freeze a queue while another task is
> unfreezing it.
>
> The root cause is that the order of percpu_ref_resurrect() and
> percpu_ref_kill() is not enforced, so the two calls can be swapped:
>
>  CPU#0                         CPU#1
>  ----------------              -----------------
>  q1 = blk_mq_init_queue(shared_tags)
>
>                                q2 = blk_mq_init_queue(shared_tags):
>                                  blk_mq_add_queue_tag_set(shared_tags):
>                                    blk_mq_update_tag_set_depth(shared_tags):
>                                      list_for_each_entry()
>                                        blk_mq_freeze_queue(q1)
>                                         > percpu_ref_kill()
>                                         > blk_mq_freeze_queue_wait()
>
>  blk_cleanup_queue(q1)
>   blk_mq_freeze_queue(q1)
>    > percpu_ref_kill()
>              ^^^^^^ freeze_depth can't guarantee the order
>
>                                blk_mq_unfreeze_queue()
>                                  > percpu_ref_resurrect()
>
>    > blk_mq_freeze_queue_wait()
>              ^^^^^^ Hang here!!!!
>
> This wrong sequence raises a kernel warning:
> percpu_ref_kill_and_confirm called more than once on blk_queue_usage_counter_release!
> WARNING: CPU: 0 PID: 11854 at lib/percpu-refcount.c:336 percpu_ref_kill_and_confirm+0x99/0xb0
>
> But the most unpleasant effect is a hang in blk_mq_freeze_queue_wait(),
> which waits for q_usage_counter to reach zero. That never happens
> because the percpu-ref was reinitialized (instead of being killed) and
> stays in PERCPU state forever.
>
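
The interleaving in the diagram can be replayed with a small toy model. This
is only a sketch in Python, not kernel code: ToyRef, ToyQueue and replay() are
hypothetical names, and the model assumes that freeze and unfreeze each update
freeze_depth first and touch the ref second, with nothing tying the two steps
together.

```python
class ToyRef:
    """Toy stand-in for q_usage_counter; models only dead vs. live state."""

    def __init__(self):
        self.dead = False
        self.double_kill = False

    def kill(self):
        # Killing an already-dead ref corresponds to the
        # "called more than once" warning quoted above.
        if self.dead:
            self.double_kill = True
        self.dead = True

    def resurrect(self):
        # Back to live (PERCPU) state: the count can no longer drain to zero.
        self.dead = False


class ToyQueue:
    def __init__(self):
        self.freeze_depth = 0
        self.ref = ToyRef()

    # ASSUMPTION: the counter update and the ref operation are two
    # separate steps, so another CPU can run between them.
    def freeze_count(self):
        self.freeze_depth += 1
        return self.freeze_depth == 1  # True: caller must kill the ref

    def unfreeze_count(self):
        self.freeze_depth -= 1
        return self.freeze_depth == 0  # True: caller must resurrect the ref

    def freeze_wait_would_finish(self):
        # In this model the wait can only end while the ref is dead.
        return self.ref.dead


def replay():
    q = ToyQueue()

    # CPU#1: freeze q1 and wait; this one completes normally.
    if q.freeze_count():
        q.ref.kill()
    assert q.freeze_wait_would_finish()

    # CPU#1: unfreeze, step 1 (counter only, resurrect still pending).
    must_resurrect = q.unfreeze_count()

    # CPU#0: freeze start slips in; it sees depth 0 -> 1 and kills again.
    if q.freeze_count():
        q.ref.kill()

    # CPU#1: unfreeze, step 2 (the late resurrect).
    if must_resurrect:
        q.ref.resurrect()

    # CPU#0 now enters the wait with a live ref and freeze_depth == 1.
    return q.ref.double_kill, q.freeze_wait_would_finish(), q.freeze_depth
```

In this model replay() reports a double kill, a wait that cannot finish, and a
freeze_depth of 1, which lines up with the warning and the hang described
above.
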
> How to reproduce:
> - "insmod null_blk.ko shared_tags=1 nr_devices=0 queue_mode=2"
> - cpu0: python Script.py 0; pin the resulting process to cpu0 with taskset
> - cpu1: python Script.py 1; pin the resulting process to cpu1 with taskset
>
> Script.py:
> ------
> #!/usr/bin/python3
>
> import os
> import sys
>
> while True:
>     on = "echo 1 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
>     off = "echo 0 > /sys/kernel/config/nullb/%s/power" % sys.argv[1]
>     os.system(on)
>     os.system(off)
> ------
>
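
A possible variant of the script that pins itself to a CPU rather than relying
on a separate taskset step. This is a sketch only: power_path() and
toggle_forever() are hypothetical helpers, os.sched_setaffinity() is
Linux-only, and the configfs directory for the device is assumed to exist
already, as in the original script.

```python
import os

# Path prefix taken from Script.py above.
CONFIGFS_ROOT = "/sys/kernel/config/nullb"


def power_path(dev):
    """Return the configfs 'power' attribute path for one nullb device."""
    return "%s/%s/power" % (CONFIGFS_ROOT, dev)


def toggle_forever(dev, cpu):
    """Pin this process to `cpu`, then power the device on and off in a loop."""
    # Linux-only: restrict the calling process (pid 0 = self) to one CPU.
    os.sched_setaffinity(0, {cpu})
    path = power_path(dev)
    while True:
        for value in ("1\n", "0\n"):  # same bytes that `echo 1` / `echo 0` write
            with open(path, "w") as f:
                f.write(value)
```

Running toggle_forever("0", 0) in one process and toggle_forever("1", 1) in
another would correspond to the two steps listed above.
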
> This bug was first reported and fixed by Roman; previous discussion:
> [1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com
> [2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org
> [3] https://patchwork.kernel.org/patch/9268199/
Applied, thanks.
--
Jens Axboe
Thread overview (4+ messages):
2019-05-21 3:25 [RESEND PATCH v4] blk-mq: fix hang caused by freeze/unfreeze sequence Bob Liu
2019-05-21 6:08 ` Hannes Reinecke
2019-05-21 7:04 ` Christoph Hellwig
2019-05-21 12:49 ` Jens Axboe [this message]