All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Haris Iqbal <haris.iqbal@ionos.com>
Cc: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	sashal@kernel.org, jack@suse.cz,
	Jinpu Wang <jinpu.wang@ionos.com>,
	Danil Kipnis <danil.kipnis@ionos.com>,
	ming.lei@rehdat.com
Subject: Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel
Date: Wed, 10 Nov 2021 09:38:49 +0800	[thread overview]
Message-ID: <YYsiqUpLdNtshZms@T590> (raw)
In-Reply-To: <CAJpMwyixY_-AbMvtGGMBWBO3+oOEF9fuWHkYpLWDbXo3dcAGfg@mail.gmail.com>

Hello Haris,

On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote:
> Hi,
> 
> We are observing an fio hang with the latest v5.10.78 Linux kernel
> version with RNBD. The setup is as follows,
> 
> On the server side, 16 nullblk devices.
> On the client side, map those 16 block devices through RNBD-RTRS.
> Change the scheduler for those RNBD block devices to mq-deadline.
> 
> Run fios with the following configuration.
> 
> [global]
> description=Emulation of Storage Server Access Pattern
> bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
> fadvise_hint=0
> rw=randrw:2
> direct=1
> random_distribution=zipf:1.2
> time_based=1
> runtime=60
> ramp_time=1
> ioengine=libaio
> iodepth=128
> iodepth_batch_submit=128
> iodepth_batch_complete=128
> numjobs=1
> group_reporting
> 
> 
> [job1]
> filename=/dev/rnbd0
> [job2]
> filename=/dev/rnbd1
> [job3]
> filename=/dev/rnbd2
> [job4]
> filename=/dev/rnbd3
> [job5]
> filename=/dev/rnbd4
> [job6]
> filename=/dev/rnbd5
> [job7]
> filename=/dev/rnbd6
> [job8]
> filename=/dev/rnbd7
> [job9]
> filename=/dev/rnbd8
> [job10]
> filename=/dev/rnbd9
> [job11]
> filename=/dev/rnbd10
> [job12]
> filename=/dev/rnbd11
> [job13]
> filename=/dev/rnbd12
> [job14]
> filename=/dev/rnbd13
> [job15]
> filename=/dev/rnbd14
> [job16]
> filename=/dev/rnbd15
> 
> Some of the fio threads hangs and the fio never finishes.
> 
> fio fio.ini
> job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T)
> 512B-64.0KiB, ioengine=libaio, iodepth=128
> fio-3.12
> Starting 16 processes
> Jobs: 16 (f=12):
> [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k
> IOPS][eta 04d:07h:4
> Jobs: 15 (f=11):
> [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766
> IOPS][eta 01m:01s]
> Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s]
> 
> We checked the block devices, and there are requests waiting in their
> fifo (not on all devices, just few whose corresponding fio threads are
> hung).
> 
> $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list
> 00000000ce398aec {.op=READ, .cmd_flags=,
> .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1,
> .internal_tag=209}
> 000000005ec82450 {.op=READ, .cmd_flags=,
> .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1,
> .internal_tag=210}
> 
> $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list
> 000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE,
> .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1,
> .internal_tag=195}
> 00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE,
> .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1,
> .internal_tag=199}
> 000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE,
> .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1,
> .internal_tag=217}

Can you post the whole debugfs log for rnbd0? 

(cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;)

> 
> 
> Potential points which fixes the hang
> 
> 1) Using no scheduler (none) on the client side RNBD block devices
> results in no hang.
> 
> 2) In the fio config, changing the line "iodepth_batch_complete=128"
> to the following fixes the hang,
> iodepth_batch_complete_min=1
> iodepth_batch_complete_max=128
> OR,
> iodepth_batch_complete=0
> 
> 3) We also tracked down the version from which the hang started. The
> hang started with v5.10.50, and the following commit was one which
> results in the hang
> 
> commit 512106ae2355813a5eb84e8dc908628d52856890
> Author: Ming Lei <ming.lei@redhat.com>
> Date:   Fri Jun 25 10:02:48 2021 +0800
> 
>     blk-mq: update hctx->dispatch_busy in case of real scheduler
> 
>     [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ]
> 
>     Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
>     starts to support io batching submission by using hctx->dispatch_busy.
> 
>     However, blk_mq_update_dispatch_busy() isn't changed to update
> hctx->dispatch_busy
>     in that commit, so fix the issue by updating hctx->dispatch_busy in case
>     of real scheduler.
> 
>     Reported-by: Jan Kara <jack@suse.cz>
>     Reviewed-by: Jan Kara <jack@suse.cz>
>     Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
>     Signed-off-by: Ming Lei <ming.lei@redhat.com>
>     Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com
>     Signed-off-by: Jens Axboe <axboe@kernel.dk>
>     Signed-off-by: Sasha Levin <sashal@kernel.org>
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 00d6ed2fe812..a368eb6dc647 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct
> blk_mq_hw_ctx *hctx, bool busy)
>  {
>         unsigned int ewma;
> 
> -       if (hctx->queue->elevator)
> -               return;
> -
>         ewma = hctx->dispatch_busy;
> 
>         if (!ewma && !busy)
> 
> We reverted the commit and tested and there is no hang.
> 
> 4) Lastly, we tested newer version like 5.13, and there is NO hang in
> that also. Hence, probably some other change fixed it.

Can you observe the issue on v5.10? Maybe there is one pre-patch of commit cb9516be7708
("blk-mq: update hctx->dispatch_busy in case of real scheduler merged")
which is missed to 5.10.y.

And not remember that there is fix for commit cb9516be7708 in mainline.

commit cb9516be7708 is merged to v5.14, instead of v5.13, did you test v5.14 or v5.15?

BTW, commit cb9516be7708 should just affect performance, not supposed to cause
hang.

Thanks,
Ming


  reply	other threads:[~2021-11-10  1:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-09  9:32 Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel Haris Iqbal
2021-11-10  1:38 ` Ming Lei [this message]
2021-11-11  8:38   ` Haris Iqbal
2021-11-15 13:01   ` Haris Iqbal
2021-11-15 14:57     ` Ming Lei
2021-11-17 11:50       ` Haris Iqbal
2021-11-17 12:00         ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYsiqUpLdNtshZms@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=danil.kipnis@ionos.com \
    --cc=haris.iqbal@ionos.com \
    --cc=jack@suse.cz \
    --cc=jinpu.wang@ionos.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@rehdat.com \
    --cc=sashal@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.