* Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel
@ 2021-11-09  9:32 Haris Iqbal
  2021-11-10  1:38 ` Ming Lei
  0 siblings, 1 reply; 7+ messages in thread
From: Haris Iqbal @ 2021-11-09  9:32 UTC (permalink / raw)
  To: linux-block, Jens Axboe, sashal, ming.lei, jack; +Cc: Jinpu Wang, Danil Kipnis

Hi,

We are observing an fio hang with the latest v5.10.78 Linux kernel
version with RNBD. The setup is as follows:

On the server side, create 16 nullblk devices.
On the client side, map those 16 block devices through RNBD-RTRS.
Change the scheduler for those RNBD block devices to mq-deadline.

Run fio with the following configuration.

[global]
description=Emulation of Storage Server Access Pattern
bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4
fadvise_hint=0
rw=randrw:2
direct=1
random_distribution=zipf:1.2
time_based=1
runtime=60
ramp_time=1
ioengine=libaio
iodepth=128
iodepth_batch_submit=128
iodepth_batch_complete=128
numjobs=1
group_reporting

[job1]
filename=/dev/rnbd0
[job2]
filename=/dev/rnbd1
[job3]
filename=/dev/rnbd2
[job4]
filename=/dev/rnbd3
[job5]
filename=/dev/rnbd4
[job6]
filename=/dev/rnbd5
[job7]
filename=/dev/rnbd6
[job8]
filename=/dev/rnbd7
[job9]
filename=/dev/rnbd8
[job10]
filename=/dev/rnbd9
[job11]
filename=/dev/rnbd10
[job12]
filename=/dev/rnbd11
[job13]
filename=/dev/rnbd12
[job14]
filename=/dev/rnbd13
[job15]
filename=/dev/rnbd14
[job16]
filename=/dev/rnbd15

Some of the fio threads hang and fio never finishes.

fio fio.ini
job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 512B-64.0KiB, ioengine=libaio, iodepth=128
fio-3.12
Starting 16 processes
Jobs: 16 (f=12): [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k IOPS][eta 04d:07h:4
Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 IOPS][eta 01m:01s]
Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s]

We checked the block devices, and there are requests waiting in their
fifo lists (not on all devices, just a few whose corresponding fio
threads are hung).

$ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list
00000000ce398aec {.op=READ, .cmd_flags=, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=209}
000000005ec82450 {.op=READ, .cmd_flags=, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=210}

$ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list
000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=195}
00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=199}
000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=217}

Potential points that fix the hang:

1) Using no scheduler (none) on the client-side RNBD block devices
results in no hang.

2) In the fio config, changing the line "iodepth_batch_complete=128"
to the following fixes the hang:
iodepth_batch_complete_min=1
iodepth_batch_complete_max=128
OR
iodepth_batch_complete=0

3) We also tracked down the version from which the hang started. The
hang started with v5.10.50, and the following commit is the one that
results in the hang:

commit 512106ae2355813a5eb84e8dc908628d52856890
Author: Ming Lei <ming.lei@redhat.com>
Date:   Fri Jun 25 10:02:48 2021 +0800

    blk-mq: update hctx->dispatch_busy in case of real scheduler

    [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ]

    Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
    starts to support io batching submission by using hctx->dispatch_busy.

    However, blk_mq_update_dispatch_busy() isn't changed to update
    hctx->dispatch_busy in that commit, so fix the issue by updating
    hctx->dispatch_busy in case of real scheduler.

    Reported-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 00d6ed2fe812..a368eb6dc647 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
 {
 	unsigned int ewma;
 
-	if (hctx->queue->elevator)
-		return;
-
 	ewma = hctx->dispatch_busy;
 
 	if (!ewma && !busy)

We reverted the commit, tested, and there is no hang.

4) Lastly, we tested newer versions like v5.13, and there is NO hang
there either. Hence, probably some other change fixed it.

Regards
-Haris

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel
  2021-11-09  9:32 Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel Haris Iqbal
@ 2021-11-10  1:38 ` Ming Lei
  2021-11-11  8:38   ` Haris Iqbal
  2021-11-15 13:01   ` Haris Iqbal
  0 siblings, 2 replies; 7+ messages in thread
From: Ming Lei @ 2021-11-10  1:38 UTC (permalink / raw)
  To: Haris Iqbal
  Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis, ming.lei

Hello Haris,

On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote:
> Hi,
>
> We are observing an fio hang with the latest v5.10.78 Linux kernel
> version with RNBD.
>
> [...]
>
> We checked the block devices, and there are requests waiting in their
> fifo (not on all devices, just a few whose corresponding fio threads
> are hung).
>
> $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list
> [...]
> $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list
> [...]

Can you post the whole debugfs log for rnbd0?

(cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;)

> Potential points which fixes the hang
>
> [...]
>
> 3) We also tracked down the version from which the hang started. The
> hang started with v5.10.50, and the following commit was one which
> results in the hang
>
> commit 512106ae2355813a5eb84e8dc908628d52856890
> ("blk-mq: update hctx->dispatch_busy in case of real scheduler")
> [...]
>
> We reverted the commit and tested and there is no hang.
>
> 4) Lastly, we tested newer versions like v5.13, and there is NO hang
> there either. Hence, probably some other change fixed it.

Can you observe the issue on v5.10? Maybe a prerequisite patch of commit
cb9516be7708 ("blk-mq: update hctx->dispatch_busy in case of real scheduler")
was missed when backporting to 5.10.y.

Also, I do not remember any fix for commit cb9516be7708 in mainline.

commit cb9516be7708 was merged in v5.14, not v5.13. Did you test v5.14
or v5.15?

BTW, commit cb9516be7708 should only affect performance; it is not supposed
to cause a hang.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel
  2021-11-10  1:38 ` Ming Lei
@ 2021-11-11  8:38   ` Haris Iqbal
  2021-11-15 13:01   ` Haris Iqbal
  1 sibling, 0 replies; 7+ messages in thread
From: Haris Iqbal @ 2021-11-11  8:38 UTC (permalink / raw)
  To: Ming Lei
  Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis, ming.lei

[-- Attachment #1: Type: text/plain, Size: 8073 bytes --]

On Wed, Nov 10, 2021 at 2:39 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> Hello Haris,
>
> [...]
>
> Can you post the whole debugfs log for rnbd0?
>
> (cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;)

Attached the log file.

> [...]
>
> Can you observe the issue on v5.10? Maybe there is one pre-patch of commit
> cb9516be7708 ("blk-mq: update hctx->dispatch_busy in case of real scheduler")
> which is missed to 5.10.y.

If you mean v5.10.0, then no, I see no hang there.
As I mentioned before, there is no hang till v5.10.49.

> And not remember that there is fix for commit cb9516be7708 in mainline.
>
> commit cb9516be7708 is merged to v5.14, instead of v5.13, did you test
> v5.14 or v5.15?
>
> BTW, commit cb9516be7708 should just affect performance, not supposed
> to cause hang.

True, it does look like that from the small code change.

I wasn't able to test on v5.14 and v5.15 because we are seeing some other
errors in those versions, most probably related to the rdma-core/rxe driver.

> Thanks,
> Ming

[-- Attachment #2: debugfslog_rnbd0 --]
[-- Type: application/octet-stream, Size: 13289 bytes --]

./sched/starved:0 ./sched/batching:7 ./sched/write_next_rq:0000000094da63eb {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=35} ./sched/write_fifo_list:0000000094da63eb {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=35} ./sched/write_fifo_list:000000005bca28b9 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=41} ./sched/write_fifo_list:0000000094151866 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=43} ./sched/write_fifo_list:00000000b8b87018 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=49} ./sched/write_fifo_list:00000000650ab2f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=2} ./rqos/wbt/wb_background:4 ./rqos/wbt/wb_normal:8 ./rqos/wbt/unknown_cnt:0 ./rqos/wbt/min_lat_nsec:2000000 ./rqos/wbt/inflight:0: inflight 0 ./rqos/wbt/inflight:1: inflight 0 ./rqos/wbt/inflight:2: inflight 0 ./rqos/wbt/id:0 ./rqos/wbt/enabled:1 ./rqos/wbt/curr_win_nsec:0 ./hctx3/cpu3/completed:6987 0 ./hctx3/cpu3/merged:0 ./hctx3/cpu3/dispatched:6987 0 ./hctx3/type:default
./hctx3/dispatch_busy:5 ./hctx3/active:0 ./hctx3/run:7514 ./hctx3/queued:6987 ./hctx3/dispatched: 0 4851 ./hctx3/dispatched: 1 2685 ./hctx3/dispatched: 2 217 ./hctx3/dispatched: 4 270 ./hctx3/dispatched: 8 163 ./hctx3/dispatched: 16 31 ./hctx3/dispatched: 32+ 0 ./hctx3/io_poll:considered=0 ./hctx3/io_poll:invoked=0 ./hctx3/io_poll:success=0 ./hctx3/sched_tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/sched_tags:nr_tags=256 ./hctx3/sched_tags:nr_reserved_tags=0 ./hctx3/sched_tags:active_queues=0 ./hctx3/sched_tags:bitmap_tags: ./hctx3/sched_tags:depth=256 ./hctx3/sched_tags:busy=0 ./hctx3/sched_tags:cleared=195 ./hctx3/sched_tags:bits_per_word=64 ./hctx3/sched_tags:map_nr=4 ./hctx3/sched_tags:alloc_hint={6, 245, 41, 141} ./hctx3/sched_tags:wake_batch=8 ./hctx3/sched_tags:wake_index=0 ./hctx3/sched_tags:ws_active=0 ./hctx3/sched_tags:ws={ ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags:} ./hctx3/sched_tags:round_robin=0 ./hctx3/sched_tags:min_shallow_depth=4294967295 ./hctx3/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags:nr_tags=512 ./hctx3/tags:nr_reserved_tags=0 ./hctx3/tags:active_queues=0 ./hctx3/tags:bitmap_tags: ./hctx3/tags:depth=512 ./hctx3/tags:busy=0 ./hctx3/tags:cleared=458 ./hctx3/tags:bits_per_word=64 ./hctx3/tags:map_nr=8 ./hctx3/tags:alloc_hint={168, 
253, 246, 369} ./hctx3/tags:wake_batch=8 ./hctx3/tags:wake_index=7 ./hctx3/tags:ws_active=0 ./hctx3/tags:ws={ ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags:} ./hctx3/tags:round_robin=0 ./hctx3/tags:min_shallow_depth=4294967295 ./hctx3/ctx_map:00000000: 00 ./hctx3/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx2/cpu2/completed:4585 0 ./hctx2/cpu2/merged:0 ./hctx2/cpu2/dispatched:4590 0 ./hctx2/type:default ./hctx2/dispatch_busy:5 ./hctx2/active:0 ./hctx2/run:4644 ./hctx2/queued:4590 ./hctx2/dispatched: 0 2761 ./hctx2/dispatched: 1 1966 ./hctx2/dispatched: 2 119 ./hctx2/dispatched: 4 153 ./hctx2/dispatched: 8 79 ./hctx2/dispatched: 16 25 ./hctx2/dispatched: 32+ 4 ./hctx2/io_poll:considered=0 ./hctx2/io_poll:invoked=0 ./hctx2/io_poll:success=0 ./hctx2/sched_tags_bitmap:00000000: 0400 0000 080a 0200 0000 0000 0000 0000 ./hctx2/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/sched_tags:nr_tags=256 ./hctx2/sched_tags:nr_reserved_tags=0 ./hctx2/sched_tags:active_queues=0 ./hctx2/sched_tags:bitmap_tags: ./hctx2/sched_tags:depth=256 ./hctx2/sched_tags:busy=5 ./hctx2/sched_tags:cleared=222 ./hctx2/sched_tags:bits_per_word=64 ./hctx2/sched_tags:map_nr=4 ./hctx2/sched_tags:alloc_hint={209, 91, 50, 218} ./hctx2/sched_tags:wake_batch=8 ./hctx2/sched_tags:wake_index=0 ./hctx2/sched_tags:ws_active=0 ./hctx2/sched_tags:ws={ ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, 
.wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags:} ./hctx2/sched_tags:round_robin=0 ./hctx2/sched_tags:min_shallow_depth=4294967295 ./hctx2/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags:nr_tags=512 ./hctx2/tags:nr_reserved_tags=0 ./hctx2/tags:active_queues=0 ./hctx2/tags:bitmap_tags: ./hctx2/tags:depth=512 ./hctx2/tags:busy=0 ./hctx2/tags:cleared=477 ./hctx2/tags:bits_per_word=64 ./hctx2/tags:map_nr=8 ./hctx2/tags:alloc_hint={510, 327, 106, 507} ./hctx2/tags:wake_batch=8 ./hctx2/tags:wake_index=4 ./hctx2/tags:ws_active=0 ./hctx2/tags:ws={ ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags:} ./hctx2/tags:round_robin=0 ./hctx2/tags:min_shallow_depth=4294967295 ./hctx2/ctx_map:00000000: 00 ./hctx2/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx1/cpu1/completed:5742 0 ./hctx1/cpu1/merged:0 ./hctx1/cpu1/dispatched:5742 0 ./hctx1/type:default ./hctx1/dispatch_busy:7 ./hctx1/active:0 ./hctx1/run:5574 ./hctx1/queued:5742 ./hctx1/dispatched: 0 3444 ./hctx1/dispatched: 1 2088 ./hctx1/dispatched: 2 193 ./hctx1/dispatched: 4 240 ./hctx1/dispatched: 8 121 ./hctx1/dispatched: 16 31 ./hctx1/dispatched: 32+ 0 ./hctx1/io_poll:considered=0 ./hctx1/io_poll:invoked=0 ./hctx1/io_poll:success=0 ./hctx1/sched_tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 
0000 0000 ./hctx1/sched_tags:nr_tags=256 ./hctx1/sched_tags:nr_reserved_tags=0 ./hctx1/sched_tags:active_queues=0 ./hctx1/sched_tags:bitmap_tags: ./hctx1/sched_tags:depth=256 ./hctx1/sched_tags:busy=0 ./hctx1/sched_tags:cleared=228 ./hctx1/sched_tags:bits_per_word=64 ./hctx1/sched_tags:map_nr=4 ./hctx1/sched_tags:alloc_hint={15, 176, 25, 203} ./hctx1/sched_tags:wake_batch=8 ./hctx1/sched_tags:wake_index=0 ./hctx1/sched_tags:ws_active=0 ./hctx1/sched_tags:ws={ ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags:} ./hctx1/sched_tags:round_robin=0 ./hctx1/sched_tags:min_shallow_depth=4294967295 ./hctx1/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags:nr_tags=512 ./hctx1/tags:nr_reserved_tags=0 ./hctx1/tags:active_queues=0 ./hctx1/tags:bitmap_tags: ./hctx1/tags:depth=512 ./hctx1/tags:busy=0 ./hctx1/tags:cleared=512 ./hctx1/tags:bits_per_word=64 ./hctx1/tags:map_nr=8 ./hctx1/tags:alloc_hint={253, 248, 391, 272} ./hctx1/tags:wake_batch=8 ./hctx1/tags:wake_index=6 ./hctx1/tags:ws_active=0 ./hctx1/tags:ws={ ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, 
./hctx1/tags:} ./hctx1/tags:round_robin=0 ./hctx1/tags:min_shallow_depth=4294967295 ./hctx1/ctx_map:00000000: 00 ./hctx1/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx0/cpu0/completed:9424 0 ./hctx0/cpu0/merged:0 ./hctx0/cpu0/dispatched:9424 0 ./hctx0/type:default ./hctx0/dispatch_busy:7 ./hctx0/active:0 ./hctx0/run:9545 ./hctx0/queued:9424 ./hctx0/dispatched: 0 6052 ./hctx0/dispatched: 1 3435 ./hctx0/dispatched: 2 272 ./hctx0/dispatched: 4 374 ./hctx0/dispatched: 8 216 ./hctx0/dispatched: 16 49 ./hctx0/dispatched: 32+ 1 ./hctx0/io_poll:considered=0 ./hctx0/io_poll:invoked=0 ./hctx0/io_poll:success=0 ./hctx0/sched_tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/sched_tags:nr_tags=256 ./hctx0/sched_tags:nr_reserved_tags=0 ./hctx0/sched_tags:active_queues=0 ./hctx0/sched_tags:bitmap_tags: ./hctx0/sched_tags:depth=256 ./hctx0/sched_tags:busy=0 ./hctx0/sched_tags:cleared=233 ./hctx0/sched_tags:bits_per_word=64 ./hctx0/sched_tags:map_nr=4 ./hctx0/sched_tags:alloc_hint={196, 152, 174, 2} ./hctx0/sched_tags:wake_batch=8 ./hctx0/sched_tags:wake_index=0 ./hctx0/sched_tags:ws_active=0 ./hctx0/sched_tags:ws={ ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags:} ./hctx0/sched_tags:round_robin=0 ./hctx0/sched_tags:min_shallow_depth=4294967295 ./hctx0/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000030: 0000 0000 0000 0000 
0000 0000 0000 0000 ./hctx0/tags:nr_tags=512 ./hctx0/tags:nr_reserved_tags=0 ./hctx0/tags:active_queues=0 ./hctx0/tags:bitmap_tags: ./hctx0/tags:depth=512 ./hctx0/tags:busy=0 ./hctx0/tags:cleared=511 ./hctx0/tags:bits_per_word=64 ./hctx0/tags:map_nr=8 ./hctx0/tags:alloc_hint={116, 124, 315, 478} ./hctx0/tags:wake_batch=8 ./hctx0/tags:wake_index=5 ./hctx0/tags:ws_active=0 ./hctx0/tags:ws={ ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags:} ./hctx0/tags:round_robin=0 ./hctx0/tags:min_shallow_depth=4294967295 ./hctx0/ctx_map:00000000: 00 ./hctx0/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./write_hints:hint0: 0 ./write_hints:hint1: 0 ./write_hints:hint2: 0 ./write_hints:hint3: 0 ./write_hints:hint4: 0 ./state:SAME_COMP|NONROT|IO_STAT|SAME_FORCE|INIT_DONE|WC|FUA|STATS|REGISTERED|NOWAIT ./pm_only:0 ./poll_stat:read (512 Bytes): samples=0 ./poll_stat:write (512 Bytes): samples=0 ./poll_stat:read (1024 Bytes): samples=0 ./poll_stat:write (1024 Bytes): samples=0 ./poll_stat:read (2048 Bytes): samples=0 ./poll_stat:write (2048 Bytes): samples=0 ./poll_stat:read (4096 Bytes): samples=0 ./poll_stat:write (4096 Bytes): samples=0 ./poll_stat:read (8192 Bytes): samples=0 ./poll_stat:write (8192 Bytes): samples=0 ./poll_stat:read (16384 Bytes): samples=0 ./poll_stat:write (16384 Bytes): samples=0 ./poll_stat:read (32768 Bytes): samples=0 ./poll_stat:write (32768 Bytes): samples=0 ./poll_stat:read (65536 Bytes): samples=0 ./poll_stat:write (65536 Bytes): samples=0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel 2021-11-10 1:38 ` Ming Lei 2021-11-11 8:38 ` Haris Iqbal @ 2021-11-15 13:01 ` Haris Iqbal 2021-11-15 14:57 ` Ming Lei 1 sibling, 1 reply; 7+ messages in thread From: Haris Iqbal @ 2021-11-15 13:01 UTC (permalink / raw) To: Ming Lei Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis, ming.lei [-- Attachment #1: Type: text/plain, Size: 8187 bytes --] On Wed, Nov 10, 2021 at 2:39 AM Ming Lei <ming.lei@redhat.com> wrote: > > Hello Haris, > > On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote: > > Hi, > > > > We are observing an fio hang with the latest v5.10.78 Linux kernel > > version with RNBD. The setup is as follows, > > > > On the server side, 16 nullblk devices. > > On the client side, map those 16 block devices through RNBD-RTRS. > > Change the scheduler for those RNBD block devices to mq-deadline. > > > > Run fios with the following configuration. > > > > [global] > > description=Emulation of Storage Server Access Pattern > > bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4 > > fadvise_hint=0 > > rw=randrw:2 > > direct=1 > > random_distribution=zipf:1.2 > > time_based=1 > > runtime=60 > > ramp_time=1 > > ioengine=libaio > > iodepth=128 > > iodepth_batch_submit=128 > > iodepth_batch_complete=128 > > numjobs=1 > > group_reporting > > > > > > [job1] > > filename=/dev/rnbd0 > > [job2] > > filename=/dev/rnbd1 > > [job3] > > filename=/dev/rnbd2 > > [job4] > > filename=/dev/rnbd3 > > [job5] > > filename=/dev/rnbd4 > > [job6] > > filename=/dev/rnbd5 > > [job7] > > filename=/dev/rnbd6 > > [job8] > > filename=/dev/rnbd7 > > [job9] > > filename=/dev/rnbd8 > > [job10] > > filename=/dev/rnbd9 > > [job11] > > filename=/dev/rnbd10 > > [job12] > > filename=/dev/rnbd11 > > [job13] > > filename=/dev/rnbd12 > > [job14] > > filename=/dev/rnbd13 > > [job15] > > filename=/dev/rnbd14 > > [job16] > > filename=/dev/rnbd15 > > > > Some of the fio threads hangs and the fio 
never finishes. > > > > fio fio.ini > > job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > fio-3.12 > > Starting 16 processes > > Jobs: 16 (f=12): > > 
[m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k > > IOPS][eta 04d:07h:4 > > Jobs: 15 (f=11): > > [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 > > IOPS][eta 01m:01s] > > Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s] > > > > We checked the block devices, and there are requests waiting in their > > fifo (not on all devices, just few whose corresponding fio threads are > > hung). > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list > > 00000000ce398aec {.op=READ, .cmd_flags=, > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > .internal_tag=209} > > 000000005ec82450 {.op=READ, .cmd_flags=, > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > .internal_tag=210} > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list > > 000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > .internal_tag=195} > > 00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > .internal_tag=199} > > 000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > .internal_tag=217} > > Can you post the whole debugfs log for rnbd0? > > (cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;) Attached the logfile. > > > > > > > Potential points which fixes the hang > > > > 1) Using no scheduler (none) on the client side RNBD block devices > > results in no hang. > > > > 2) In the fio config, changing the line "iodepth_batch_complete=128" > > to the following fixes the hang, > > iodepth_batch_complete_min=1 > > iodepth_batch_complete_max=128 > > OR, > > iodepth_batch_complete=0 > > > > 3) We also tracked down the version from which the hang started. 
The > > hang started with v5.10.50, and the following commit was one which > > results in the hang > > > > commit 512106ae2355813a5eb84e8dc908628d52856890 > > Author: Ming Lei <ming.lei@redhat.com> > > Date: Fri Jun 25 10:02:48 2021 +0800 > > > > blk-mq: update hctx->dispatch_busy in case of real scheduler > > > > [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ] > > > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > starts to support io batching submission by using hctx->dispatch_busy. > > > > However, blk_mq_update_dispatch_busy() isn't changed to update > > hctx->dispatch_busy > > in that commit, so fix the issue by updating hctx->dispatch_busy in case > > of real scheduler. > > > > Reported-by: Jan Kara <jack@suse.cz> > > Reviewed-by: Jan Kara <jack@suse.cz> > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com > > Signed-off-by: Jens Axboe <axboe@kernel.dk> > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 00d6ed2fe812..a368eb6dc647 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct > > blk_mq_hw_ctx *hctx, bool busy) > > { > > unsigned int ewma; > > > > - if (hctx->queue->elevator) > > - return; > > - > > ewma = hctx->dispatch_busy; > > > > if (!ewma && !busy) > > > > We reverted the commit and tested and there is no hang. > > > > 4) Lastly, we tested newer version like 5.13, and there is NO hang in > > that also. Hence, probably some other change fixed it. > > Can you observe the issue on v5.10? Maybe there is one pre-patch of commit cb9516be7708 > ("blk-mq: update hctx->dispatch_busy in case of real scheduler merged") > which is missed to 5.10.y. If you mean v5.10.0, then no, I see no hang there. 
As I mentioned before, there is no hang till v5.10.49. > > And not remember that there is fix for commit cb9516be7708 in mainline. > > commit cb9516be7708 is merged to v5.14, instead of v5.13, did you test v5.14 or v5.15? > > BTW, commit cb9516be7708 should just affect performance, not supposed to cause > hang. True. It does look like that from the small code change. I wasn't able to test in v5.14 and v5.15 because we are seeing some other errors in those versions, most probably related to the rdma-core/rxe driver. > > Thanks, > Ming > PS: I am sending this again since I keep getting this notification that the delivery failed for my last response. [-- Attachment #2: debugfslog_rnbd0 --] [-- Type: application/octet-stream, Size: 13289 bytes --] ./sched/starved:0 ./sched/batching:7 ./sched/write_next_rq:0000000094da63eb {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=35} ./sched/write_fifo_list:0000000094da63eb {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=35} ./sched/write_fifo_list:000000005bca28b9 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=41} ./sched/write_fifo_list:0000000094151866 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=43} ./sched/write_fifo_list:00000000b8b87018 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=49} ./sched/write_fifo_list:00000000650ab2f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, .internal_tag=2} ./rqos/wbt/wb_background:4 ./rqos/wbt/wb_normal:8 ./rqos/wbt/unknown_cnt:0 ./rqos/wbt/min_lat_nsec:2000000 ./rqos/wbt/inflight:0: inflight 0 ./rqos/wbt/inflight:1: inflight 0 ./rqos/wbt/inflight:2: inflight 0 ./rqos/wbt/id:0 ./rqos/wbt/enabled:1 ./rqos/wbt/curr_win_nsec:0 
./hctx3/cpu3/completed:6987 0 ./hctx3/cpu3/merged:0 ./hctx3/cpu3/dispatched:6987 0 ./hctx3/type:default ./hctx3/dispatch_busy:5 ./hctx3/active:0 ./hctx3/run:7514 ./hctx3/queued:6987 ./hctx3/dispatched: 0 4851 ./hctx3/dispatched: 1 2685 ./hctx3/dispatched: 2 217 ./hctx3/dispatched: 4 270 ./hctx3/dispatched: 8 163 ./hctx3/dispatched: 16 31 ./hctx3/dispatched: 32+ 0 ./hctx3/io_poll:considered=0 ./hctx3/io_poll:invoked=0 ./hctx3/io_poll:success=0 ./hctx3/sched_tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/sched_tags:nr_tags=256 ./hctx3/sched_tags:nr_reserved_tags=0 ./hctx3/sched_tags:active_queues=0 ./hctx3/sched_tags:bitmap_tags: ./hctx3/sched_tags:depth=256 ./hctx3/sched_tags:busy=0 ./hctx3/sched_tags:cleared=195 ./hctx3/sched_tags:bits_per_word=64 ./hctx3/sched_tags:map_nr=4 ./hctx3/sched_tags:alloc_hint={6, 245, 41, 141} ./hctx3/sched_tags:wake_batch=8 ./hctx3/sched_tags:wake_index=0 ./hctx3/sched_tags:ws_active=0 ./hctx3/sched_tags:ws={ ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/sched_tags:} ./hctx3/sched_tags:round_robin=0 ./hctx3/sched_tags:min_shallow_depth=4294967295 ./hctx3/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx3/tags:nr_tags=512 ./hctx3/tags:nr_reserved_tags=0 ./hctx3/tags:active_queues=0 ./hctx3/tags:bitmap_tags: ./hctx3/tags:depth=512 ./hctx3/tags:busy=0 
./hctx3/tags:cleared=458 ./hctx3/tags:bits_per_word=64 ./hctx3/tags:map_nr=8 ./hctx3/tags:alloc_hint={168, 253, 246, 369} ./hctx3/tags:wake_batch=8 ./hctx3/tags:wake_index=7 ./hctx3/tags:ws_active=0 ./hctx3/tags:ws={ ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags: {.wait_cnt=8, .wait=inactive}, ./hctx3/tags:} ./hctx3/tags:round_robin=0 ./hctx3/tags:min_shallow_depth=4294967295 ./hctx3/ctx_map:00000000: 00 ./hctx3/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx2/cpu2/completed:4585 0 ./hctx2/cpu2/merged:0 ./hctx2/cpu2/dispatched:4590 0 ./hctx2/type:default ./hctx2/dispatch_busy:5 ./hctx2/active:0 ./hctx2/run:4644 ./hctx2/queued:4590 ./hctx2/dispatched: 0 2761 ./hctx2/dispatched: 1 1966 ./hctx2/dispatched: 2 119 ./hctx2/dispatched: 4 153 ./hctx2/dispatched: 8 79 ./hctx2/dispatched: 16 25 ./hctx2/dispatched: 32+ 4 ./hctx2/io_poll:considered=0 ./hctx2/io_poll:invoked=0 ./hctx2/io_poll:success=0 ./hctx2/sched_tags_bitmap:00000000: 0400 0000 080a 0200 0000 0000 0000 0000 ./hctx2/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/sched_tags:nr_tags=256 ./hctx2/sched_tags:nr_reserved_tags=0 ./hctx2/sched_tags:active_queues=0 ./hctx2/sched_tags:bitmap_tags: ./hctx2/sched_tags:depth=256 ./hctx2/sched_tags:busy=5 ./hctx2/sched_tags:cleared=222 ./hctx2/sched_tags:bits_per_word=64 ./hctx2/sched_tags:map_nr=4 ./hctx2/sched_tags:alloc_hint={209, 91, 50, 218} ./hctx2/sched_tags:wake_batch=8 ./hctx2/sched_tags:wake_index=0 ./hctx2/sched_tags:ws_active=0 ./hctx2/sched_tags:ws={ ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, 
.wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/sched_tags:} ./hctx2/sched_tags:round_robin=0 ./hctx2/sched_tags:min_shallow_depth=4294967295 ./hctx2/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx2/tags:nr_tags=512 ./hctx2/tags:nr_reserved_tags=0 ./hctx2/tags:active_queues=0 ./hctx2/tags:bitmap_tags: ./hctx2/tags:depth=512 ./hctx2/tags:busy=0 ./hctx2/tags:cleared=477 ./hctx2/tags:bits_per_word=64 ./hctx2/tags:map_nr=8 ./hctx2/tags:alloc_hint={510, 327, 106, 507} ./hctx2/tags:wake_batch=8 ./hctx2/tags:wake_index=4 ./hctx2/tags:ws_active=0 ./hctx2/tags:ws={ ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags: {.wait_cnt=8, .wait=inactive}, ./hctx2/tags:} ./hctx2/tags:round_robin=0 ./hctx2/tags:min_shallow_depth=4294967295 ./hctx2/ctx_map:00000000: 00 ./hctx2/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx1/cpu1/completed:5742 0 ./hctx1/cpu1/merged:0 ./hctx1/cpu1/dispatched:5742 0 ./hctx1/type:default ./hctx1/dispatch_busy:7 ./hctx1/active:0 ./hctx1/run:5574 ./hctx1/queued:5742 ./hctx1/dispatched: 0 3444 ./hctx1/dispatched: 1 2088 ./hctx1/dispatched: 2 193 ./hctx1/dispatched: 4 240 ./hctx1/dispatched: 8 121 ./hctx1/dispatched: 16 31 ./hctx1/dispatched: 32+ 0 ./hctx1/io_poll:considered=0 ./hctx1/io_poll:invoked=0 ./hctx1/io_poll:success=0 ./hctx1/sched_tags_bitmap:00000000: 0000 
0000 0000 0000 0000 0000 0000 0000 ./hctx1/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/sched_tags:nr_tags=256 ./hctx1/sched_tags:nr_reserved_tags=0 ./hctx1/sched_tags:active_queues=0 ./hctx1/sched_tags:bitmap_tags: ./hctx1/sched_tags:depth=256 ./hctx1/sched_tags:busy=0 ./hctx1/sched_tags:cleared=228 ./hctx1/sched_tags:bits_per_word=64 ./hctx1/sched_tags:map_nr=4 ./hctx1/sched_tags:alloc_hint={15, 176, 25, 203} ./hctx1/sched_tags:wake_batch=8 ./hctx1/sched_tags:wake_index=0 ./hctx1/sched_tags:ws_active=0 ./hctx1/sched_tags:ws={ ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/sched_tags:} ./hctx1/sched_tags:round_robin=0 ./hctx1/sched_tags:min_shallow_depth=4294967295 ./hctx1/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000020: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx1/tags:nr_tags=512 ./hctx1/tags:nr_reserved_tags=0 ./hctx1/tags:active_queues=0 ./hctx1/tags:bitmap_tags: ./hctx1/tags:depth=512 ./hctx1/tags:busy=0 ./hctx1/tags:cleared=512 ./hctx1/tags:bits_per_word=64 ./hctx1/tags:map_nr=8 ./hctx1/tags:alloc_hint={253, 248, 391, 272} ./hctx1/tags:wake_batch=8 ./hctx1/tags:wake_index=6 ./hctx1/tags:ws_active=0 ./hctx1/tags:ws={ ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, 
./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags: {.wait_cnt=8, .wait=inactive}, ./hctx1/tags:} ./hctx1/tags:round_robin=0 ./hctx1/tags:min_shallow_depth=4294967295 ./hctx1/ctx_map:00000000: 00 ./hctx1/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./hctx0/cpu0/completed:9424 0 ./hctx0/cpu0/merged:0 ./hctx0/cpu0/dispatched:9424 0 ./hctx0/type:default ./hctx0/dispatch_busy:7 ./hctx0/active:0 ./hctx0/run:9545 ./hctx0/queued:9424 ./hctx0/dispatched: 0 6052 ./hctx0/dispatched: 1 3435 ./hctx0/dispatched: 2 272 ./hctx0/dispatched: 4 374 ./hctx0/dispatched: 8 216 ./hctx0/dispatched: 16 49 ./hctx0/dispatched: 32+ 1 ./hctx0/io_poll:considered=0 ./hctx0/io_poll:invoked=0 ./hctx0/io_poll:success=0 ./hctx0/sched_tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/sched_tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/sched_tags:nr_tags=256 ./hctx0/sched_tags:nr_reserved_tags=0 ./hctx0/sched_tags:active_queues=0 ./hctx0/sched_tags:bitmap_tags: ./hctx0/sched_tags:depth=256 ./hctx0/sched_tags:busy=0 ./hctx0/sched_tags:cleared=233 ./hctx0/sched_tags:bits_per_word=64 ./hctx0/sched_tags:map_nr=4 ./hctx0/sched_tags:alloc_hint={196, 152, 174, 2} ./hctx0/sched_tags:wake_batch=8 ./hctx0/sched_tags:wake_index=0 ./hctx0/sched_tags:ws_active=0 ./hctx0/sched_tags:ws={ ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/sched_tags:} ./hctx0/sched_tags:round_robin=0 ./hctx0/sched_tags:min_shallow_depth=4294967295 ./hctx0/tags_bitmap:00000000: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000010: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000020: 
0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags_bitmap:00000030: 0000 0000 0000 0000 0000 0000 0000 0000 ./hctx0/tags:nr_tags=512 ./hctx0/tags:nr_reserved_tags=0 ./hctx0/tags:active_queues=0 ./hctx0/tags:bitmap_tags: ./hctx0/tags:depth=512 ./hctx0/tags:busy=0 ./hctx0/tags:cleared=511 ./hctx0/tags:bits_per_word=64 ./hctx0/tags:map_nr=8 ./hctx0/tags:alloc_hint={116, 124, 315, 478} ./hctx0/tags:wake_batch=8 ./hctx0/tags:wake_index=5 ./hctx0/tags:ws_active=0 ./hctx0/tags:ws={ ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags: {.wait_cnt=8, .wait=inactive}, ./hctx0/tags:} ./hctx0/tags:round_robin=0 ./hctx0/tags:min_shallow_depth=4294967295 ./hctx0/ctx_map:00000000: 00 ./hctx0/flags:alloc_policy=FIFO SHOULD_MERGE|TAG_QUEUE_SHARED ./write_hints:hint0: 0 ./write_hints:hint1: 0 ./write_hints:hint2: 0 ./write_hints:hint3: 0 ./write_hints:hint4: 0 ./state:SAME_COMP|NONROT|IO_STAT|SAME_FORCE|INIT_DONE|WC|FUA|STATS|REGISTERED|NOWAIT ./pm_only:0 ./poll_stat:read (512 Bytes): samples=0 ./poll_stat:write (512 Bytes): samples=0 ./poll_stat:read (1024 Bytes): samples=0 ./poll_stat:write (1024 Bytes): samples=0 ./poll_stat:read (2048 Bytes): samples=0 ./poll_stat:write (2048 Bytes): samples=0 ./poll_stat:read (4096 Bytes): samples=0 ./poll_stat:write (4096 Bytes): samples=0 ./poll_stat:read (8192 Bytes): samples=0 ./poll_stat:write (8192 Bytes): samples=0 ./poll_stat:read (16384 Bytes): samples=0 ./poll_stat:write (16384 Bytes): samples=0 ./poll_stat:read (32768 Bytes): samples=0 ./poll_stat:write (32768 Bytes): samples=0 ./poll_stat:read (65536 Bytes): samples=0 ./poll_stat:write (65536 Bytes): samples=0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel 2021-11-15 13:01 ` Haris Iqbal @ 2021-11-15 14:57 ` Ming Lei 2021-11-17 11:50 ` Haris Iqbal 0 siblings, 1 reply; 7+ messages in thread From: Ming Lei @ 2021-11-15 14:57 UTC (permalink / raw) To: Haris Iqbal Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis, ming.lei On Mon, Nov 15, 2021 at 02:01:32PM +0100, Haris Iqbal wrote: > On Wed, Nov 10, 2021 at 2:39 AM Ming Lei <ming.lei@redhat.com> wrote: > > > > Hello Haris, > > > > On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote: > > > Hi, > > > > > > We are observing an fio hang with the latest v5.10.78 Linux kernel > > > version with RNBD. The setup is as follows, > > > > > > On the server side, 16 nullblk devices. > > > On the client side, map those 16 block devices through RNBD-RTRS. > > > Change the scheduler for those RNBD block devices to mq-deadline. > > > > > > Run fios with the following configuration. > > > > > > [global] > > > description=Emulation of Storage Server Access Pattern > > > bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4 > > > fadvise_hint=0 > > > rw=randrw:2 > > > direct=1 > > > random_distribution=zipf:1.2 > > > time_based=1 > > > runtime=60 > > > ramp_time=1 > > > ioengine=libaio > > > iodepth=128 > > > iodepth_batch_submit=128 > > > iodepth_batch_complete=128 > > > numjobs=1 > > > group_reporting > > > > > > > > > [job1] > > > filename=/dev/rnbd0 > > > [job2] > > > filename=/dev/rnbd1 > > > [job3] > > > filename=/dev/rnbd2 > > > [job4] > > > filename=/dev/rnbd3 > > > [job5] > > > filename=/dev/rnbd4 > > > [job6] > > > filename=/dev/rnbd5 > > > [job7] > > > filename=/dev/rnbd6 > > > [job8] > > > filename=/dev/rnbd7 > > > [job9] > > > filename=/dev/rnbd8 > > > [job10] > > > filename=/dev/rnbd9 > > > [job11] > > > filename=/dev/rnbd10 > > > [job12] > > > filename=/dev/rnbd11 > > > [job13] > > > filename=/dev/rnbd12 > > > [job14] > > > filename=/dev/rnbd13 > > > [job15] > > > 
filename=/dev/rnbd14 > > > [job16] > > > filename=/dev/rnbd15 > > > > > > Some of the fio threads hangs and the fio never finishes. > > > > > > fio fio.ini > > > job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > 
job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > fio-3.12 > > > Starting 16 processes > > > Jobs: 16 (f=12): > > > [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k > > > IOPS][eta 04d:07h:4 > > > Jobs: 15 (f=11): > > > [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 > > > IOPS][eta 01m:01s] > > > Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s] > > > > > > We checked the block devices, and there are requests waiting in their > > > fifo (not on all devices, just few whose corresponding fio threads are > > > hung). > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list > > > 00000000ce398aec {.op=READ, .cmd_flags=, > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > .internal_tag=209} > > > 000000005ec82450 {.op=READ, .cmd_flags=, > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > .internal_tag=210} > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list > > > 000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > .internal_tag=195} > > > 00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > .internal_tag=199} > > > 000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > .internal_tag=217} > > > > Can you post the whole debugfs log for rnbd0? > > > > (cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;) > > Attached the logfile. logfile just shows that there are requests in scheduler queue. > > > > > > > > > > > > Potential points which fixes the hang > > > > > > 1) Using no scheduler (none) on the client side RNBD block devices > > > results in no hang. 
> > > > > > 2) In the fio config, changing the line "iodepth_batch_complete=128" > > > to the following fixes the hang, > > > iodepth_batch_complete_min=1 > > > iodepth_batch_complete_max=128 > > > OR, > > > iodepth_batch_complete=0 > > > > > > 3) We also tracked down the version from which the hang started. The > > > hang started with v5.10.50, and the following commit was one which > > > results in the hang > > > > > > commit 512106ae2355813a5eb84e8dc908628d52856890 > > > Author: Ming Lei <ming.lei@redhat.com> > > > Date: Fri Jun 25 10:02:48 2021 +0800 > > > > > > blk-mq: update hctx->dispatch_busy in case of real scheduler > > > > > > [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ] > > > > > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > > starts to support io batching submission by using hctx->dispatch_busy. > > > > > > However, blk_mq_update_dispatch_busy() isn't changed to update > > > hctx->dispatch_busy > > > in that commit, so fix the issue by updating hctx->dispatch_busy in case > > > of real scheduler. > > > > > > Reported-by: Jan Kara <jack@suse.cz> > > > Reviewed-by: Jan Kara <jack@suse.cz> > > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > > Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com > > > Signed-off-by: Jens Axboe <axboe@kernel.dk> > > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > > index 00d6ed2fe812..a368eb6dc647 100644 > > > --- a/block/blk-mq.c > > > +++ b/block/blk-mq.c > > > @@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct > > > blk_mq_hw_ctx *hctx, bool busy) > > > { > > > unsigned int ewma; > > > > > > - if (hctx->queue->elevator) > > > - return; > > > - > > > ewma = hctx->dispatch_busy; > > > > > > if (!ewma && !busy) > > > > > > We reverted the commit and tested and there is no hang. 
> > >
> > > 4) Lastly, we tested newer version like 5.13, and there is NO hang in
> > > that also. Hence, probably some other change fixed it.
> >
> > Can you observe the issue on v5.10? Maybe there is one pre-patch of commit cb9516be7708
> > ("blk-mq: update hctx->dispatch_busy in case of real scheduler merged")
> > which is missed to 5.10.y.
>
> If you mean v5.10.0, then no, I see no hang there. As I mentioned
> before, there is no hang till v5.10.49.
>
> > And not remember that there is fix for commit cb9516be7708 in mainline.
> >
> > commit cb9516be7708 is merged to v5.14, instead of v5.13, did you test v5.14 or v5.15?
> >
> > BTW, commit cb9516be7708 should just affect performance, not supposed to cause
> > hang.
>
> True. It does look like that from the small code change.

->dispatch_busy only affects the batch size used when dequeuing requests from the scheduler queue; see the following code in __blk_mq_do_dispatch_sched():

	if (hctx->dispatch_busy)
		max_dispatch = 1;
	else
		max_dispatch = hctx->queue->nr_requests;

Also see blk_mq_do_dispatch_sched():

static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
{
	int ret;

	do {
		ret = __blk_mq_do_dispatch_sched(hctx);
	} while (ret == 1);

	return ret;
}

If any request can be dispatched to the driver, blk-mq will continue dispatching until __blk_mq_do_dispatch_sched() returns -EAGAIN or 0. In the -EAGAIN case, blk-mq tries to dispatch again, and runs the queue asynchronously if another -EAGAIN is returned. In the 0 case, blk_mq_dispatch_rq_list() can't make progress: the request is moved to hctx->dispatch, and blk-mq keeps re-running the queue until all requests in hctx->dispatch are dispatched, then resumes dispatching from the scheduler queue.

So the current code does provide forward progress, and I don't see how the patch 'blk-mq: update hctx->dispatch_busy in case of real scheduler' can cause this issue. I also don't see any request in hctx->dispatch in the debugfs log.
>
> I wasn't able to test in v5.14 and v5.15 because we are seeing some
> other errors in those versions, most probably related to the
> rdma-core/rxe driver.

I'd rather see test results on an upstream kernel: v5.14, v5.15, or a
recent v5.15-rc.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel 2021-11-15 14:57 ` Ming Lei @ 2021-11-17 11:50 ` Haris Iqbal 2021-11-17 12:00 ` Ming Lei 0 siblings, 1 reply; 7+ messages in thread From: Haris Iqbal @ 2021-11-17 11:50 UTC (permalink / raw) To: Ming Lei; +Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis On Mon, Nov 15, 2021 at 3:58 PM Ming Lei <ming.lei@redhat.com> wrote: > > On Mon, Nov 15, 2021 at 02:01:32PM +0100, Haris Iqbal wrote: > > On Wed, Nov 10, 2021 at 2:39 AM Ming Lei <ming.lei@redhat.com> wrote: > > > > > > Hello Haris, > > > > > > On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote: > > > > Hi, > > > > > > > > We are observing an fio hang with the latest v5.10.78 Linux kernel > > > > version with RNBD. The setup is as follows, > > > > > > > > On the server side, 16 nullblk devices. > > > > On the client side, map those 16 block devices through RNBD-RTRS. > > > > Change the scheduler for those RNBD block devices to mq-deadline. > > > > > > > > Run fios with the following configuration. 
> > > > > > > > [global] > > > > description=Emulation of Storage Server Access Pattern > > > > bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4 > > > > fadvise_hint=0 > > > > rw=randrw:2 > > > > direct=1 > > > > random_distribution=zipf:1.2 > > > > time_based=1 > > > > runtime=60 > > > > ramp_time=1 > > > > ioengine=libaio > > > > iodepth=128 > > > > iodepth_batch_submit=128 > > > > iodepth_batch_complete=128 > > > > numjobs=1 > > > > group_reporting > > > > > > > > > > > > [job1] > > > > filename=/dev/rnbd0 > > > > [job2] > > > > filename=/dev/rnbd1 > > > > [job3] > > > > filename=/dev/rnbd2 > > > > [job4] > > > > filename=/dev/rnbd3 > > > > [job5] > > > > filename=/dev/rnbd4 > > > > [job6] > > > > filename=/dev/rnbd5 > > > > [job7] > > > > filename=/dev/rnbd6 > > > > [job8] > > > > filename=/dev/rnbd7 > > > > [job9] > > > > filename=/dev/rnbd8 > > > > [job10] > > > > filename=/dev/rnbd9 > > > > [job11] > > > > filename=/dev/rnbd10 > > > > [job12] > > > > filename=/dev/rnbd11 > > > > [job13] > > > > filename=/dev/rnbd12 > > > > [job14] > > > > filename=/dev/rnbd13 > > > > [job15] > > > > filename=/dev/rnbd14 > > > > [job16] > > > > filename=/dev/rnbd15 > > > > > > > > Some of the fio threads hangs and the fio never finishes. 
> > > > > > > > fio fio.ini > > > > job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) 
> > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > fio-3.12 > > > > Starting 16 processes > > > > Jobs: 16 (f=12): > > > > [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k > > > > IOPS][eta 04d:07h:4 > > > > Jobs: 15 (f=11): > > > > [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 > > > > IOPS][eta 01m:01s] > > > > Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s] > > > > > > > > We checked the block devices, and there are requests waiting in their > > > > fifo (not on all devices, just few whose corresponding fio threads are > > > > hung). > > > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list > > > > 00000000ce398aec {.op=READ, .cmd_flags=, > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > .internal_tag=209} > > > > 000000005ec82450 {.op=READ, .cmd_flags=, > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > .internal_tag=210} > > > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list > > > > 000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > .internal_tag=195} > > > > 00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > .internal_tag=199} > > > > 000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > .internal_tag=217} > > > > > > Can you post the whole debugfs log for rnbd0? > > > > > > (cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;) > > > > Attached the logfile. > > logfile just shows that there are requests in scheduler queue. > > > > > > > > > > > > > > > > > > Potential points which fixes the hang > > > > > > > > 1) Using no scheduler (none) on the client side RNBD block devices > > > > results in no hang. 
> > > > > > > > 2) In the fio config, changing the line "iodepth_batch_complete=128" > > > > to the following fixes the hang, > > > > iodepth_batch_complete_min=1 > > > > iodepth_batch_complete_max=128 > > > > OR, > > > > iodepth_batch_complete=0 > > > > > > > > 3) We also tracked down the version from which the hang started. The > > > > hang started with v5.10.50, and the following commit was one which > > > > results in the hang > > > > > > > > commit 512106ae2355813a5eb84e8dc908628d52856890 > > > > Author: Ming Lei <ming.lei@redhat.com> > > > > Date: Fri Jun 25 10:02:48 2021 +0800 > > > > > > > > blk-mq: update hctx->dispatch_busy in case of real scheduler > > > > > > > > [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ] > > > > > > > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > > > starts to support io batching submission by using hctx->dispatch_busy. > > > > > > > > However, blk_mq_update_dispatch_busy() isn't changed to update > > > > hctx->dispatch_busy > > > > in that commit, so fix the issue by updating hctx->dispatch_busy in case > > > > of real scheduler. 
> > > >
> > > > Reported-by: Jan Kara <jack@suse.cz>
> > > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com
> > > > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > > Signed-off-by: Sasha Levin <sashal@kernel.org>
> > > >
> > > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > > index 00d6ed2fe812..a368eb6dc647 100644
> > > > --- a/block/blk-mq.c
> > > > +++ b/block/blk-mq.c
> > > > @@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct
> > > > blk_mq_hw_ctx *hctx, bool busy)
> > > > {
> > > > 	unsigned int ewma;
> > > >
> > > > -	if (hctx->queue->elevator)
> > > > -		return;
> > > > -
> > > > 	ewma = hctx->dispatch_busy;
> > > >
> > > > 	if (!ewma && !busy)
> > > >
> > > > We reverted the commit and tested and there is no hang.
> > > >
> > > > 4) Lastly, we tested newer versions like 5.13, and there is NO hang in
> > > > that also. Hence, probably some other change fixed it.
> > >
> > > Can you observe the issue on v5.10? Maybe there is one pre-patch of commit cb9516be7708
> > > ("blk-mq: update hctx->dispatch_busy in case of real scheduler")
> > > which is missing from 5.10.y.
> >
> > If you mean v5.10.0, then no, I see no hang there. As I mentioned
> > before, there is no hang till v5.10.49.
> >
> > >
> > > And I don't remember any fix for commit cb9516be7708 in mainline.
> > >
> > > commit cb9516be7708 was merged in v5.14, instead of v5.13; did you test v5.14 or v5.15?
> > >
> > > BTW, commit cb9516be7708 should just affect performance; it is not supposed to cause
> > > a hang.
> >
> > True. It does look like that from the small code change.
>
> ->dispatch_busy just affects the batching size for dequeuing requests
> from the scheduler queue; see the following code in
> __blk_mq_do_dispatch_sched():
>
> 	if (hctx->dispatch_busy)
> 		max_dispatch = 1;
> 	else
> 		max_dispatch = hctx->queue->nr_requests;
>
> Also see blk_mq_do_dispatch_sched():
>
> 	static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> 	{
> 		int ret;
>
> 		do {
> 			ret = __blk_mq_do_dispatch_sched(hctx);
> 		} while (ret == 1);
>
> 		return ret;
> 	}
>
> If any request can be dispatched to the driver, blk-mq will continue to
> dispatch until -EAGAIN or 0 is returned from __blk_mq_do_dispatch_sched().
>
> In case of -EAGAIN, blk-mq will try again to dispatch requests, and will
> run the queue asynchronously if another -EAGAIN is returned.
>
> In case of 0 returned, blk_mq_dispatch_rq_list() can't make progress; the
> request will be moved to hctx->dispatch, and blk-mq covers re-running the
> queue until all requests in hctx->dispatch are dispatched, after which it
> goes back to dispatching requests from the scheduler queue.
>
> So the current code provides forward progress, and I don't see how the
> patch 'blk-mq: update hctx->dispatch_busy in case of real scheduler' can
> cause this issue.
>
> Also, I don't see any request in hctx->dispatch in the debugfs log.
>
> >
> > I wasn't able to test in v5.14 and v5.15 because we are seeing some
> > other errors in those versions, most probably related to the
> > rdma-core/rxe driver.
>
> I'd rather see test results on an upstream kernel: v5.14, v5.15, or a
> recent v5.15-rc.

We were finally able to test in 5.15.2, and there is no hang.

>
> Thanks,
> Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel 2021-11-17 11:50 ` Haris Iqbal @ 2021-11-17 12:00 ` Ming Lei 0 siblings, 0 replies; 7+ messages in thread From: Ming Lei @ 2021-11-17 12:00 UTC (permalink / raw) To: Haris Iqbal Cc: linux-block, Jens Axboe, sashal, jack, Jinpu Wang, Danil Kipnis On Wed, Nov 17, 2021 at 12:50:54PM +0100, Haris Iqbal wrote: > On Mon, Nov 15, 2021 at 3:58 PM Ming Lei <ming.lei@redhat.com> wrote: > > > > On Mon, Nov 15, 2021 at 02:01:32PM +0100, Haris Iqbal wrote: > > > On Wed, Nov 10, 2021 at 2:39 AM Ming Lei <ming.lei@redhat.com> wrote: > > > > > > > > Hello Haris, > > > > > > > > On Tue, Nov 09, 2021 at 10:32:32AM +0100, Haris Iqbal wrote: > > > > > Hi, > > > > > > > > > > We are observing an fio hang with the latest v5.10.78 Linux kernel > > > > > version with RNBD. The setup is as follows, > > > > > > > > > > On the server side, 16 nullblk devices. > > > > > On the client side, map those 16 block devices through RNBD-RTRS. > > > > > Change the scheduler for those RNBD block devices to mq-deadline. > > > > > > > > > > Run fios with the following configuration. 
> > > > > > > > > > [global] > > > > > description=Emulation of Storage Server Access Pattern > > > > > bssplit=512/20:1k/16:2k/9:4k/12:8k/19:16k/10:32k/8:64k/4 > > > > > fadvise_hint=0 > > > > > rw=randrw:2 > > > > > direct=1 > > > > > random_distribution=zipf:1.2 > > > > > time_based=1 > > > > > runtime=60 > > > > > ramp_time=1 > > > > > ioengine=libaio > > > > > iodepth=128 > > > > > iodepth_batch_submit=128 > > > > > iodepth_batch_complete=128 > > > > > numjobs=1 > > > > > group_reporting > > > > > > > > > > > > > > > [job1] > > > > > filename=/dev/rnbd0 > > > > > [job2] > > > > > filename=/dev/rnbd1 > > > > > [job3] > > > > > filename=/dev/rnbd2 > > > > > [job4] > > > > > filename=/dev/rnbd3 > > > > > [job5] > > > > > filename=/dev/rnbd4 > > > > > [job6] > > > > > filename=/dev/rnbd5 > > > > > [job7] > > > > > filename=/dev/rnbd6 > > > > > [job8] > > > > > filename=/dev/rnbd7 > > > > > [job9] > > > > > filename=/dev/rnbd8 > > > > > [job10] > > > > > filename=/dev/rnbd9 > > > > > [job11] > > > > > filename=/dev/rnbd10 > > > > > [job12] > > > > > filename=/dev/rnbd11 > > > > > [job13] > > > > > filename=/dev/rnbd12 > > > > > [job14] > > > > > filename=/dev/rnbd13 > > > > > [job15] > > > > > filename=/dev/rnbd14 > > > > > [job16] > > > > > filename=/dev/rnbd15 > > > > > > > > > > Some of the fio threads hangs and the fio never finishes. 
> > > > > > > > > > fio fio.ini > > > > > job1: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job2: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job3: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job4: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job5: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job6: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job7: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job8: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job9: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job10: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job11: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job12: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job13: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job14: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > job15: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > 
job16: (g=0): rw=randrw, bs=(R) 512B-64.0KiB, (W) 512B-64.0KiB, (T) > > > > > 512B-64.0KiB, ioengine=libaio, iodepth=128 > > > > > fio-3.12 > > > > > Starting 16 processes > > > > > Jobs: 16 (f=12): > > > > > [m(3),/(2),m(5),/(1),m(1),/(1),m(3)][0.0%][r=130MiB/s,w=130MiB/s][r=14.7k,w=14.7k > > > > > IOPS][eta 04d:07h:4 > > > > > Jobs: 15 (f=11): > > > > > [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][51.2%][r=7395KiB/s,w=6481KiB/s][r=770,w=766 > > > > > IOPS][eta 01m:01s] > > > > > Jobs: 15 (f=11): [m(3),/(2),m(5),/(1),_(1),/(1),m(3)][52.7%][eta 01m:01s] > > > > > > > > > > We checked the block devices, and there are requests waiting in their > > > > > fifo (not on all devices, just few whose corresponding fio threads are > > > > > hung). > > > > > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/read_fifo_list > > > > > 00000000ce398aec {.op=READ, .cmd_flags=, > > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > > .internal_tag=209} > > > > > 000000005ec82450 {.op=READ, .cmd_flags=, > > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > > .internal_tag=210} > > > > > > > > > > $ cat /sys/kernel/debug/block/rnbd0/sched/write_fifo_list > > > > > 000000000c1557f5 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > > .internal_tag=195} > > > > > 00000000fc6bfd98 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > > .internal_tag=199} > > > > > 000000009ef7c802 {.op=WRITE, .cmd_flags=SYNC|IDLE, > > > > > .rq_flags=SORTED|ELVPRIV|IO_STAT|HASHED, .state=idle, .tag=-1, > > > > > .internal_tag=217} > > > > > > > > Can you post the whole debugfs log for rnbd0? > > > > > > > > (cd /sys/kernel/debug/block/rnbd0 && find . -type f -exec grep -aH . {} \;) > > > > > > Attached the logfile. > > > > logfile just shows that there are requests in scheduler queue. 
> > > > > > > > > > > > > > > > > > > > > > > > Potential points which fixes the hang > > > > > > > > > > 1) Using no scheduler (none) on the client side RNBD block devices > > > > > results in no hang. > > > > > > > > > > 2) In the fio config, changing the line "iodepth_batch_complete=128" > > > > > to the following fixes the hang, > > > > > iodepth_batch_complete_min=1 > > > > > iodepth_batch_complete_max=128 > > > > > OR, > > > > > iodepth_batch_complete=0 > > > > > > > > > > 3) We also tracked down the version from which the hang started. The > > > > > hang started with v5.10.50, and the following commit was one which > > > > > results in the hang > > > > > > > > > > commit 512106ae2355813a5eb84e8dc908628d52856890 > > > > > Author: Ming Lei <ming.lei@redhat.com> > > > > > Date: Fri Jun 25 10:02:48 2021 +0800 > > > > > > > > > > blk-mq: update hctx->dispatch_busy in case of real scheduler > > > > > > > > > > [ Upstream commit cb9516be7708a2a18ec0a19fe3a225b5b3bc92c7 ] > > > > > > > > > > Commit 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io") > > > > > starts to support io batching submission by using hctx->dispatch_busy. > > > > > > > > > > However, blk_mq_update_dispatch_busy() isn't changed to update > > > > > hctx->dispatch_busy > > > > > in that commit, so fix the issue by updating hctx->dispatch_busy in case > > > > > of real scheduler. 
> > > > >
> > > > > Reported-by: Jan Kara <jack@suse.cz>
> > > > > Reviewed-by: Jan Kara <jack@suse.cz>
> > > > > Fixes: 6e6fcbc27e77 ("blk-mq: support batching dispatch in case of io")
> > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > > Link: https://lore.kernel.org/r/20210625020248.1630497-1-ming.lei@redhat.com
> > > > > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > > > Signed-off-by: Sasha Levin <sashal@kernel.org>
> > > > >
> > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > > > > index 00d6ed2fe812..a368eb6dc647 100644
> > > > > --- a/block/blk-mq.c
> > > > > +++ b/block/blk-mq.c
> > > > > @@ -1242,9 +1242,6 @@ static void blk_mq_update_dispatch_busy(struct
> > > > > blk_mq_hw_ctx *hctx, bool busy)
> > > > > {
> > > > > 	unsigned int ewma;
> > > > >
> > > > > -	if (hctx->queue->elevator)
> > > > > -		return;
> > > > > -
> > > > > 	ewma = hctx->dispatch_busy;
> > > > >
> > > > > 	if (!ewma && !busy)
> > > > >
> > > > > We reverted the commit and tested and there is no hang.
> > > > >
> > > > > 4) Lastly, we tested newer versions like 5.13, and there is NO hang in
> > > > > that also. Hence, probably some other change fixed it.
> > > >
> > > > Can you observe the issue on v5.10? Maybe there is one pre-patch of commit cb9516be7708
> > > > ("blk-mq: update hctx->dispatch_busy in case of real scheduler")
> > > > which is missing from 5.10.y.
> > >
> > > If you mean v5.10.0, then no, I see no hang there. As I mentioned
> > > before, there is no hang till v5.10.49.
> > >
> > > >
> > > > And I don't remember any fix for commit cb9516be7708 in mainline.
> > > >
> > > > commit cb9516be7708 was merged in v5.14, instead of v5.13; did you test v5.14 or v5.15?
> > > >
> > > > BTW, commit cb9516be7708 should just affect performance; it is not supposed to cause
> > > > a hang.
> > >
> > > True. It does look like that from the small code change.
> >
> > ->dispatch_busy just affects the batching size for dequeuing requests
> > from the scheduler queue; see the following code in
> > __blk_mq_do_dispatch_sched():
> >
> > 	if (hctx->dispatch_busy)
> > 		max_dispatch = 1;
> > 	else
> > 		max_dispatch = hctx->queue->nr_requests;
> >
> > Also see blk_mq_do_dispatch_sched():
> >
> > 	static int blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
> > 	{
> > 		int ret;
> >
> > 		do {
> > 			ret = __blk_mq_do_dispatch_sched(hctx);
> > 		} while (ret == 1);
> >
> > 		return ret;
> > 	}
> >
> > If any request can be dispatched to the driver, blk-mq will continue to
> > dispatch until -EAGAIN or 0 is returned from __blk_mq_do_dispatch_sched().
> >
> > In case of -EAGAIN, blk-mq will try again to dispatch requests, and will
> > run the queue asynchronously if another -EAGAIN is returned.
> >
> > In case of 0 returned, blk_mq_dispatch_rq_list() can't make progress; the
> > request will be moved to hctx->dispatch, and blk-mq covers re-running the
> > queue until all requests in hctx->dispatch are dispatched, after which it
> > goes back to dispatching requests from the scheduler queue.
> >
> > So the current code provides forward progress, and I don't see how the
> > patch 'blk-mq: update hctx->dispatch_busy in case of real scheduler' can
> > cause this issue.
> >
> > Also, I don't see any request in hctx->dispatch in the debugfs log.
> >
> > >
> > > I wasn't able to test in v5.14 and v5.15 because we are seeing some
> > > other errors in those versions, most probably related to the
> > > rdma-core/rxe driver.
> >
> > I'd rather see test results on an upstream kernel: v5.14, v5.15, or a
> > recent v5.15-rc.
>
> We were finally able to test in 5.15.2, and there is no hang.

OK, that is expected. Maybe some fixes aren't backported to the stable
tree. If RNBD can work on mainline, git-bisect may figure out which
patches are missing.

thanks,
Ming

^ permalink raw reply	[flat|nested] 7+ messages in thread
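The bisect Ming suggests is slightly unusual in that it hunts the commit that *fixed* the hang rather than the one that introduced it; git supports this by renaming the bisect terms. A minimal sketch on a throwaway toy repository — the eight synthetic commits and the "fix lands at commit 5" layout are invented for the demonstration; against the real trees the endpoints would be something like a hanging 5.10.y build and v5.15.2, with the fio job file from this thread as the reproducer:

```shell
#!/bin/sh
# Toy demonstration of an inverted-terms bisect: find the first commit
# where a simulated hang is gone (here, commits with state >= 5).
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email bisect@example.com
git config user.name bisect
i=1
while [ "$i" -le 8 ]; do
    echo "$i" > state            # pretend the hang disappears at commit 5
    git add state
    git commit -qm "commit $i"
    i=$((i+1))
done
# Rename the terms: we are searching for a fix, not a regression.
git bisect start --term-old=hangs --term-new=fixed
git bisect fixed HEAD            # newest tree: fio completes
git bisect hangs HEAD~7          # oldest tree: fio stalls
# For 'bisect run', exit 0 = still hangs (term-old), non-zero = fixed
# (term-new). In real life this script would boot the kernel and run fio.
git bisect run sh -c 'test "$(cat state)" -lt 5' >/dev/null
git log -1 --format='first fixed commit: %s' refs/bisect/fixed
```

`git bisect run` then drives the search automatically; the `--term-old`/`--term-new` renaming keeps "good/bad" from reading backwards when the *older* kernels are the broken ones.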
end of thread, other threads:[~2021-11-17 12:00 UTC | newest]

Thread overview: 7+ messages
2021-11-09  9:32 Observing an fio hang with RNBD device in the latest v5.10.78 Linux kernel Haris Iqbal
2021-11-10  1:38 ` Ming Lei
2021-11-11  8:38   ` Haris Iqbal
2021-11-15 13:01     ` Haris Iqbal
2021-11-15 14:57       ` Ming Lei
2021-11-17 11:50         ` Haris Iqbal
2021-11-17 12:00           ` Ming Lei