* stalling IO regression in linux 5.12

From: Chris Murphy @ 2022-08-10 16:35 UTC (permalink / raw)
To: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel; +Cc: Josef Bacik

CPU: Intel E5-2680 v3
RAM: 128 G
02:00.0 RAID bus controller [0104]: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] [1000:005d] (rev 02), using megaraid_sas driver
8 Disks: TOSHIBA AL13SEB600

The problem exhibits as increasing load, increasing IO pressure (PSI), while actual IO goes to zero. It never happens on the 5.11 kernel series, always happens after 5.12-rc1, and persists through 5.18.0. There's a new mix of behaviors with 5.19; I suspect the mm improvements in that series might be masking the problem.

The workload involves openqa, which spins up 30 qemu-kvm instances and does a bunch of tests, generating quite a lot of writes: qcow2 files, video in the form of many screenshots, and various log files, for each VM. These VMs are each in their own cgroup. As the problem begins, I see increasing IO pressure and decreasing IO for each qemu instance's cgroup, and for the cgroups of httpd, journald, auditd, and postgresql. IO pressure goes to nearly 99% and IO is literally 0.

Left unattended, the problem eventually results in a completely unresponsive system, with no kernel messages.

It reproduces in the following configurations; for the first two I provide links to full dmesg with sysrq+w output:

btrfs raid10 (native) on plain partitions [1]
btrfs single/dup on dmcrypt on mdadm raid10 and parity raid [2]
XFS on dmcrypt on mdadm raid10 or parity raid

I've started a bisect, but, for a reason I haven't figured out, I've started getting compiled kernels that don't boot the hardware. The failure is early enough that the UUID for the root file system isn't found, with not much to go on as to why. [3] I have tested the first and last skipped commits in the bisect log below; they successfully boot a VM but not the hardware.

Anyway, I'm kinda stuck at this point trying to narrow it down further. Any suggestions? Thanks.

[1] btrfs raid10, plain partitions
https://drive.google.com/file/d/1-oT3MX-hHYtQqI0F3SpgPjCIDXXTysLU/view?usp=sharing

[2] btrfs single/dup, dmcrypt, mdadm raid10
https://drive.google.com/file/d/1m_T3YYaEjBKUROz6dHt5_h92ZVRji9FM/view?usp=sharing

[3]
$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [c03c21ba6f4e95e406a1a7b4c34ef334b977c194] Merge tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
git bisect bad c03c21ba6f4e95e406a1a7b4c34ef334b977c194
# status: waiting for good commit(s), bad commit known
# good: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11
git bisect good f40ddce88593482919761f74910f42f4b84c004b
# bad: [df24212a493afda0d4de42176bea10d45825e9a0] Merge tag 's390-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect bad df24212a493afda0d4de42176bea10d45825e9a0
# good: [82851fce6107d5a3e66d95aee2ae68860a732703] Merge tag 'arm-dt-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect good 82851fce6107d5a3e66d95aee2ae68860a732703
# good: [99f1a5872b706094ece117368170a92c66b2e242] Merge tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
git bisect good 99f1a5872b706094ece117368170a92c66b2e242
# bad: [9eef02334505411667a7b51a8f349f8c6c4f3b66] Merge tag 'locking-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 9eef02334505411667a7b51a8f349f8c6c4f3b66
# bad: [9820b4dca0f9c6b7ab8b4307286cdace171b724d] Merge tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-block
git bisect bad 9820b4dca0f9c6b7ab8b4307286cdace171b724d
# good: [bd018bbaa58640da786d4289563e71c5ef3938c7] Merge tag 'for-5.12/libata-2021-02-17' of git://git.kernel.dk/linux-block
git bisect good bd018bbaa58640da786d4289563e71c5ef3938c7
# skip: [203c018079e13510f913fd0fd426370f4de0fd05] Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.12/drivers
git bisect skip 203c018079e13510f913fd0fd426370f4de0fd05
# skip: [49d1ec8573f74ff1e23df1d5092211de46baa236] block: manage bio slab cache by xarray
git bisect skip 49d1ec8573f74ff1e23df1d5092211de46baa236
# bad: [73d90386b559d6f4c3c5db5e6bb1b68aae8fd3e7] nvme: cleanup zone information initialization
git bisect bad 73d90386b559d6f4c3c5db5e6bb1b68aae8fd3e7
# skip: [71217df39dc67a0aeed83352b0d712b7892036a2] block, bfq: make waker-queue detection more robust
git bisect skip 71217df39dc67a0aeed83352b0d712b7892036a2
# bad: [8358c28a5d44bf0223a55a2334086c3707bb4185] block: fix memory leak of bvec
git bisect bad 8358c28a5d44bf0223a55a2334086c3707bb4185
# skip: [3a905c37c3510ea6d7cfcdfd0f272ba731286560] block: skip bio_check_eod for partition-remapped bios
git bisect skip 3a905c37c3510ea6d7cfcdfd0f272ba731286560
# skip: [3c337690d2ebb7a01fa13bfa59ce4911f358df42] block, bfq: avoid spurious switches to soft_rt of interactive queues
git bisect skip 3c337690d2ebb7a01fa13bfa59ce4911f358df42
# skip: [3e1a88ec96259282b9a8b45c3f1fda7a3ff4f6ea] bio: add a helper calculating nr segments to alloc
git bisect skip 3e1a88ec96259282b9a8b45c3f1fda7a3ff4f6ea
# skip: [4eb1d689045552eb966ebf25efbc3ce648797d96] blk-crypto: use bio_kmalloc in blk_crypto_clone_bio
git bisect skip 4eb1d689045552eb966ebf25efbc3ce648797d96

--
Chris Murphy

^ permalink raw reply [flat|nested] 58+ messages in thread
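[Editorial note: the per-cgroup IO pressure readings described in the report come from cgroup2 PSI files such as io.pressure. A small helper for pulling out the "full" avg10 figure is sketched below; the machine.slice path in the usage comment is illustrative, not from the report.]

```shell
# Sketch: extract the "full" avg10 value from a cgroup2 io.pressure file.
# PSI format (one line each):
#   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
#   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
psi_full_avg10() {
  awk '/^full/ { for (i = 2; i <= NF; i++) if ($i ~ /^avg10=/) { sub("avg10=", "", $i); print $i } }' "$1"
}

# Usage on a cgroup2 system (path illustrative):
#   psi_full_avg10 /sys/fs/cgroup/machine.slice/io.pressure
```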
* Re: stalling IO regression in linux 5.12

From: Josef Bacik @ 2022-08-10 17:48 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022 at 12:35:34PM -0400, Chris Murphy wrote:
> CPU: Intel E5-2680 v3
> RAM: 128 G
> 02:00.0 RAID bus controller [0104]: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] [1000:005d] (rev 02), using megaraid_sas driver
> 8 Disks: TOSHIBA AL13SEB600
>
> The problem exhibits as increasing load, increasing IO pressure (PSI), and actual IO goes to zero. It never happens on kernel 5.11 series, and always happens after 5.12-rc1 and persists through 5.18.0. There's a new mix of behaviors with 5.19, I suspect the mm improvements in this series might be masking the problem.
>
> The workload involves openqa, which spins up 30 qemu-kvm instances, and does a bunch of tests, generating quite a lot of writes: qcow2 files, and video in the form of many screenshots, and various log files, for each VM. These VMs are each in their own cgroup. As the problem begins, I see increasing IO pressure, and decreasing IO, for each qemu instance's cgroup, and the cgroups for httpd, journald, auditd, and postgresql. IO pressure goes to nearly ~99% and IO is literally 0.
>
> The problem left unattended to progress will eventually result in a completely unresponsive system, with no kernel messages. It reproduces in the following configurations, the first two I provide links to full dmesg with sysrq+w:
>
> btrfs raid10 (native) on plain partitions [1]
> btrfs single/dup on dmcrypt on mdadm raid 10 and parity raid [2]
> XFS on dmcrypt on mdadm raid10 or parity raid
>
> I've started a bisect, but for some reason I haven't figured out I've started getting compiled kernels that don't boot the hardware. The failure is very early on such that the UUID for the root file system isn't found, but not much to go on as to why.[3] I have tested the first and last skipped commits in the bisect log below, they successfully boot a VM but not the hardware.
>
> Anyway, I'm kinda stuck at this point trying to narrow it down further. Any suggestions? Thanks.
>

I looked at the traces: btrfs is stuck waiting on IO and blk tags, which means we've got a lot of outstanding requests and are waiting for them to finish so we can allocate more requests.

Additionally I'm seeing a bunch of the blkg async submit things, which are used when we have the block cgroup stuff turned on and compression enabled, so we punt any compressed bios to a per-cgroup async thread to submit the IOs in the appropriate block cgroup context.

This could mean we're just being overly mean and generating too many IOs, but since the IO goes to 0 I'm more inclined to believe there's a screw-up in whatever IO cgroup controller you're using.

To help narrow this down, can you disable any IO controller you've got enabled and see if you can reproduce? If you can, sysrq+w is super helpful as it'll point us in the next direction to look. Thanks,

Josef
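[Editorial note: the sysrq+w dump Josef asks for can be triggered from a shell rather than a console keyboard. A minimal sketch follows; the file paths are parameters only so the helper can be exercised against stand-in files, and the real writes require root.]

```shell
# Sketch: trigger a sysrq+w (dump blocked tasks) via /proc.
# 'w' dumps tasks in uninterruptible (D) state to the kernel log,
# which is what shows the stalled IO stacks in this thread.
sysrq_dump_blocked() {
  local enable_file=${1:-/proc/sys/kernel/sysrq}
  local trigger_file=${2:-/proc/sysrq-trigger}
  echo 1 > "$enable_file"    # enable all sysrq functions
  echo w > "$trigger_file"   # request the blocked-task dump
}

# Usage (as root): sysrq_dump_blocked; then read the backtraces with dmesg.
```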
* Re: stalling IO regression in linux 5.12

From: Chris Murphy @ 2022-08-10 18:33 UTC (permalink / raw)
To: Josef Bacik; +Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022, at 1:48 PM, Josef Bacik wrote:

> To help narrow this down can you disable any IO controller you've got enabled
> and see if you can reproduce? If you can sysrq+w is super helpful as it'll
> point us in the next direction to look. Thanks,

I'm not following, sorry. I can boot with systemd.unified_cgroup_hierarchy=0 to make sure it's all off, but we're not using any IO cgroup controllers specifically, as far as I'm aware.

--
Chris Murphy
* Re: stalling IO regression in linux 5.12

From: Chris Murphy @ 2022-08-10 18:42 UTC (permalink / raw)
To: Josef Bacik; +Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022, at 2:33 PM, Chris Murphy wrote:
> On Wed, Aug 10, 2022, at 1:48 PM, Josef Bacik wrote:
>
>> To help narrow this down can you disable any IO controller you've got enabled
>> and see if you can reproduce? If you can sysrq+w is super helpful as it'll
>> point us in the next direction to look. Thanks,
>
> I'm not following, sorry. I can boot with
> systemd.unified_cgroup_hierarchy=0 to make sure it's all off, but we're
> not using an IO cgroup controllers specifically as far as I'm aware.

OK yeah that won't work because the workload requires cgroup2 or it won't run.

--
Chris Murphy
* Re: stalling IO regression in linux 5.12

From: Josef Bacik @ 2022-08-10 19:31 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022 at 02:42:40PM -0400, Chris Murphy wrote:
> On Wed, Aug 10, 2022, at 2:33 PM, Chris Murphy wrote:
>> On Wed, Aug 10, 2022, at 1:48 PM, Josef Bacik wrote:
>>
>>> To help narrow this down can you disable any IO controller you've got enabled
>>> and see if you can reproduce? If you can sysrq+w is super helpful as it'll
>>> point us in the next direction to look. Thanks,
>>
>> I'm not following, sorry. I can boot with
>> systemd.unified_cgroup_hierarchy=0 to make sure it's all off, but we're
>> not using an IO cgroup controllers specifically as far as I'm aware.
>
> OK yeah that won't work because the workload requires cgroup2 or it won't run.

Oh no, I don't want cgroups completely off, just the io controller disabled. Figure out which cgroup your thing is being run in, and then

  echo "-io" > <parent dir>/cgroup.subtree_control

If you cat /sys/fs/cgroup/whatever/cgroup/cgroup.controllers and you see "io" in there, keep doing the above in the next highest parent directory until io is no longer in there. Thanks,

Josef
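[Editorial note: Josef's walk up the hierarchy can be sketched as a small helper. This is an illustration, not his exact procedure: the cgroup path in the usage line is hypothetical, and the hierarchy root is a parameter so the loop can be tested against a fake directory tree.]

```shell
# Sketch: starting from the workload's cgroup, walk toward the cgroup2
# root; wherever "io" shows up in cgroup.controllers, write "-io" into
# the parent's cgroup.subtree_control to stop delegating the controller.
disable_io_controller() {
  local cg=$1 root=$2
  while [ "$cg" != "$root" ]; do
    local parent=${cg%/*}
    if grep -qw io "$cg/cgroup.controllers" 2>/dev/null; then
      echo "-io" > "$parent/cgroup.subtree_control"
    fi
    cg=$parent
  done
}

# Usage (as root; the machine.slice path is illustrative):
#   disable_io_controller /sys/fs/cgroup/machine.slice/my-vm.scope /sys/fs/cgroup
```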
* Re: stalling IO regression in linux 5.12

From: Chris Murphy @ 2022-08-10 19:34 UTC (permalink / raw)
To: Josef Bacik; +Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022, at 2:42 PM, Chris Murphy wrote:
> On Wed, Aug 10, 2022, at 2:33 PM, Chris Murphy wrote:
>> On Wed, Aug 10, 2022, at 1:48 PM, Josef Bacik wrote:
>>
>>> To help narrow this down can you disable any IO controller you've got enabled
>>> and see if you can reproduce? If you can sysrq+w is super helpful as it'll
>>> point us in the next direction to look. Thanks,
>>
>> I'm not following, sorry. I can boot with
>> systemd.unified_cgroup_hierarchy=0 to make sure it's all off, but we're
>> not using an IO cgroup controllers specifically as far as I'm aware.
>
> OK yeah that won't work because the workload requires cgroup2 or it won't run.

Booted with cgroup_disable=io, and confirmed cat /sys/fs/cgroup/cgroup.controllers does not list io. I'll rerun the workload now. Sometimes reproduces fast, other times a couple hours.

--
Chris Murphy
* Re: stalling IO regression since linux 5.12, through 5.18

From: Chris Murphy @ 2022-08-12 16:05 UTC (permalink / raw)
To: Josef Bacik, paolo.valente
Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Wed, Aug 10, 2022, at 3:34 PM, Chris Murphy wrote:
> Booted with cgroup_disable=io, and confirmed cat
> /sys/fs/cgroup/cgroup.controllers does not list io.

The problem still reproduces with the cgroup IO controller disabled.

On a whim, I decided to switch the IO scheduler from Fedora's default for rotating drives, bfq, to mq-deadline. The problem does not reproduce for 15+ hours, which is not 100% conclusive but probably 99% conclusive. I then switched back to bfq on all eight drives, live, while running the workload, and within 10 minutes the system cratered: all new commands just hang. Load average goes to triple digits, IO wait keeps increasing, IO pressure for the workload tasks goes to 100%, and IO completely stalls to zero. I was able to switch only two of the drive queues back to mq-deadline before losing responsiveness in that shell, and had to issue sysrq+b.

Before that I was able to extract sysrq+w and sysrq+t output:
https://drive.google.com/file/d/16hdQjyBnuzzQIhiQT6fQdE0nkRQJj7EI/view?usp=sharing

I can't tell if this is a bfq bug, or if there's some negative interaction between bfq and scsi or megaraid_sas. Obviously it's rare, because otherwise people would have been falling over this much sooner. But at this point there's strong correlation that it's bfq related and is a kernel regression that's been present since 5.12.0 through 5.18.0, and I suspect also 5.19.0, where it's being partly masked by other improvements.

--
Chris Murphy
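[Editorial note: the live scheduler switch described above is done through sysfs. A minimal sketch follows; the SYSFS_BLOCK override and the sda..sdh device names are assumptions for illustration, and writes to the real sysfs need root.]

```shell
# Sketch: switch the IO scheduler for a list of block devices by writing
# the scheduler name into each device's queue/scheduler file.
set_scheduler() {
  local base=${SYSFS_BLOCK:-/sys/block}   # overridable so the sketch is testable
  local sched=$1; shift
  local dev
  for dev in "$@"; do
    echo "$sched" > "$base/$dev/queue/scheduler"
  done
}

# Usage (as root; sda..sdh would match the eight disks in this report):
#   set_scheduler mq-deadline sda sdb sdc sdd sde sdf sdg sdh
#   cat /sys/block/sda/queue/scheduler   # active choice is shown in [brackets]
```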
* Re: stalling IO regression since linux 5.12, through 5.18

From: Josef Bacik @ 2022-08-12 17:59 UTC (permalink / raw)
To: Chris Murphy
Cc: Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel

On Fri, Aug 12, 2022 at 12:05 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Wed, Aug 10, 2022, at 3:34 PM, Chris Murphy wrote:
> > Booted with cgroup_disable=io, and confirmed cat
> > /sys/fs/cgroup/cgroup.controllers does not list io.
>
> The problem still reproduces with the cgroup IO controller disabled.
>
> On a whim, I decided to switch the IO scheduler from Fedora's default bfq for rotating drives to mq-deadline. The problem does not reproduce for 15+ hours, which is not 100% conclusive but probably 99% conclusive. I then switched live while running the workload to bfq on all eight drives, and within 10 minutes the system cratered, all new commands just hang. Load average goes to triple digits, i/o wait increasing, i/o pressure for the workload tasks to 100%, and IO completely stalls to zero. I was able to switch only two of the drive queues back to mq-deadline and then lost responsiveness in that shell and had to issue sysrq+b...
>
> Before that I was able to extract sysrq+w and sysrq+t.
> https://drive.google.com/file/d/16hdQjyBnuzzQIhiQT6fQdE0nkRQJj7EI/view?usp=sharing
>
> I can't tell if this is a bfq bug, or if there's some negative interaction between bfq and scsi or megaraid_sas. Obviously it's rare because otherwise people would have been falling over this much sooner. But at this point there's strong correlation that it's bfq related and is a kernel regression that's been around since 5.12.0 through 5.18.0, and I suspect also 5.19.0 but it's being partly masked by other improvements.

This matches observations we've had internally (inside Facebook) as well as my continuous-integration performance testing. It should probably be looked into by the BFQ guys, as it was working previously. Thanks,

Josef
* Re: stalling IO regression since linux 5.12, through 5.18

From: Jens Axboe @ 2022-08-12 18:02 UTC (permalink / raw)
To: Josef Bacik, Chris Murphy
Cc: Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Jan Kara

On 8/12/22 11:59 AM, Josef Bacik wrote:
> On Fri, Aug 12, 2022 at 12:05 PM Chris Murphy <lists@colorremedies.com> wrote:
>>
>> On Wed, Aug 10, 2022, at 3:34 PM, Chris Murphy wrote:
>>> Booted with cgroup_disable=io, and confirmed cat
>>> /sys/fs/cgroup/cgroup.controllers does not list io.
>>
>> The problem still reproduces with the cgroup IO controller disabled.
>>
>> On a whim, I decided to switch the IO scheduler from Fedora's default bfq for rotating drives to mq-deadline. The problem does not reproduce for 15+ hours, which is not 100% conclusive but probably 99% conclusive. I then switched live while running the workload to bfq on all eight drives, and within 10 minutes the system cratered, all new commands just hang. Load average goes to triple digits, i/o wait increasing, i/o pressure for the workload tasks to 100%, and IO completely stalls to zero. I was able to switch only two of the drive queues back to mq-deadline and then lost responsivness in that shell and had to issue sysrq+b...
>>
>> Before that I was able to extract sysrq+w and sysrq+t.
>> https://drive.google.com/file/d/16hdQjyBnuzzQIhiQT6fQdE0nkRQJj7EI/view?usp=sharing
>>
>> I can't tell if this is a bfq bug, or if there's some negative interaction between bfq and scsi or megaraid_sas. Obviously it's rare because otherwise people would have been falling over this much sooner. But at this point there's strong correlation that it's bfq related and is a kernel regression that's been around since 5.12.0 through 5.18.0, and I suspect also 5.19.0 but it's being partly masked by other improvements.
>
> This matches observations we've had internally (inside Facebook) as
> well as my continual integration performance testing. It should
> probably be looked into by the BFQ guys as it was working previously.
> Thanks,

5.12 has a few BFQ changes:

Jan Kara:
      bfq: Avoid false bfq queue merging
      bfq: Use 'ttime' local variable
      bfq: Use only idle IO periods for think time calculations

Jia Cheng Hu
      block, bfq: set next_rq to waker_bfqq->next_rq in waker injection

Paolo Valente
      block, bfq: use half slice_idle as a threshold to check short ttime
      block, bfq: increase time window for waker detection
      block, bfq: do not raise non-default weights
      block, bfq: avoid spurious switches to soft_rt of interactive queues
      block, bfq: do not expire a queue when it is the only busy one
      block, bfq: replace mechanism for evaluating I/O intensity
      block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
      block, bfq: fix switch back from soft-rt weitgh-raising
      block, bfq: save also weight-raised service on queue merging
      block, bfq: save also injection state on queue merging
      block, bfq: make waker-queue detection more robust

Might be worth trying to revert those from 5.12 to see if they are causing the issue? Jan, Paolo - does this ring any bells?

--
Jens Axboe
* Re: stalling IO regression since linux 5.12, through 5.18

From: Chris Murphy @ 2022-08-14 20:28 UTC (permalink / raw)
To: Jens Axboe, Josef Bacik
Cc: Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Jan Kara

On Fri, Aug 12, 2022, at 2:02 PM, Jens Axboe wrote:
> On 8/12/22 11:59 AM, Josef Bacik wrote:
>> On Fri, Aug 12, 2022 at 12:05 PM Chris Murphy <lists@colorremedies.com> wrote:
>>>
>>> On Wed, Aug 10, 2022, at 3:34 PM, Chris Murphy wrote:
>>>> Booted with cgroup_disable=io, and confirmed cat
>>>> /sys/fs/cgroup/cgroup.controllers does not list io.
>>>
>>> The problem still reproduces with the cgroup IO controller disabled.
>>>
>>> On a whim, I decided to switch the IO scheduler from Fedora's default bfq for rotating drives to mq-deadline. The problem does not reproduce for 15+ hours, which is not 100% conclusive but probably 99% conclusive. I then switched live while running the workload to bfq on all eight drives, and within 10 minutes the system cratered, all new commands just hang. Load average goes to triple digits, i/o wait increasing, i/o pressure for the workload tasks to 100%, and IO completely stalls to zero. I was able to switch only two of the drive queues back to mq-deadline and then lost responsivness in that shell and had to issue sysrq+b...
>>>
>>> Before that I was able to extract sysrq+w and sysrq+t.
>>> https://drive.google.com/file/d/16hdQjyBnuzzQIhiQT6fQdE0nkRQJj7EI/view?usp=sharing
>>>
>>> I can't tell if this is a bfq bug, or if there's some negative interaction between bfq and scsi or megaraid_sas. Obviously it's rare because otherwise people would have been falling over this much sooner. But at this point there's strong correlation that it's bfq related and is a kernel regression that's been around since 5.12.0 through 5.18.0, and I suspect also 5.19.0 but it's being partly masked by other improvements.
>>
>> This matches observations we've had internally (inside Facebook) as
>> well as my continual integration performance testing. It should
>> probably be looked into by the BFQ guys as it was working previously.
>> Thanks,
>
> 5.12 has a few BFQ changes:
>
> Jan Kara:
>       bfq: Avoid false bfq queue merging
>       bfq: Use 'ttime' local variable
>       bfq: Use only idle IO periods for think time calculations
>
> Jia Cheng Hu
>       block, bfq: set next_rq to waker_bfqq->next_rq in waker injection
>
> Paolo Valente
>       block, bfq: use half slice_idle as a threshold to check short ttime
>       block, bfq: increase time window for waker detection
>       block, bfq: do not raise non-default weights
>       block, bfq: avoid spurious switches to soft_rt of interactive queues
>       block, bfq: do not expire a queue when it is the only busy one
>       block, bfq: replace mechanism for evaluating I/O intensity
>       block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
>       block, bfq: fix switch back from soft-rt weitgh-raising
>       block, bfq: save also weight-raised service on queue merging
>       block, bfq: save also injection state on queue merging
>       block, bfq: make waker-queue detection more robust
>
> Might be worth trying to revert those from 5.12 to see if they are
> causing the issue? Jan, Paolo - does this ring any bells?

git log --oneline --no-merges v5.11..c03c21ba6f4e > bisect.txt

I tried checking out a33df75c6328, which is right before the first bfq commit, but that kernel won't boot the hardware.

Next I checked out v5.12, then reverted these commits in order (that they were found in the bisect.txt file):

7684fbde4516 bfq: Use only idle IO periods for think time calculations
28c6def00919 bfq: Use 'ttime' local variable
41e76c85660c bfq: Avoid false bfq queue merging
>>>a5bf0a92e1b8 bfq: bfq_check_waker() should be static
71217df39dc6 block, bfq: make waker-queue detection more robust
5a5436b98d5c block, bfq: save also injection state on queue merging
e673914d52f9 block, bfq: save also weight-raised service on queue merging
d1f600fa4732 block, bfq: fix switch back from soft-rt weitgh-raising
7f1995c27b19 block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
eb2fd80f9d2c block, bfq: replace mechanism for evaluating I/O intensity
>>>1a23e06cdab2 bfq: don't duplicate code for different paths
2391d13ed484 block, bfq: do not expire a queue when it is the only busy one
3c337690d2eb block, bfq: avoid spurious switches to soft_rt of interactive queues
91b896f65d32 block, bfq: do not raise non-default weights
ab1fb47e33dc block, bfq: increase time window for waker detection
d4fc3640ff36 block, bfq: set next_rq to waker_bfqq->next_rq in waker injection
b5f74ecacc31 block, bfq: use half slice_idle as a threshold to check short ttime

The two commits prefixed by >>> above were not previously mentioned by Jens, but I reverted them anyway because they showed up in the git log command.

OK so, within 10 minutes the problem does happen still. This is block/bfq-iosched.c resulting from the above reverts, in case anyone wants to double check what I did:
https://drive.google.com/file/d/1ykU7MpmylJuXVobODWiiaLJk-XOiAjSt/view?usp=sharing

--
Chris Murphy
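[Editorial note: the revert sequence above can be driven by a small helper. This is a sketch: it stops at the first commit that does not revert cleanly, and the v5.12/SHA arguments in the usage comment are simply the ones listed in this message.]

```shell
# Sketch: check out a base ref and revert a list of commits in the given
# order, stopping at the first revert that fails (e.g. on a conflict).
revert_in_order() {
  local base=$1; shift
  git checkout "$base" || return 1
  local sha
  for sha in "$@"; do
    git revert --no-edit "$sha" || return 1
  done
}

# Usage in a kernel tree, following the list above:
#   revert_in_order v5.12 7684fbde4516 28c6def00919 41e76c85660c \
#       a5bf0a92e1b8 71217df39dc6 5a5436b98d5c e673914d52f9 d1f600fa4732 \
#       7f1995c27b19 eb2fd80f9d2c 1a23e06cdab2 2391d13ed484 3c337690d2eb \
#       91b896f65d32 ab1fb47e33dc d4fc3640ff36 b5f74ecacc31
```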
* Re: stalling IO regression since linux 5.12, through 5.18

From: Chris Murphy @ 2022-08-16 14:22 UTC (permalink / raw)
To: Jens Axboe, Jan Kara, Paolo Valente
Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik

On Sun, Aug 14, 2022, at 4:28 PM, Chris Murphy wrote:
> On Fri, Aug 12, 2022, at 2:02 PM, Jens Axboe wrote:
>> Might be worth trying to revert those from 5.12 to see if they are
>> causing the issue? Jan, Paolo - does this ring any bells?
>
> git log --oneline --no-merges v5.11..c03c21ba6f4e > bisect.txt
>
> I tried checking out a33df75c6328, which is right before the first bfq
> commit, but that kernel won't boot the hardware.
>
> Next I checked out v5.12, then reverted these commits in order (that
> they were found in the bisect.txt file):
>
> 7684fbde4516 bfq: Use only idle IO periods for think time calculations
> 28c6def00919 bfq: Use 'ttime' local variable
> 41e76c85660c bfq: Avoid false bfq queue merging
>>>>a5bf0a92e1b8 bfq: bfq_check_waker() should be static
> 71217df39dc6 block, bfq: make waker-queue detection more robust
> 5a5436b98d5c block, bfq: save also injection state on queue merging
> e673914d52f9 block, bfq: save also weight-raised service on queue merging
> d1f600fa4732 block, bfq: fix switch back from soft-rt weitgh-raising
> 7f1995c27b19 block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
> eb2fd80f9d2c block, bfq: replace mechanism for evaluating I/O intensity
>>>>1a23e06cdab2 bfq: don't duplicate code for different paths
> 2391d13ed484 block, bfq: do not expire a queue when it is the only busy
> one
> 3c337690d2eb block, bfq: avoid spurious switches to soft_rt of
> interactive queues
> 91b896f65d32 block, bfq: do not raise non-default weights
> ab1fb47e33dc block, bfq: increase time window for waker detection
> d4fc3640ff36 block, bfq: set next_rq to waker_bfqq->next_rq in waker
> injection
> b5f74ecacc31 block, bfq: use half slice_idle as a threshold to check
> short ttime
>
> The two commits prefixed by >>> above were not previously mentioned by
> Jens, but I reverted them anyway because they showed up in the git log
> command.
>
> OK so, within 10 minutes the problem does happen still. This is
> block/bfq-iosched.c resulting from the above reverts, in case anyone
> wants to double check what I did:
> https://drive.google.com/file/d/1ykU7MpmylJuXVobODWiiaLJk-XOiAjSt/view?usp=sharing

Any suggestions for further testing? I could try going down further in the bisect.txt list. The problem is that if the hardware falls over on an unbootable kernel, I have to bug someone with LOM access, and that's a limited resource.

--
Chris Murphy
* Re: stalling IO regression since linux 5.12, through 5.18

From: Nikolay Borisov @ 2022-08-16 15:25 UTC (permalink / raw)
To: Chris Murphy, Jens Axboe, Jan Kara, Paolo Valente
Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik

On 16.08.22 at 17:22, Chris Murphy wrote:
>
> On Sun, Aug 14, 2022, at 4:28 PM, Chris Murphy wrote:
>> On Fri, Aug 12, 2022, at 2:02 PM, Jens Axboe wrote:
>>> Might be worth trying to revert those from 5.12 to see if they are
>>> causing the issue? Jan, Paolo - does this ring any bells?
>>
>> git log --oneline --no-merges v5.11..c03c21ba6f4e > bisect.txt
>>
>> I tried checking out a33df75c6328, which is right before the first bfq
>> commit, but that kernel won't boot the hardware.
>>
>> Next I checked out v5.12, then reverted these commits in order (that
>> they were found in the bisect.txt file):
>>
>> 7684fbde4516 bfq: Use only idle IO periods for think time calculations
>> 28c6def00919 bfq: Use 'ttime' local variable
>> 41e76c85660c bfq: Avoid false bfq queue merging
>>>>> a5bf0a92e1b8 bfq: bfq_check_waker() should be static
>> 71217df39dc6 block, bfq: make waker-queue detection more robust
>> 5a5436b98d5c block, bfq: save also injection state on queue merging
>> e673914d52f9 block, bfq: save also weight-raised service on queue merging
>> d1f600fa4732 block, bfq: fix switch back from soft-rt weitgh-raising
>> 7f1995c27b19 block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
>> eb2fd80f9d2c block, bfq: replace mechanism for evaluating I/O intensity
>>>>> 1a23e06cdab2 bfq: don't duplicate code for different paths
>> 2391d13ed484 block, bfq: do not expire a queue when it is the only busy
>> one
>> 3c337690d2eb block, bfq: avoid spurious switches to soft_rt of
>> interactive queues
>> 91b896f65d32 block, bfq: do not raise non-default weights
>> ab1fb47e33dc block, bfq: increase time window for waker detection
>> d4fc3640ff36 block, bfq: set next_rq to waker_bfqq->next_rq in waker
>> injection
>> b5f74ecacc31 block, bfq: use half slice_idle as a threshold to check
>> short ttime
>>
>> The two commits prefixed by >>> above were not previously mentioned by
>> Jens, but I reverted them anyway because they showed up in the git log
>> command.
>>
>> OK so, within 10 minutes the problem does happen still. This is
>> block/bfq-iosched.c resulting from the above reverts, in case anyone
>> wants to double check what I did:
>> https://drive.google.com/file/d/1ykU7MpmylJuXVobODWiiaLJk-XOiAjSt/view?usp=sharing
>
> Any suggestions for further testing? I could try go down farther in the bisect.txt list. The problem is if the hardware falls over on an unbootable kernel, I have to bug someone with LOM access. That's a limited resource.

How about changing the scheduler to either mq-deadline or noop, just to see if this is also reproducible with a different scheduler? I guess noop would imply the blk cgroup controller is going to be disabled.
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-16 15:25 ` Nikolay Borisov @ 2022-08-16 15:34 ` Chris Murphy 2022-08-17 9:52 ` Holger Hoffstätte 2022-08-17 12:06 ` Ming Lei 0 siblings, 2 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-16 15:34 UTC (permalink / raw) To: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote: > On 16.08.22 г. 17:22 ч., Chris Murphy wrote: >> >> >> On Sun, Aug 14, 2022, at 4:28 PM, Chris Murphy wrote: >>> On Fri, Aug 12, 2022, at 2:02 PM, Jens Axboe wrote: >>>> Might be worth trying to revert those from 5.12 to see if they are >>>> causing the issue? Jan, Paolo - does this ring any bells? >>> >>> git log --oneline --no-merges v5.11..c03c21ba6f4e > bisect.txt >>> >>> I tried checking out a33df75c6328, which is right before the first bfq >>> commit, but that kernel won't boot the hardware. >>> >>> Next I checked out v5.12, then reverted these commits in order (that >>> they were found in the bisect.txt file): >>> >>> 7684fbde4516 bfq: Use only idle IO periods for think time calculations >>> 28c6def00919 bfq: Use 'ttime' local variable >>> 41e76c85660c bfq: Avoid false bfq queue merging >>>>>> a5bf0a92e1b8 bfq: bfq_check_waker() should be static >>> 71217df39dc6 block, bfq: make waker-queue detection more robust >>> 5a5436b98d5c block, bfq: save also injection state on queue merging >>> e673914d52f9 block, bfq: save also weight-raised service on queue merging >>> d1f600fa4732 block, bfq: fix switch back from soft-rt weitgh-raising >>> 7f1995c27b19 block, bfq: re-evaluate convenience of I/O plugging on rq arrivals >>> eb2fd80f9d2c block, bfq: replace mechanism for evaluating I/O intensity >>>>>> 1a23e06cdab2 bfq: don't duplicate code for different paths >>> 2391d13ed484 block, bfq: do not expire a queue when it is the only busy >>> one >>> 3c337690d2eb block, bfq: avoid spurious 
switches to soft_rt of >>> interactive queues >>> 91b896f65d32 block, bfq: do not raise non-default weights >>> ab1fb47e33dc block, bfq: increase time window for waker detection >>> d4fc3640ff36 block, bfq: set next_rq to waker_bfqq->next_rq in waker >>> injection >>> b5f74ecacc31 block, bfq: use half slice_idle as a threshold to check >>> short ttime >>> >>> The two commits prefixed by >>> above were not previously mentioned by >>> Jens, but I reverted them anyway because they showed up in the git log >>> command. >>> >>> OK so, within 10 minutes the problem does happen still. This is >>> block/bfq-iosched.c resulting from the above reverts, in case anyone >>> wants to double check what I did: >>> https://drive.google.com/file/d/1ykU7MpmylJuXVobODWiiaLJk-XOiAjSt/view?usp=sharing >> >> Any suggestions for further testing? I could try go down farther in the bisect.txt list. The problem is if the hardware falls over on an unbootable kernel, I have to bug someone with LOM access. That's a limited resource. >> >> > > How about changing the scheduler either mq-deadline or noop, just to see > if this is also reproducible with a different scheduler. I guess noop > would imply the blk cgroup controller is going to be disabled I already reported on that: always happens with bfq within an hour or less. Doesn't happen with mq-deadline for ~25+ hours. Does happen with bfq with the above patches removed. Does happen with cgroup.disabled=io set. Sounds to me like it's something bfq depends on and is somehow becoming perturbed in a way that mq-deadline does not, and has changed between 5.11 and 5.12. I have no idea what's under bfq that matches this description. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-16 15:34 ` Chris Murphy @ 2022-08-17 9:52 ` Holger Hoffstätte 2022-08-17 11:49 ` Jan Kara ` (2 more replies) 2022-08-17 12:06 ` Ming Lei 1 sibling, 3 replies; 58+ messages in thread From: Holger Hoffstätte @ 2022-08-17 9:52 UTC (permalink / raw) To: Chris Murphy, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik, linux-block On 2022-08-16 17:34, Chris Murphy wrote: > > On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote: >> How about changing the scheduler either mq-deadline or noop, just >> to see if this is also reproducible with a different scheduler. I >> guess noop would imply the blk cgroup controller is going to be >> disabled > > I already reported on that: always happens with bfq within an hour or > less. Doesn't happen with mq-deadline for ~25+ hours. Does happen > with bfq with the above patches removed. Does happen with > cgroup.disabled=io set. > > Sounds to me like it's something bfq depends on and is somehow > becoming perturbed in a way that mq-deadline does not, and has > changed between 5.11 and 5.12. I have no idea what's under bfq that > matches this description. Chris, just a shot in the dark but can you try the patch from https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ on top of something more recent than 5.12? Ideally 5.19 where it applies cleanly. No guarantees, I just remembered this patch and your problem sounds like a lost wakeup. Maybe BFQ just drives the sbitmap in a way that triggers the symptom. -h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 9:52 ` Holger Hoffstätte @ 2022-08-17 11:49 ` Jan Kara 2022-08-17 14:37 ` Chris Murphy 2022-08-17 15:09 ` Chris Murphy 2022-08-17 11:57 ` Chris Murphy 2022-08-17 18:16 ` Chris Murphy 2 siblings, 2 replies; 58+ messages in thread From: Jan Kara @ 2022-08-17 11:49 UTC (permalink / raw) To: Holger Hoffstätte Cc: Chris Murphy, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed 17-08-22 11:52:54, Holger Hoffstätte wrote: > On 2022-08-16 17:34, Chris Murphy wrote: > > > > On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote: > > > How about changing the scheduler either mq-deadline or noop, just > > > to see if this is also reproducible with a different scheduler. I > > > guess noop would imply the blk cgroup controller is going to be > > > disabled > > > > I already reported on that: always happens with bfq within an hour or > > less. Doesn't happen with mq-deadline for ~25+ hours. Does happen > > with bfq with the above patches removed. Does happen with > > cgroup.disabled=io set. > > > > Sounds to me like it's something bfq depends on and is somehow > > becoming perturbed in a way that mq-deadline does not, and has > > changed between 5.11 and 5.12. I have no idea what's under bfq that > > matches this description. > > Chris, just a shot in the dark but can you try the patch from > > https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ > > on top of something more recent than 5.12? Ideally 5.19 where it applies > cleanly. > > No guarantees, I just remembered this patch and your problem sounds like > a lost wakeup. Maybe BFQ just drives the sbitmap in a way that triggers the > symptom. 
Yes, symptoms look similar and it happens for devices with shared tagsets (which megaraid sas is) but that problem usually appeared when there are lots of LUNs sharing the tagset so that number of tags available per LUN was rather low. Not sure if that is the case here but probably that patch is worth a try. Another thing worth trying is to compile the kernel without CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in BFQ so we will see whether the problem may be cgroup related or not. Another interesting thing might be to dump /sys/kernel/debug/block/<device>/hctx*/{sched_tags,sched_tags_bitmap,tags,tags_bitmap} as the system is hanging. That should tell us whether tags are in fact in use or not when processes are blocking waiting for tags. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 58+ messages in thread
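The debugfs snapshot Jan asks for can be captured with a small script run while the stall is in progress. A sketch — the device directory under `/sys/kernel/debug/block` is named after the stalled disk (`sda` below is just an example):

```shell
# Sketch: snapshot blk-mq tag state for one device while IO is stalled.
dump_tags() {
    # $1 = the device's debugfs dir (e.g. /sys/kernel/debug/block/sda)
    # $2 = output file
    for f in "$1"/hctx*/sched_tags "$1"/hctx*/sched_tags_bitmap \
             "$1"/hctx*/tags "$1"/hctx*/tags_bitmap; do
        [ -r "$f" ] || continue        # skip files absent for this scheduler
        printf '==> %s <==\n' "$f"
        cat "$f"
    done > "$2"
}
# On the hanging machine, roughly:
#   dump_tags /sys/kernel/debug/block/sda /tmp/sda-tags.txt
```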
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 11:49 ` Jan Kara @ 2022-08-17 14:37 ` Chris Murphy 2022-08-17 15:09 ` Chris Murphy 1 sibling, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-17 14:37 UTC (permalink / raw) To: Jan Kara, Holger Hoffstätte Cc: Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 7:49 AM, Jan Kara wrote: > Another thing worth trying is to compile the kernel without > CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in > BFQ so we will see whether the problem may be cgroup related or not. Does boot param cgroup.disable=io affect it? Because the problem still happens with that parameter. Otherwise I can build a kernel with it disabled. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 11:49 ` Jan Kara 2022-08-17 14:37 ` Chris Murphy @ 2022-08-17 15:09 ` Chris Murphy 2022-08-17 16:30 ` Jan Kara 1 sibling, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 15:09 UTC (permalink / raw) To: Jan Kara, Holger Hoffstätte Cc: Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 7:49 AM, Jan Kara wrote: > > Another thing worth trying is to compile the kernel without > CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in > BFQ so we will see whether the problem may be cgroup related or not. The problem happens with a 5.12.0 kernel built without CONFIG_BFQ_GROUP_IOSCHED. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 15:09 ` Chris Murphy @ 2022-08-17 16:30 ` Jan Kara 2022-08-17 16:47 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Jan Kara @ 2022-08-17 16:30 UTC (permalink / raw) To: Chris Murphy Cc: Jan Kara, Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed 17-08-22 11:09:26, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 7:49 AM, Jan Kara wrote: > > > > > Another thing worth trying is to compile the kernel without > > CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in > > BFQ so we will see whether the problem may be cgroup related or not. > > The problem happens with a 5.12.0 kernel built without > CONFIG_BFQ_GROUP_IOSCHED. Thanks for testing! Just to answer your previous question: This is different from cgroup.disable=io because BFQ takes different code paths. So this makes it even less likely this is some obscure BFQ bug. Why BFQ could be different here from mq-deadline is that it artificially reduces device queue depth (it sets shallow_depth when allocating new tags) and maybe that triggers some bug in request tag allocation. BTW, are you sure the first problematic kernel is 5.12? Because support for shared tagsets was added to megaraid_sas driver in 5.11 (5.11-rc3 in particular - commit 81e7eb5bf08f3 ("Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug"")) and that is one candidate I'd expect to start to trigger issues. BTW that may be an interesting thing to try: Can you boot with "megaraid_sas.host_tagset_enable = 0" kernel option and see whether the issue reproduces? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 58+ messages in thread
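After booting with the option it is worth confirming the parameter actually took effect; module_param values are exported under `/sys/module/<module>/parameters`. A sketch — the parameters directory is passed in as an argument so the check itself can be exercised anywhere:

```shell
# Sketch: verify a module parameter's runtime value via sysfs.
param_is() {
    # $1 = parameters dir, $2 = parameter name, $3 = expected value
    [ "$(cat "$1/$2" 2>/dev/null)" = "$3" ]
}
# On the test machine, roughly:
#   param_is /sys/module/megaraid_sas/parameters host_tagset_enable 0 \
#       && echo "host tagset disabled"
```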
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 16:30 ` Jan Kara @ 2022-08-17 16:47 ` Chris Murphy 2022-08-17 17:57 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 16:47 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 12:30 PM, Jan Kara wrote: > BTW, are you sure the first problematic kernel is 5.12? 100% It consistently reproduces with any 5.12 series kernel, including from c03c21ba6f4e which is before rc1. It's frustrating that git bisect produces kernels that won't boot, I was more than half way through! :D And could have been done by now... We've been running on 5.11 series kernels for a year because of this problem. > BTW that may be an > interesting thing to try: Can you boot with > "megaraid_sas.host_tagset_enable = 0" kernel option and see whether the > issue reproduces? Yep. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 16:47 ` Chris Murphy @ 2022-08-17 17:57 ` Chris Murphy 2022-08-17 18:15 ` Jan Kara 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 17:57 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 12:47 PM, Chris Murphy wrote: Can you boot with >> "megaraid_sas.host_tagset_enable = 0" kernel option and see whether the >> issue reproduces? This has been running an hour without symptoms. It's strongly suggestive, but needs to run overnight to be sure. Anecdotally, the max write IO is less than what I'm used to seeing. [ 0.583121] Kernel command line: BOOT_IMAGE=(md/0)/vmlinuz-5.12.5-300.fc34.x86_64 root=UUID=04f1fb7f-5cc4-4dfb-a7cf-b6b6925bf895 ro rootflags=subvol=root rd.md.uuid=e7782150:092e161a:68395862:31375bca biosdevname=1 net.ifnames=0 log_buf_len=8M plymouth.enable=0 megaraid_sas.host_tagset_enable=0 ... 
[ 6.745964] megasas: 07.714.04.00-rc1 [ 6.758472] megaraid_sas 0000:02:00.0: BAR:0x1 BAR's base_addr(phys):0x0000000092000000 mapped virt_addr:0x00000000c54554ff [ 6.758477] megaraid_sas 0000:02:00.0: FW now in Ready state [ 6.770658] megaraid_sas 0000:02:00.0: 63 bit DMA mask and 32 bit consistent mask [ 6.795060] megaraid_sas 0000:02:00.0: firmware supports msix : (96) [ 6.807537] megaraid_sas 0000:02:00.0: requested/available msix 49/49 [ 6.819259] megaraid_sas 0000:02:00.0: current msix/online cpus : (49/48) [ 6.830800] megaraid_sas 0000:02:00.0: RDPQ mode : (disabled) [ 6.842031] megaraid_sas 0000:02:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 0 [ 6.871246] megaraid_sas 0000:02:00.0: Performance mode :Latency (latency index = 1) [ 6.882265] megaraid_sas 0000:02:00.0: FW supports sync cache : No [ 6.893034] megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009 [ 6.988550] megaraid_sas 0000:02:00.0: FW provided supportMaxExtLDs: 1 max_lds: 64 [ 6.988554] megaraid_sas 0000:02:00.0: controller type : MR(2048MB) [ 6.988555] megaraid_sas 0000:02:00.0: Online Controller Reset(OCR) : Enabled [ 6.988556] megaraid_sas 0000:02:00.0: Secure JBOD support : No [ 6.988557] megaraid_sas 0000:02:00.0: NVMe passthru support : No [ 6.988558] megaraid_sas 0000:02:00.0: FW provided TM TaskAbort/Reset timeout : 0 secs/0 secs [ 6.988559] megaraid_sas 0000:02:00.0: JBOD sequence map support : No [ 6.988560] megaraid_sas 0000:02:00.0: PCI Lane Margining support : No [ 7.025160] megaraid_sas 0000:02:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000 [ 7.025162] megaraid_sas 0000:02:00.0: INIT adapter done [ 7.025164] megaraid_sas 0000:02:00.0: JBOD sequence map is disabled megasas_setup_jbod_map 5707 [ 7.029878] megaraid_sas 0000:02:00.0: pci id : (0x1000)/(0x005d)/(0x1028)/(0x1f47) [ 7.029881] megaraid_sas 0000:02:00.0: unevenspan support : yes [ 7.029882] megaraid_sas 0000:02:00.0: firmware crash 
dump : no [ 7.029883] megaraid_sas 0000:02:00.0: JBOD sequence map : disabled [ 7.029915] megaraid_sas 0000:02:00.0: Max firmware commands: 927 shared with nr_hw_queues = 1 [ 7.029918] scsi host11: Avago SAS based MegaRAID driver -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
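The probe log above already carries the confirmation: with the shared tagset disabled the driver falls back to a single hardware queue. A quick check, as a sketch — the match string is taken verbatim from the dmesg line above:

```shell
# Sketch: confirm from saved dmesg output that megaraid_sas came up with
# a single hw queue (i.e. host_tagset_enable=0 took effect).
single_hw_queue() {
    # $1 = file containing dmesg output
    grep -q 'shared with nr_hw_queues = 1' "$1"
}
# e.g.: dmesg > /tmp/boot.log && single_hw_queue /tmp/boot.log
```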
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 17:57 ` Chris Murphy @ 2022-08-17 18:15 ` Jan Kara 2022-08-17 18:18 ` Chris Murphy 2022-08-17 18:21 ` Holger Hoffstätte 0 siblings, 2 replies; 58+ messages in thread From: Jan Kara @ 2022-08-17 18:15 UTC (permalink / raw) To: Chris Murphy Cc: Jan Kara, Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed 17-08-22 13:57:00, Chris Murphy wrote: > On Wed, Aug 17, 2022, at 12:47 PM, Chris Murphy wrote: > Can you boot with > >> "megaraid_sas.host_tagset_enable = 0" kernel option and see whether the > >> issue reproduces? > > This has been running an hour without symptoms. It's strongly suggestive, > but needs to run overnight to be sure. Anecdotally, the max write IO is > less than what I'm used to seeing. OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance of non-mq IO schedulers with multiple HW queues") might be what's causing issues (although I don't know how yet...). Honza > > [ 0.583121] Kernel command line: BOOT_IMAGE=(md/0)/vmlinuz-5.12.5-300.fc34.x86_64 root=UUID=04f1fb7f-5cc4-4dfb-a7cf-b6b6925bf895 ro rootflags=subvol=root rd.md.uuid=e7782150:092e161a:68395862:31375bca biosdevname=1 net.ifnames=0 log_buf_len=8M plymouth.enable=0 megaraid_sas.host_tagset_enable=0 > ... 
> [ 6.745964] megasas: 07.714.04.00-rc1 > [ 6.758472] megaraid_sas 0000:02:00.0: BAR:0x1 BAR's base_addr(phys):0x0000000092000000 mapped virt_addr:0x00000000c54554ff > [ 6.758477] megaraid_sas 0000:02:00.0: FW now in Ready state > [ 6.770658] megaraid_sas 0000:02:00.0: 63 bit DMA mask and 32 bit consistent mask > [ 6.795060] megaraid_sas 0000:02:00.0: firmware supports msix : (96) > [ 6.807537] megaraid_sas 0000:02:00.0: requested/available msix 49/49 > [ 6.819259] megaraid_sas 0000:02:00.0: current msix/online cpus : (49/48) > [ 6.830800] megaraid_sas 0000:02:00.0: RDPQ mode : (disabled) > [ 6.842031] megaraid_sas 0000:02:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 0 > [ 6.871246] megaraid_sas 0000:02:00.0: Performance mode :Latency (latency index = 1) > [ 6.882265] megaraid_sas 0000:02:00.0: FW supports sync cache : No > [ 6.893034] megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009 > [ 6.988550] megaraid_sas 0000:02:00.0: FW provided supportMaxExtLDs: 1 max_lds: 64 > [ 6.988554] megaraid_sas 0000:02:00.0: controller type : MR(2048MB) > [ 6.988555] megaraid_sas 0000:02:00.0: Online Controller Reset(OCR) : Enabled > [ 6.988556] megaraid_sas 0000:02:00.0: Secure JBOD support : No > [ 6.988557] megaraid_sas 0000:02:00.0: NVMe passthru support : No > [ 6.988558] megaraid_sas 0000:02:00.0: FW provided TM TaskAbort/Reset timeout : 0 secs/0 secs > [ 6.988559] megaraid_sas 0000:02:00.0: JBOD sequence map support : No > [ 6.988560] megaraid_sas 0000:02:00.0: PCI Lane Margining support : No > [ 7.025160] megaraid_sas 0000:02:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000 > [ 7.025162] megaraid_sas 0000:02:00.0: INIT adapter done > [ 7.025164] megaraid_sas 0000:02:00.0: JBOD sequence map is disabled megasas_setup_jbod_map 5707 > [ 7.029878] megaraid_sas 0000:02:00.0: pci id : (0x1000)/(0x005d)/(0x1028)/(0x1f47) > [ 7.029881] megaraid_sas 0000:02:00.0: unevenspan support : yes > [ 
7.029882] megaraid_sas 0000:02:00.0: firmware crash dump : no > [ 7.029883] megaraid_sas 0000:02:00.0: JBOD sequence map : disabled > [ 7.029915] megaraid_sas 0000:02:00.0: Max firmware commands: 927 shared with nr_hw_queues = 1 > [ 7.029918] scsi host11: Avago SAS based MegaRAID driver > > > > > -- > Chris Murphy -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:15 ` Jan Kara @ 2022-08-17 18:18 ` Chris Murphy 2022-08-17 18:33 ` Jan Kara 2022-08-17 18:21 ` Holger Hoffstätte 1 sibling, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 18:18 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 2:15 PM, Jan Kara wrote: > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance > of non-mq IO schedulers with multiple HW queues") might be what's causing > issues (although I don't know how yet...). I can revert it from 5.12.0 and try. Let me know which next test is preferred :) -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:18 ` Chris Murphy @ 2022-08-17 18:33 ` Jan Kara 2022-08-17 18:54 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Jan Kara @ 2022-08-17 18:33 UTC (permalink / raw) To: Chris Murphy Cc: Jan Kara, Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed 17-08-22 14:18:01, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 2:15 PM, Jan Kara wrote: > > > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance > > of non-mq IO schedulers with multiple HW queues") might be what's causing > > issues (although I don't know how yet...). > > I can revert it from 5.12.0 and try. Let me know which next test is preferred :) Let's try to revert this first so that we have it narrowed down what started causing the issues. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:33 ` Jan Kara @ 2022-08-17 18:54 ` Chris Murphy 2022-08-17 19:23 ` Chris Murphy 2022-08-18 2:31 ` Chris Murphy 0 siblings, 2 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-17 18:54 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 2:33 PM, Jan Kara wrote: > On Wed 17-08-22 14:18:01, Chris Murphy wrote: >> >> >> On Wed, Aug 17, 2022, at 2:15 PM, Jan Kara wrote: >> >> > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance >> > of non-mq IO schedulers with multiple HW queues") might be what's causing >> > issues (although I don't know how yet...). >> >> I can revert it from 5.12.0 and try. Let me know which next test is preferred :) > > Let's try to revert this first so that we have it narrowed down what > started causing the issues. OK I've reverted b6e68ee82585, and removing megaraid_sas.host_tagset_enable=0, and will restart the workload... Usually it's within 10 minutes but the newer the kernel it seems the longer it takes, or the more things I have to throw at it. The problem doesn't reproduce at all with 5.19 series unless I also run a separate dnf install, and that only triggers maybe 1 in 3 times. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:54 ` Chris Murphy @ 2022-08-17 19:23 ` Chris Murphy 2022-08-18 2:31 ` Chris Murphy 0 siblings, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-17 19:23 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 2:54 PM, Chris Murphy wrote: > On Wed, Aug 17, 2022, at 2:33 PM, Jan Kara wrote: >> On Wed 17-08-22 14:18:01, Chris Murphy wrote: >>> >>> >>> On Wed, Aug 17, 2022, at 2:15 PM, Jan Kara wrote: >>> >>> > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance >>> > of non-mq IO schedulers with multiple HW queues") might be what's causing >>> > issues (although I don't know how yet...). >>> >>> I can revert it from 5.12.0 and try. Let me know which next test is preferred :) >> >> Let's try to revert this first so that we have it narrowed down what >> started causing the issues. > > OK I've reverted b6e68ee82585, and removing > megaraid_sas.host_tagset_enable=0, and will restart the workload... > > Usually it's within 10 minutes but the newer the kernel it seems the > longer it takes, or the more things I have to throw at it. The problem > doesn't reproduce at all with 5.19 series unless I also run a separate > dnf install, and that only triggers maybe 1 in 3 times. What I'm seeing is similar to 5.18 and occasionally 5.19... top reports high %wa, above 30%, sometimes above 60%, and increasing load (48 cpus so load 48 is OK, but this is triple digits, which never happens on 5.11 series kernels). IO pressure is 10x higher than with mq-deadline (or bfq on a 5.11 series kernel), 40-50% right now. iotop usually craters to 0 by now, but it's near normal. So I think b6e68ee82585 is a contributing factor, but it isn't the only factor. I'm going to let this keep running and see if it matures into the more typical failure pattern.
-- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:54 ` Chris Murphy 2022-08-17 19:23 ` Chris Murphy @ 2022-08-18 2:31 ` Chris Murphy 1 sibling, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-18 2:31 UTC (permalink / raw) To: Jan Kara Cc: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 2:54 PM, Chris Murphy wrote: > On Wed, Aug 17, 2022, at 2:33 PM, Jan Kara wrote: >> On Wed 17-08-22 14:18:01, Chris Murphy wrote: >>> >>> >>> On Wed, Aug 17, 2022, at 2:15 PM, Jan Kara wrote: >>> >>> > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance >>> > of non-mq IO schedulers with multiple HW queues") might be what's causing >>> > issues (although I don't know how yet...). >>> >>> I can revert it from 5.12.0 and try. Let me know which next test is preferred :) >> >> Let's try to revert this first so that we have it narrowed down what >> started causing the issues. > > OK I've reverted b6e68ee82585, and removing > megaraid_sas.host_tagset_enable=0, and will restart the workload... I ran this for 7 hours and the problem didn't happen. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:15 ` Jan Kara 2022-08-17 18:18 ` Chris Murphy @ 2022-08-17 18:21 ` Holger Hoffstätte 1 sibling, 0 replies; 58+ messages in thread From: Holger Hoffstätte @ 2022-08-17 18:21 UTC (permalink / raw) To: Jan Kara, Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik On 2022-08-17 20:15, Jan Kara wrote: > On Wed 17-08-22 13:57:00, Chris Murphy wrote: >> On Wed, Aug 17, 2022, at 12:47 PM, Chris Murphy wrote: >> Can you boot with >>>> "megaraid_sas.host_tagset_enable = 0" kernel option and see whether the >>>> issue reproduces? >> >> This has been running an hour without symptoms. It's strongly suggestive, >> but needs to run overnight to be sure. Anecdotally, the max write IO is >> less than what I'm used to seeing. > > OK, if this indeed passes then b6e68ee82585 ("blk-mq: Improve performance > of non-mq IO schedulers with multiple HW queues") might be what's causing > issues (although I don't know how yet...). > > Honza Certainly explains why BFQ turned up as a suspect, considering it's still single-queue (fair MQ scheduling is .. complicated). -h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 9:52 ` Holger Hoffstätte 2022-08-17 11:49 ` Jan Kara @ 2022-08-17 11:57 ` Chris Murphy 2022-08-17 12:31 ` Holger Hoffstätte 2022-08-17 18:16 ` Chris Murphy 2 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 11:57 UTC (permalink / raw) To: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote: > On 2022-08-16 17:34, Chris Murphy wrote: >> >> On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote: >>> How about changing the scheduler either mq-deadline or noop, just >>> to see if this is also reproducible with a different scheduler. I >>> guess noop would imply the blk cgroup controller is going to be >>> disabled >> >> I already reported on that: always happens with bfq within an hour or >> less. Doesn't happen with mq-deadline for ~25+ hours. Does happen >> with bfq with the above patches removed. Does happen with >> cgroup.disabled=io set. >> >> Sounds to me like it's something bfq depends on and is somehow >> becoming perturbed in a way that mq-deadline does not, and has >> changed between 5.11 and 5.12. I have no idea what's under bfq that >> matches this description. > > Chris, just a shot in the dark but can you try the patch from > > https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ > > on top of something more recent than 5.12? Ideally 5.19 where it applies > cleanly. The problem doesn't reliably reproduce on 5.19. A patch for 5.12..5.18 would be much more testable. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 11:57 ` Chris Murphy @ 2022-08-17 12:31 ` Holger Hoffstätte 0 siblings, 0 replies; 58+ messages in thread From: Holger Hoffstätte @ 2022-08-17 12:31 UTC (permalink / raw) To: Chris Murphy, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik On 2022-08-17 13:57, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote: >> On 2022-08-16 17:34, Chris Murphy wrote: >>> >>> On Tue, Aug 16, 2022, at 11:25 AM, Nikolay Borisov wrote: >>>> How about changing the scheduler either mq-deadline or noop, just >>>> to see if this is also reproducible with a different scheduler. I >>>> guess noop would imply the blk cgroup controller is going to be >>>> disabled >>> >>> I already reported on that: always happens with bfq within an hour or >>> less. Doesn't happen with mq-deadline for ~25+ hours. Does happen >>> with bfq with the above patches removed. Does happen with >>> cgroup.disabled=io set. >>> >>> Sounds to me like it's something bfq depends on and is somehow >>> becoming perturbed in a way that mq-deadline does not, and has >>> changed between 5.11 and 5.12. I have no idea what's under bfq that >>> matches this description. >> >> Chris, just a shot in the dark but can you try the patch from >> >> https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ >> >> on top of something more recent than 5.12? Ideally 5.19 where it applies >> cleanly. > > The problem doesn't reliably reproduce on 5.19. A patch for 5.12..5.18 would be much more testable. If you look at the changes to sbitmap at: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/lib/sbitmap.c you'll find that they are relatively recent, so Yukai's patch will probably also apply to 5.18 - I don't know. Also look at the most recent commit which mentions "Checking free bits when setting the target bits. 
Otherwise, it may reuse the busying bits." Reusing the busy bits sounds "not great" either and (AFAIU) may also be a cause for lost wakeups, but I'm sure Jan and Ming know all that better than me. Especially Jan's suggestions re. disabling BFQ cgroup support is probably the easiest thing to try first. What you're observing may not have a single root cause, and even if it does, it might not be where we suspect. -h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 9:52 ` Holger Hoffstätte 2022-08-17 11:49 ` Jan Kara 2022-08-17 11:57 ` Chris Murphy @ 2022-08-17 18:16 ` Chris Murphy 2022-08-17 18:38 ` Holger Hoffstätte 2 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 18:16 UTC (permalink / raw) To: Holger Hoffstätte, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote: > Chris, just a shot in the dark but can you try the patch from > > https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ > > on top of something more recent than 5.12? Ideally 5.19 where it applies > cleanly. This patch applies cleanly on 5.12.0. I can try newer kernels later, but as the problem so easily reproduces with 5.12 and the problem first appeared there, is why I'm sticking with it. (For sure we prefer to be on 5.19 series.) Let me know if I should try it still. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 18:16 ` Chris Murphy @ 2022-08-17 18:38 ` Holger Hoffstätte 0 siblings, 0 replies; 58+ messages in thread From: Holger Hoffstätte @ 2022-08-17 18:38 UTC (permalink / raw) To: Chris Murphy, Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente Cc: Linux-RAID, linux-block, linux-kernel, Josef Bacik On 2022-08-17 20:16, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote: > >> Chris, just a shot in the dark but can you try the patch from >> >> https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@huaweicloud.com/ >> >> on top of something more recent than 5.12? Ideally 5.19 where it applies >> cleanly. > > > This patch applies cleanly on 5.12.0. I can try newer kernels later, but as the problem so easily reproduces with 5.12 and the problem first appeared there, is why I'm sticking with it. (For sure we prefer to be on 5.19 series.) > > Let me know if I should try it still. I just started running it in 5.19.2 to see if it breaks something; no issues so far but then again I didn't have any problems to begin with and only do peasant I/O load, and no MegaRAID. However if it applies *and builds* on 5.12 I'd just go ahead and see what catches fire. But you need to set the megaraid setting to fail, otherwise we won't be able to see whether this is really a contributing factor, or indeed the other commit that Jan identified. Unfortunately 5.12 is a bit old already and most of the other important fixes to sbitmap.c probably won't apply due to some other blk-mq changes. In any case the plot thickens, so keep going. :) -h ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-16 15:34 ` Chris Murphy 2022-08-17 9:52 ` Holger Hoffstätte @ 2022-08-17 12:06 ` Ming Lei 2022-08-17 14:34 ` Chris Murphy 1 sibling, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-17 12:06 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, Ming Lei Hello Chris, On Tue, Aug 16, 2022 at 11:35 PM Chris Murphy <lists@colorremedies.com> wrote: > > > ... > > I already reported on that: always happens with bfq within an hour or less. Doesn't happen with mq-deadline for ~25+ hours. Does happen with bfq with the above patches removed. Does happen with cgroup.disabled=io set. > > Sounds to me like it's something bfq depends on and is somehow becoming perturbed in a way that mq-deadline does not, and has changed between 5.11 and 5.12. I have no idea what's under bfq that matches this description. > blk-mq debugfs log is usually helpful for io stall issue, care to post the blk-mq debugfs log: (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;) Thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
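[Aside for readers following the thread: the one-liner above can be wrapped into a small helper that writes the dump to a file, which is effectively what the `blockdebugfs-*.txt` attachments later in the thread contain. This is a sketch only; the directory layout assumed here is the standard blk-mq debugfs tree under /sys/kernel/debug/block, which needs debugfs mounted and root access on a live system.]

```shell
# dump_blk_debugfs DIR OUTFILE
# Walk DIR (normally /sys/kernel/debug/block) and write every
# "relative-path:line" pair to OUTFILE, exactly like the
# one-liner suggested in the thread: grep -a treats any
# binary-looking attribute as text, -H prefixes the file name.
dump_blk_debugfs() {
    dir=$1
    out=$2
    ( cd "$dir" && find . -type f -exec grep -aH . {} \; ) > "$out"
}

# Typical use on a live system (as root, debugfs mounted):
#   dump_blk_debugfs /sys/kernel/debug/block "blockdebugfs-$(date +%s).txt"
```

Capturing the dump to a file rather than the terminal matters here because the per-hctx attributes run to thousands of lines on an 8-disk system.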
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 12:06 ` Ming Lei @ 2022-08-17 14:34 ` Chris Murphy 2022-08-17 14:53 ` Ming Lei 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 14:34 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote: > blk-mq debugfs log is usually helpful for io stall issue, care to post > the blk-mq debugfs log: > > (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;) This is only sda https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing This is all the block devices https://drive.google.com/file/d/1iHqRuoz8ZzvkNcMtkV3Ep7h5Uof7sTKw/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 14:34 ` Chris Murphy @ 2022-08-17 14:53 ` Ming Lei 2022-08-17 15:02 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-17 14:53 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022 at 10:34:38AM -0400, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote: > > > blk-mq debugfs log is usually helpful for io stall issue, care to post > > the blk-mq debugfs log: > > > > (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;) > > This is only sda > https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing From the log, there isn't any in-flight IO request. So please confirm that it is collected after the IO stall is triggered. If yes, the issue may not be related with BFQ, and should be related with blk-cgroup code. Thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 14:53 ` Ming Lei @ 2022-08-17 15:02 ` Chris Murphy 2022-08-17 15:34 ` Ming Lei 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 15:02 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 10:53 AM, Ming Lei wrote: > On Wed, Aug 17, 2022 at 10:34:38AM -0400, Chris Murphy wrote: >> >> >> On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote: >> >> > blk-mq debugfs log is usually helpful for io stall issue, care to post >> > the blk-mq debugfs log: >> > >> > (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;) >> >> This is only sda >> https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing > > From the log, there isn't any in-flight IO request. > > So please confirm that it is collected after the IO stall is triggered. Yes, iotop reports no reads or writes at the time of collection. IO pressure 99% for auditd, systemd-journald, rsyslogd, and postgresql, with increasing pressure from all the qemu processes. Keep in mind this is a raid10, so maybe it's enough for just one block device IO to stall and the whole thing stops? That's why I included all block devices. > If yes, the issue may not be related with BFQ, and should be related > with blk-cgroup code. Problem happens with cgroup.disable=io, does this setting affect blk-cgroup? -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 15:02 ` Chris Murphy @ 2022-08-17 15:34 ` Ming Lei 2022-08-17 16:34 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-17 15:34 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022 at 11:02:25AM -0400, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 10:53 AM, Ming Lei wrote: > > On Wed, Aug 17, 2022 at 10:34:38AM -0400, Chris Murphy wrote: > >> > >> > >> On Wed, Aug 17, 2022, at 8:06 AM, Ming Lei wrote: > >> > >> > blk-mq debugfs log is usually helpful for io stall issue, care to post > >> > the blk-mq debugfs log: > >> > > >> > (cd /sys/kernel/debug/block/$disk && find . -type f -exec grep -aH . {} \;) > >> > >> This is only sda > >> https://drive.google.com/file/d/1aAld-kXb3RUiv_ShAvD_AGAFDRS03Lr0/view?usp=sharing > > > > From the log, there isn't any in-flight IO request. > > > > So please confirm that it is collected after the IO stall is triggered. > > Yes, iotop reports no reads or writes at the time of collection. IO pressure 99% for auditd, systemd-journald, rsyslogd, and postgresql, with increasing pressure from all the qemu processes. > > Keep in mind this is a raid10, so maybe it's enough for just one block device IO to stall and the whole thing stops? That's why I included all block devices. > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on request based block devices, but sda is _not_ included in this log, and only sdi, sdg and sdf are collected, is that expected? BTW, all request based block devices should be observed in blk-mq debugfs. thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 15:34 ` Ming Lei @ 2022-08-17 16:34 ` Chris Murphy 2022-08-18 1:03 ` Ming Lei 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-17 16:34 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 11:34 AM, Ming Lei wrote: > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on > request based block devices, but sda is _not_ included in this log, and > only sdi, sdg and sdf are collected, is that expected? While the problem was happening I did cd /sys/kernel/debug/block find . -type f -exec grep -aH . {} \; The file has the nodes out of order, but I don't know enough about the interface to see if there are things that are missing, or what it means. > BTW, all request based block devices should be observed in blk-mq debugfs. /sys/kernel/debug/block contains drwxr-xr-x. 2 root root 0 Aug 17 15:20 md0 drwxr-xr-x. 51 root root 0 Aug 17 15:20 sda drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdb drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdc drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdd drwxr-xr-x. 51 root root 0 Aug 17 15:20 sde drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdf drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdg drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdh drwxr-xr-x. 4 root root 0 Aug 17 15:20 sdi drwxr-xr-x. 2 root root 0 Aug 17 15:20 zram0 -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-17 16:34 ` Chris Murphy @ 2022-08-18 1:03 ` Ming Lei 2022-08-18 2:30 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-18 1:03 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022 at 12:34:42PM -0400, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 11:34 AM, Ming Lei wrote: > > > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on > > request based block devices, but sda is _not_ included in this log, and > > only sdi, sdg and sdf are collected, is that expected? > > While the problem was happening I did > > cd /sys/kernel/debug/block > find . -type f -exec grep -aH . {} \; > > The file has the nodes out of order, but I don't know enough about the interface to see if there are things that are missing, or what it means. > > > > BTW, all request based block devices should be observed in blk-mq debugfs. > > /sys/kernel/debug/block contains > > drwxr-xr-x. 2 root root 0 Aug 17 15:20 md0 > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sda > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdb > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdc > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdd > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sde > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdf > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdg > drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdh > drwxr-xr-x. 4 root root 0 Aug 17 15:20 sdi > drwxr-xr-x. 2 root root 0 Aug 17 15:20 zram0 OK, so lots of devices are missed in your log, and the following command is supposed to work for collecting log from all block device's debugfs: (cd /sys/kernel/debug/block/ && find . -type f -exec grep -aH . {} \;) Thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 1:03 ` Ming Lei @ 2022-08-18 2:30 ` Chris Murphy 2022-08-18 3:24 ` Ming Lei 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-18 2:30 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 9:03 PM, Ming Lei wrote: > On Wed, Aug 17, 2022 at 12:34:42PM -0400, Chris Murphy wrote: >> >> >> On Wed, Aug 17, 2022, at 11:34 AM, Ming Lei wrote: >> >> > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on >> > request based block devices, but sda is _not_ included in this log, and >> > only sdi, sdg and sdf are collected, is that expected? >> >> While the problem was happening I did >> >> cd /sys/kernel/debug/block >> find . -type f -exec grep -aH . {} \; >> >> The file has the nodes out of order, but I don't know enough about the interface to see if there are things that are missing, or what it means. >> >> >> > BTW, all request based block devices should be observed in blk-mq debugfs. >> >> /sys/kernel/debug/block contains >> >> drwxr-xr-x. 2 root root 0 Aug 17 15:20 md0 >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sda >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdb >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdc >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdd >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sde >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdf >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdg >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdh >> drwxr-xr-x. 4 root root 0 Aug 17 15:20 sdi >> drwxr-xr-x. 2 root root 0 Aug 17 15:20 zram0 > > OK, so lots of devices are missed in your log, and the following command > is supposed to work for collecting log from all block device's debugfs: > > (cd /sys/kernel/debug/block/ && find . -type f -exec grep -aH . 
{} \;) OK here it is: https://drive.google.com/file/d/18nEOx2Ghsqx8uII6nzWpCFuYENHuQd-f/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 2:30 ` Chris Murphy @ 2022-08-18 3:24 ` Ming Lei 2022-08-18 4:12 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-18 3:24 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022 at 10:30:39PM -0400, Chris Murphy wrote: > > > On Wed, Aug 17, 2022, at 9:03 PM, Ming Lei wrote: > > On Wed, Aug 17, 2022 at 12:34:42PM -0400, Chris Murphy wrote: > >> > >> > >> On Wed, Aug 17, 2022, at 11:34 AM, Ming Lei wrote: > >> > >> > From the 2nd log of blockdebugfs-all.txt, still not see any in-flight IO on > >> > request based block devices, but sda is _not_ included in this log, and > >> > only sdi, sdg and sdf are collected, is that expected? > >> > >> While the problem was happening I did > >> > >> cd /sys/kernel/debug/block > >> find . -type f -exec grep -aH . {} \; > >> > >> The file has the nodes out of order, but I don't know enough about the interface to see if there are things that are missing, or what it means. > >> > >> > >> > BTW, all request based block devices should be observed in blk-mq debugfs. > >> > >> /sys/kernel/debug/block contains > >> > >> drwxr-xr-x. 2 root root 0 Aug 17 15:20 md0 > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sda > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdb > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdc > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdd > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sde > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdf > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdg > >> drwxr-xr-x. 51 root root 0 Aug 17 15:20 sdh > >> drwxr-xr-x. 4 root root 0 Aug 17 15:20 sdi > >> drwxr-xr-x. 
2 root root 0 Aug 17 15:20 zram0 > > > > OK, so lots of devices are missed in your log, and the following command > > is supposed to work for collecting log from all block device's debugfs: > > > > (cd /sys/kernel/debug/block/ && find . -type f -exec grep -aH . {} \;) > > OK here it is: > > https://drive.google.com/file/d/18nEOx2Ghsqx8uII6nzWpCFuYENHuQd-f/view?usp=sharing The above log shows that the io stall happens on sdd, where: 1) 616 requests pending from scheduler queue grep "busy=" blockdebugfs-all2.txt | grep sdd | grep sched | awk -F "=" '{s+=$2} END {print s}' 616 2) 11 requests pending from ./sdd/hctx2/dispatch for more than 300 seconds Recently we seldom observe io hang from dispatch list, except for the following two: https://lore.kernel.org/linux-block/20220803023355.3687360-1-yuyufen@huaweicloud.com/ https://lore.kernel.org/linux-block/20220726122224.1790882-1-yukuai1@huaweicloud.com/ BTW, what is the output of the following log? (cd /sys/block/sdd/device && find . -type f -exec grep -aH . {} \;) Also the above log shows that host_tagset_enable support is still crippled on v5.12, I guess the issue may not be triggered(or pretty hard) after you update to d97e594c5166 ("blk-mq: Use request queue-wide tags for tagset-wide sbitmap"), or v5.14. thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
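[Aside: the `grep | awk` pipeline Ming uses above to arrive at the 616 figure simply sums every `busy=` counter on the sdd scheduler-tags lines of the dump. A sketch of the same pipeline as a reusable helper, run against invented sample lines (the file contents below are illustrative, not from Chris's logs):]

```shell
# count_sched_busy FILE DISK
# Sum the busy= counters from a blk-mq debugfs dump, restricted
# to one disk's scheduler-tags lines -- the same pipeline used
# in the analysis above. Lines from driver tags files (no
# "sched" in the path) are deliberately excluded.
count_sched_busy() {
    grep "busy=" "$1" | grep "$2" | grep sched \
        | awk -F "=" '{s+=$2} END {print s+0}'
}

# Example dump lines this would match (invented for illustration):
#   ./sdd/hctx0/sched_tags:busy=600
#   ./sdd/hctx2/sched_tags:busy=16
#   ./sdd/hctx2/tags:busy=11     <- driver tags, not counted
```

The `s+0` in the END block is a minor hardening over the original one-liner so the helper prints 0 instead of an empty string when no lines match.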
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 3:24 ` Ming Lei @ 2022-08-18 4:12 ` Chris Murphy 2022-08-18 4:18 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-18 4:12 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > OK, can you post the blk-mq debugfs log after you trigger it on v5.17? https://drive.google.com/file/d/1n8f66pVLCwQTJ0PMd71EiUZoeTWQk3dB/view?usp=sharing This time it happened pretty quickly. This log is soon after triple digit load and no IO, but not as fully developed as before. The system has become entirely unresponsive to new commands, so I have to issue sysrq+b - if I let it go too long even that won't work. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 4:12 ` Chris Murphy @ 2022-08-18 4:18 ` Chris Murphy 2022-08-18 4:27 ` Chris Murphy 0 siblings, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-18 4:18 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: > On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > >> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > > https://drive.google.com/file/d/1n8f66pVLCwQTJ0PMd71EiUZoeTWQk3dB/view?usp=sharing > > This time it happened pretty quickly. This log is soon after triple > digit load and no IO, but not as fully developed as before. The system > has become entirely unresponsive to new commands, so I have to issue > sysrq+b - if I let it go too long even that won't work. OK by the time I clicked send, the system had recovered. That also sometimes happens but then later IO stalls again and won't recover. So I haven't issued sysrq+b on this run yet. Here is a second blk-mq debugfs log... https://drive.google.com/file/d/1irHcns0qe7e7DJaDfanX8vSiqE1Nj5xl/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 4:18 ` Chris Murphy @ 2022-08-18 4:27 ` Chris Murphy 2022-08-18 4:32 ` Chris Murphy ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-18 4:27 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >> >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 4:27 ` Chris Murphy @ 2022-08-18 4:32 ` Chris Murphy 2022-08-18 5:15 ` Ming Lei 2022-08-18 5:24 ` Ming Lei 2 siblings, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-18 4:32 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 12:27 AM, Chris Murphy wrote: > On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >> On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >>> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >>> >>>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > > Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. > > https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing sysfs sdc https://drive.google.com/file/d/1DLZHX8Mg_d5w-XSsAYYK1NDzn1pA_QPm/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 4:27 ` Chris Murphy 2022-08-18 4:32 ` Chris Murphy @ 2022-08-18 5:15 ` Ming Lei 2022-08-18 18:52 ` Chris Murphy 2022-08-18 5:24 ` Ming Lei 2 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-18 5:15 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: > > > On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: > > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: > >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > >> > >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > > Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. > > https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing > Please test the following patch and see if it makes a difference: diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index a4f7c101b53b..8e8d77e79dd6 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -44,7 +44,10 @@ void __blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx) */ smp_mb(); - blk_mq_run_hw_queue(hctx, true); + if (blk_mq_is_shared_tags(hctx->flags)) + blk_mq_run_hw_queues(hctx->queue, true); + else + blk_mq_run_hw_queue(hctx, true); } static int sched_rq_cmp(void *priv, const struct list_head *a, Thanks, Ming ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 5:15 ` Ming Lei @ 2022-08-18 18:52 ` Chris Murphy 0 siblings, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-08-18 18:52 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 1:15 AM, Ming Lei wrote: > On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >> >> >> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >> > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >> >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >> >> >> >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >> >> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. >> >> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >> > > Please test the following patch and see if it makes a difference: > > diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c > index a4f7c101b53b..8e8d77e79dd6 100644 > --- a/block/blk-mq-sched.c > +++ b/block/blk-mq-sched.c > @@ -44,7 +44,10 @@ void __blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx) > */ > smp_mb(); > > - blk_mq_run_hw_queue(hctx, true); > + if (blk_mq_is_shared_tags(hctx->flags)) > + blk_mq_run_hw_queues(hctx->queue, true); > + else > + blk_mq_run_hw_queue(hctx, true); > } > > static int sched_rq_cmp(void *priv, const struct list_head *a, I still get a stall. By the time I noticed it, I can't run any new commands (they just hang) so I had to sysrq+b. Let me know if I should rerun the test in order to capture block debug log. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 4:27 ` Chris Murphy 2022-08-18 4:32 ` Chris Murphy 2022-08-18 5:15 ` Ming Lei @ 2022-08-18 5:24 ` Ming Lei 2022-08-18 13:50 ` Chris Murphy 2022-08-19 19:20 ` Chris Murphy 2 siblings, 2 replies; 58+ messages in thread From: Ming Lei @ 2022-08-18 5:24 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: > > > On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: > > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: > >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > >> > >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > > Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. > > https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing > Also please test the following one too: diff --git a/block/blk-mq.c b/block/blk-mq.c index 5ee62b95f3e5..d01c64be08e2 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, if (!needs_restart || (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) blk_mq_run_hw_queue(hctx, true); - else if (needs_restart && needs_resource) + else if (needs_restart && (needs_resource || + blk_mq_is_shared_tags(hctx->flags))) blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); blk_mq_update_dispatch_busy(hctx, true); Thanks, Ming ^ permalink raw reply related [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 5:24 ` Ming Lei @ 2022-08-18 13:50 ` Chris Murphy 2022-08-18 15:10 ` Ming Lei 2022-08-19 19:20 ` Chris Murphy 1 sibling, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-18 13:50 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: > > Also please test the following one too: > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 5ee62b95f3e5..d01c64be08e2 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx > *hctx, struct list_head *list, > if (!needs_restart || > (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) > blk_mq_run_hw_queue(hctx, true); > - else if (needs_restart && needs_resource) > + else if (needs_restart && (needs_resource || > + blk_mq_is_shared_tags(hctx->flags))) > blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); > > blk_mq_update_dispatch_busy(hctx, true); > Should I test both patches at the same time, or separately? On top of v5.17 clean, or with b6e68ee82585 still reverted? -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 13:50 ` Chris Murphy @ 2022-08-18 15:10 ` Ming Lei 0 siblings, 0 replies; 58+ messages in thread From: Ming Lei @ 2022-08-18 15:10 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022 at 9:50 PM Chris Murphy <lists@colorremedies.com> wrote: > > > > On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: > > > > > Also please test the following one too: > > > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 5ee62b95f3e5..d01c64be08e2 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx > > *hctx, struct list_head *list, > > if (!needs_restart || > > (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) > > blk_mq_run_hw_queue(hctx, true); > > - else if (needs_restart && needs_resource) > > + else if (needs_restart && (needs_resource || > > + blk_mq_is_shared_tags(hctx->flags))) > > blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); > > > > blk_mq_update_dispatch_busy(hctx, true); > > > > Should I test both patches at the same time, or separately? On top of v5.17 clean, or with b6e68ee82585 still reverted? Please test it separately against v5.17. thanks, ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-18 5:24 ` Ming Lei 2022-08-18 13:50 ` Chris Murphy @ 2022-08-19 19:20 ` Chris Murphy 2022-08-20 7:00 ` Ming Lei 1 sibling, 1 reply; 58+ messages in thread From: Chris Murphy @ 2022-08-19 19:20 UTC (permalink / raw) To: Ming Lei Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: > On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >> >> >> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >> > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >> >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >> >> >> >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >> >> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. >> >> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >> > > Also please test the following one too: > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 5ee62b95f3e5..d01c64be08e2 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx > *hctx, struct list_head *list, > if (!needs_restart || > (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) > blk_mq_run_hw_queue(hctx, true); > - else if (needs_restart && needs_resource) > + else if (needs_restart && (needs_resource || > + blk_mq_is_shared_tags(hctx->flags))) > blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); > > blk_mq_update_dispatch_busy(hctx, true); > With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-19 19:20 ` Chris Murphy @ 2022-08-20 7:00 ` Ming Lei 2022-09-01 7:02 ` Yu Kuai 0 siblings, 1 reply; 58+ messages in thread From: Ming Lei @ 2022-08-20 7:00 UTC (permalink / raw) To: Chris Murphy Cc: Nikolay Borisov, Jens Axboe, Jan Kara, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: > > > On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: > > On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: > >> > >> > >> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: > >> > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: > >> >> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > >> >> > >> >>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > >> > >> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. > >> > >> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing > >> > > > > Also please test the following one too: > > > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > index 5ee62b95f3e5..d01c64be08e2 100644 > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx > > *hctx, struct list_head *list, > > if (!needs_restart || > > (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) > > blk_mq_run_hw_queue(hctx, true); > > - else if (needs_restart && needs_resource) > > + else if (needs_restart && (needs_resource || > > + blk_mq_is_shared_tags(hctx->flags))) > > blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); > > > > blk_mq_update_dispatch_busy(hctx, true); > > > > > With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: > https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing The log is similar with before, and the only difference is RESTART not set. 
Here follows another patch, merged in v5.18, which fixes an IO stall too; feel free to test it: 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues Thanks, Ming ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-08-20 7:00 ` Ming Lei @ 2022-09-01 7:02 ` Yu Kuai 2022-09-01 8:03 ` Jan Kara ` (2 more replies) 0 siblings, 3 replies; 58+ messages in thread From: Yu Kuai @ 2022-09-01 7:02 UTC (permalink / raw) To: Ming Lei, Chris Murphy, Jan Kara Cc: Nikolay Borisov, Jens Axboe, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) Hi, Chris 在 2022/08/20 15:00, Ming Lei 写道: > On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: >> >> >> On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: >>> On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >>>> >>>> >>>> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >>>>> On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >>>>>> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >>>>>> >>>>>>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >>>> >>>> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. >>>> >>>> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >>>> >>> >>> Also please test the following one too: >>> >>> >>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>> index 5ee62b95f3e5..d01c64be08e2 100644 >>> --- a/block/blk-mq.c >>> +++ b/block/blk-mq.c >>> @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx >>> *hctx, struct list_head *list, >>> if (!needs_restart || >>> (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) >>> blk_mq_run_hw_queue(hctx, true); >>> - else if (needs_restart && needs_resource) >>> + else if (needs_restart && (needs_resource || >>> + blk_mq_is_shared_tags(hctx->flags))) >>> blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); >>> >>> blk_mq_update_dispatch_busy(hctx, true); >>> >> >> >> With just this patch on top of 5.17.0, it still hangs. 
I've captured block debugfs log: >> https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing > > The log is similar with before, and the only difference is RESTART not > set. > > Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: > > 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues Have you tried this patch? We hit a similar problem in our tests, and I'm pretty sure about what happens at the scene. Our test environment: nvme with the bfq I/O scheduler. How the IO gets stalled: 1. hctx1 dispatches a rq from the bfq in-service queue, the bfqq becomes empty, the dispatch somehow fails, and the rq is inserted into hctx1->dispatch; a new run work is queued. 2. another hctx tries to dispatch a rq; however, the in-service bfqq is empty, so bfq_dispatch_request returns NULL and blk_mq_delay_run_hw_queues is called. 3. because of the problem described in the above patch, the run work from "hctx1" can be stalled. The above patch should fix this IO stall; however, it seems to me bfq does have a problem: the in-service bfqq doesn't expire in the following situation: 1. dispatched rqs don't complete 2. no new rq is issued to bfq Thanks, Kuai > > > > Thanks, > Ming > > . > ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-09-01 7:02 ` Yu Kuai @ 2022-09-01 8:03 ` Jan Kara 2022-09-01 8:19 ` Yu Kuai 2022-09-02 16:53 ` Chris Murphy 2022-09-06 9:45 ` Paolo Valente 2 siblings, 1 reply; 58+ messages in thread From: Jan Kara @ 2022-09-01 8:03 UTC (permalink / raw) To: Yu Kuai Cc: Ming Lei, Chris Murphy, Jan Kara, Nikolay Borisov, Jens Axboe, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) On Thu 01-09-22 15:02:03, Yu Kuai wrote: > Hi, Chris > > 在 2022/08/20 15:00, Ming Lei 写道: > > On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: > > > > > > > > > On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: > > > > On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: > > > > > > > > > > > > > > > On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: > > > > > > On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: > > > > > > > On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: > > > > > > > > > > > > > > > OK, can you post the blk-mq debugfs log after you trigger it on v5.17? > > > > > > > > > > Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. 
> > > > > > > > > > https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing > > > > > > > > > > > > > Also please test the following one too: > > > > > > > > > > > > diff --git a/block/blk-mq.c b/block/blk-mq.c > > > > index 5ee62b95f3e5..d01c64be08e2 100644 > > > > --- a/block/blk-mq.c > > > > +++ b/block/blk-mq.c > > > > @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx > > > > *hctx, struct list_head *list, > > > > if (!needs_restart || > > > > (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) > > > > blk_mq_run_hw_queue(hctx, true); > > > > - else if (needs_restart && needs_resource) > > > > + else if (needs_restart && (needs_resource || > > > > + blk_mq_is_shared_tags(hctx->flags))) > > > > blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); > > > > > > > > blk_mq_update_dispatch_busy(hctx, true); > > > > > > > > > > > > > With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: > > > https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing > > > > The log is similar with before, and the only difference is RESTART not > > set. > > > > Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: > > > > 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues > > Have you tried this patch? > > We meet a similar problem in our test, and I'm pretty sure about the > situation at the scene, > > Our test environment:nvme with bfq ioscheduler, > > How io is stalled: > > 1. hctx1 dispatch rq from bfq in service queue, bfqq becomes empty, > dispatch somehow fails and rq is inserted to hctx1->dispatch, new run > work is queued. > > 2. other hctx tries to dispatch rq, however, in service bfqq is > empty, bfq_dispatch_request return NULL, thus > blk_mq_delay_run_hw_queues is called. > > 3. for the problem described in above patch,run work from "hctx1" > can be stalled. 
> > Above patch should fix this io stall, however, it seems to me bfq do > have some problems that in service bfqq doesn't expire under following > situation: > > 1. dispatched rqs don't complete > 2. no new rq is issued to bfq And I guess: 3. there are requests queued in other bfqqs ? Otherwise I don't see a point in expiring current bfqq because there's nothing bfq could do anyway. But under normal circumstances the request completion should not take so long so I don't think it would be really worth it to implement some special mechanism for this in bfq. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-09-01 8:03 ` Jan Kara @ 2022-09-01 8:19 ` Yu Kuai 2022-09-06 9:49 ` Paolo Valente 0 siblings, 1 reply; 58+ messages in thread From: Yu Kuai @ 2022-09-01 8:19 UTC (permalink / raw) To: Jan Kara, Yu Kuai Cc: Ming Lei, Chris Murphy, Nikolay Borisov, Jens Axboe, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) 在 2022/09/01 16:03, Jan Kara 写道: > On Thu 01-09-22 15:02:03, Yu Kuai wrote: >> Hi, Chris >> >> 在 2022/08/20 15:00, Ming Lei 写道: >>> On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: >>>> >>>> >>>> On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: >>>>> On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >>>>>> >>>>>> >>>>>> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >>>>>>> On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >>>>>>>> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >>>>>>>> >>>>>>>>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >>>>>> >>>>>> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. 
>>>>>> >>>>>> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >>>>>> >>>>> >>>>> Also please test the following one too: >>>>> >>>>> >>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>>>> index 5ee62b95f3e5..d01c64be08e2 100644 >>>>> --- a/block/blk-mq.c >>>>> +++ b/block/blk-mq.c >>>>> @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx >>>>> *hctx, struct list_head *list, >>>>> if (!needs_restart || >>>>> (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) >>>>> blk_mq_run_hw_queue(hctx, true); >>>>> - else if (needs_restart && needs_resource) >>>>> + else if (needs_restart && (needs_resource || >>>>> + blk_mq_is_shared_tags(hctx->flags))) >>>>> blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); >>>>> >>>>> blk_mq_update_dispatch_busy(hctx, true); >>>>> >>>> >>>> >>>> With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: >>>> https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing >>> >>> The log is similar with before, and the only difference is RESTART not >>> set. >>> >>> Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: >>> >>> 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues >> >> Have you tried this patch? >> >> We meet a similar problem in our test, and I'm pretty sure about the >> situation at the scene, >> >> Our test environment:nvme with bfq ioscheduler, >> >> How io is stalled: >> >> 1. hctx1 dispatch rq from bfq in service queue, bfqq becomes empty, >> dispatch somehow fails and rq is inserted to hctx1->dispatch, new run >> work is queued. >> >> 2. other hctx tries to dispatch rq, however, in service bfqq is >> empty, bfq_dispatch_request return NULL, thus >> blk_mq_delay_run_hw_queues is called. >> >> 3. for the problem described in above patch,run work from "hctx1" >> can be stalled. 
>> >> Above patch should fix this io stall, however, it seems to me bfq do >> have some problems that in service bfqq doesn't expire under following >> situation: >> >> 1. dispatched rqs don't complete >> 2. no new rq is issued to bfq > > And I guess: > 3. there are requests queued in other bfqqs > ? Yes, of course, other bfqqs still have requests, but the current implementation has a flaw: even if other bfqqs don't have requests, bfq_asymmetric_scenario() can still return true because num_groups_with_pending_reqs > 0. We tried to fix this; however, there seems to be some misunderstanding with Paolo, and it's not applied to mainline yet... Thanks, Kuai > > Otherwise I don't see a point in expiring current bfqq because there's > nothing bfq could do anyway. But under normal circumstances the request > completion should not take so long so I don't think it would be really > worth it to implement some special mechanism for this in bfq. > > Honza > ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-09-01 8:19 ` Yu Kuai @ 2022-09-06 9:49 ` Paolo Valente 0 siblings, 0 replies; 58+ messages in thread From: Paolo Valente @ 2022-09-06 9:49 UTC (permalink / raw) To: Yu Kuai Cc: Jan Kara, Ming Lei, Chris Murphy, Nikolay Borisov, Jens Axboe, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) > Il giorno 1 set 2022, alle ore 10:19, Yu Kuai <yukuai1@huaweicloud.com> ha scritto: > > 在 2022/09/01 16:03, Jan Kara 写道: >> On Thu 01-09-22 15:02:03, Yu Kuai wrote: >>> Hi, Chris >>> >>> 在 2022/08/20 15:00, Ming Lei 写道: >>>> On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: >>>>> >>>>> >>>>> On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: >>>>>> On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >>>>>>> >>>>>>> >>>>>>> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >>>>>>>> On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >>>>>>>>> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >>>>>>>>> >>>>>>>>>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >>>>>>> >>>>>>> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. 
>>>>>>> >>>>>>> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >>>>>>> >>>>>> >>>>>> Also please test the following one too: >>>>>> >>>>>> >>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>>>>> index 5ee62b95f3e5..d01c64be08e2 100644 >>>>>> --- a/block/blk-mq.c >>>>>> +++ b/block/blk-mq.c >>>>>> @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx >>>>>> *hctx, struct list_head *list, >>>>>> if (!needs_restart || >>>>>> (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) >>>>>> blk_mq_run_hw_queue(hctx, true); >>>>>> - else if (needs_restart && needs_resource) >>>>>> + else if (needs_restart && (needs_resource || >>>>>> + blk_mq_is_shared_tags(hctx->flags))) >>>>>> blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); >>>>>> >>>>>> blk_mq_update_dispatch_busy(hctx, true); >>>>>> >>>>> >>>>> >>>>> With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: >>>>> https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing >>>> >>>> The log is similar with before, and the only difference is RESTART not >>>> set. >>>> >>>> Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: >>>> >>>> 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues >>> >>> Have you tried this patch? >>> >>> We meet a similar problem in our test, and I'm pretty sure about the >>> situation at the scene, >>> >>> Our test environment:nvme with bfq ioscheduler, >>> >>> How io is stalled: >>> >>> 1. hctx1 dispatch rq from bfq in service queue, bfqq becomes empty, >>> dispatch somehow fails and rq is inserted to hctx1->dispatch, new run >>> work is queued. >>> >>> 2. other hctx tries to dispatch rq, however, in service bfqq is >>> empty, bfq_dispatch_request return NULL, thus >>> blk_mq_delay_run_hw_queues is called. >>> >>> 3. for the problem described in above patch,run work from "hctx1" >>> can be stalled. 
>>> >>> Above patch should fix this io stall, however, it seems to me bfq do >>> have some problems that in service bfqq doesn't expire under following >>> situation: >>> >>> 1. dispatched rqs don't complete >>> 2. no new rq is issued to bfq >> And I guess: >> 3. there are requests queued in other bfqqs >> ? > > Yes, of course, other bfqqs still have requests, but current > implementation have flaws that even if other bfqqs doesn't have > requests, bfq_asymmetric_scenario() can still return true because > num_groups_with_pending_reqs > 0. We tried to fix this, however, there > seems to be some misunderstanding with Paolo, and it's not applied to > mainline yet... > I think this is an unsolved performance issue (being solved patiently by Yu Kuai), but not a functional flaw. The solution of this issue would probably solve this stall, but not the essential problem: refcounting gets broken if reqs disappear for bfq without any notification. Thanks, Paolo > Thanks, > Kuai >> Otherwise I don't see a point in expiring current bfqq because there's >> nothing bfq could do anyway. But under normal circumstances the request >> completion should not take so long so I don't think it would be really >> worth it to implement some special mechanism for this in bfq. >> Honza ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-09-01 7:02 ` Yu Kuai 2022-09-01 8:03 ` Jan Kara @ 2022-09-02 16:53 ` Chris Murphy 2022-09-06 9:45 ` Paolo Valente 2 siblings, 0 replies; 58+ messages in thread From: Chris Murphy @ 2022-09-02 16:53 UTC (permalink / raw) To: Yu Kuai, Ming Lei, Jan Kara Cc: Nikolay Borisov, Jens Axboe, Paolo Valente, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) On Thu, Sep 1, 2022, at 3:02 AM, Yu Kuai wrote: > Hi, Chris >> Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: >> >> 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues > > Have you tried this patch? The problem still happens on 5.18 series kernels, but it takes longer to appear. Once I regain access to this setup, I can try to reproduce on 5.18 and 5.19, and provide block debugfs logs. -- Chris Murphy ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression since linux 5.12, through 5.18 2022-09-01 7:02 ` Yu Kuai 2022-09-01 8:03 ` Jan Kara 2022-09-02 16:53 ` Chris Murphy @ 2022-09-06 9:45 ` Paolo Valente 2 siblings, 0 replies; 58+ messages in thread From: Paolo Valente @ 2022-09-06 9:45 UTC (permalink / raw) To: Yu Kuai Cc: Ming Lei, Chris Murphy, Jan Kara, Nikolay Borisov, Jens Axboe, Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, Josef Bacik, yukuai (C) > Il giorno 1 set 2022, alle ore 09:02, Yu Kuai <yukuai1@huaweicloud.com> ha scritto: > > Hi, Chris > > 在 2022/08/20 15:00, Ming Lei 写道: >> On Fri, Aug 19, 2022 at 03:20:25PM -0400, Chris Murphy wrote: >>> >>> >>> On Thu, Aug 18, 2022, at 1:24 AM, Ming Lei wrote: >>>> On Thu, Aug 18, 2022 at 12:27:04AM -0400, Chris Murphy wrote: >>>>> >>>>> >>>>> On Thu, Aug 18, 2022, at 12:18 AM, Chris Murphy wrote: >>>>>> On Thu, Aug 18, 2022, at 12:12 AM, Chris Murphy wrote: >>>>>>> On Wed, Aug 17, 2022, at 11:41 PM, Ming Lei wrote: >>>>>>> >>>>>>>> OK, can you post the blk-mq debugfs log after you trigger it on v5.17? >>>>> >>>>> Same boot, 3rd log. But the load is above 300 so I kinda need to sysrq+b soon. 
>>>>> >>>>> https://drive.google.com/file/d/1375H558kqPTdng439rvG6LuXXWPXLToo/view?usp=sharing >>>>> >>>> >>>> Also please test the following one too: >>>> >>>> >>>> diff --git a/block/blk-mq.c b/block/blk-mq.c >>>> index 5ee62b95f3e5..d01c64be08e2 100644 >>>> --- a/block/blk-mq.c >>>> +++ b/block/blk-mq.c >>>> @@ -1991,7 +1991,8 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx >>>> *hctx, struct list_head *list, >>>> if (!needs_restart || >>>> (no_tag && list_empty_careful(&hctx->dispatch_wait.entry))) >>>> blk_mq_run_hw_queue(hctx, true); >>>> - else if (needs_restart && needs_resource) >>>> + else if (needs_restart && (needs_resource || >>>> + blk_mq_is_shared_tags(hctx->flags))) >>>> blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY); >>>> >>>> blk_mq_update_dispatch_busy(hctx, true); >>>> >>> >>> >>> With just this patch on top of 5.17.0, it still hangs. I've captured block debugfs log: >>> https://drive.google.com/file/d/1ic4YHxoL9RrCdy_5FNdGfh_q_J3d_Ft0/view?usp=sharing >> The log is similar with before, and the only difference is RESTART not >> set. >> Also follows another patch merged to v5.18 and it fixes io stall too, feel free to test it: >> 8f5fea65b06d blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues > > Have you tried this patch? > > We meet a similar problem in our test, and I'm pretty sure about the > situation at the scene, > > Our test environment:nvme with bfq ioscheduler, > > How io is stalled: > > 1. hctx1 dispatch rq from bfq in service queue, bfqq becomes empty, > dispatch somehow fails and rq is inserted to hctx1->dispatch, new run > work is queued. > > 2. other hctx tries to dispatch rq, however, in service bfqq is > empty, bfq_dispatch_request return NULL, thus > blk_mq_delay_run_hw_queues is called. > > 3. for the problem described in above patch,run work from "hctx1" > can be stalled. 
> > Above patch should fix this io stall, however, it seems to me bfq do > have some problems that in service bfqq doesn't expire under following > situation: > > 1. dispatched rqs don't complete > 2. no new rq is issued to bfq > There may be one more important problem: is bfq_finish_requeue_request eventually invoked for the failed rq? If it is not, then a memory leak follows, because refcounting gets unavoidably unbalanced. In contrast, if bfq_finish_requeue_request is correctly invoked, then no stall should occur. Thanks, Paolo > Thanks, > Kuai >> Thanks, >> Ming >> . ^ permalink raw reply [flat|nested] 58+ messages in thread
* Re: stalling IO regression in linux 5.12 2022-08-10 16:35 stalling IO regression in linux 5.12 Chris Murphy 2022-08-10 17:48 ` Josef Bacik @ 2022-08-15 11:25 ` Thorsten Leemhuis 1 sibling, 0 replies; 58+ messages in thread From: Thorsten Leemhuis @ 2022-08-15 11:25 UTC (permalink / raw) To: Btrfs BTRFS, Linux-RAID, linux-block, linux-kernel, regressions [TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few template paragraphs you might have encountered already in similar form.] Hi, this is your Linux kernel regression tracker. On 10.08.22 18:35, Chris Murphy wrote: > CPU: Intel E5-2680 v3 > RAM: 128 G > 02:00.0 RAID bus controller [0104]: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] [1000:005d] (rev 02), using megaraid_sas driver > 8 Disks: TOSHIBA AL13SEB600 > > > The problem exhibits as increasing load, increasing IO pressure (PSI), and actual IO goes to zero. It never happens on kernel 5.11 series, and always happens after 5.12-rc1 and persists through 5.18.0. There's a new mix of behaviors with 5.19, I suspect the mm improvements in this series might be masking the problem. > > The workload involves openqa, which spins up 30 qemu-kvm instances, and does a bunch of tests, generating quite a lot of writes: qcow2 files, and video in the form of many screenshots, and various log files, for each VM. These VMs are each in their own cgroup. As the problem begins, I see increasing IO pressure, and decreasing IO, for each qemu instance's cgroup, and the cgroups for httpd, journald, auditd, and postgresql. IO pressure goes to nearly ~99% and IO is literally 0. > > The problem left unattended to progress will eventually result in a completely unresponsive system, with no kernel messages. 
It reproduces in the following configurations, the first two I provide links to full dmesg with sysrq+w: > > btrfs raid10 (native) on plain partitions [1] > btrfs single/dup on dmcrypt on mdadm raid 10 and parity raid [2] > XFS on dmcrypt on mdadm raid10 or parity raid > > I've started a bisect, but for some reason I haven't figured out I've started getting compiled kernels that don't boot the hardware. The failure is very early on such that the UUID for the root file system isn't found, but not much to go on as to why.[3] I have tested the first and last skipped commits in the bisect log below, they successfully boot a VM but not the hardware. > > Anyway, I'm kinda stuck at this point trying to narrow it down further. Any suggestions? Thanks. > > [1] btrfs raid10, plain partitions > https://drive.google.com/file/d/1-oT3MX-hHYtQqI0F3SpgPjCIDXXTysLU/view?usp=sharing > > [2] btrfs single/dup, dmcrypt, mdadm raid10 > https://drive.google.com/file/d/1m_T3YYaEjBKUROz6dHt5_h92ZVRji9FM/view?usp=sharing > > [3] > $ git bisect log > git bisect start > # status: waiting for both good and bad commits > # bad: [c03c21ba6f4e95e406a1a7b4c34ef334b977c194] Merge tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs > git bisect bad c03c21ba6f4e95e406a1a7b4c34ef334b977c194 > # status: waiting for good commit(s), bad commit known > # good: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11 > git bisect good f40ddce88593482919761f74910f42f4b84c004b > # bad: [df24212a493afda0d4de42176bea10d45825e9a0] Merge tag 's390-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux > git bisect bad df24212a493afda0d4de42176bea10d45825e9a0 > # good: [82851fce6107d5a3e66d95aee2ae68860a732703] Merge tag 'arm-dt-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc > git bisect good 82851fce6107d5a3e66d95aee2ae68860a732703 > # good: [99f1a5872b706094ece117368170a92c66b2e242] Merge tag 'nfsd-5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux > git bisect good 99f1a5872b706094ece117368170a92c66b2e242 > # bad: [9eef02334505411667a7b51a8f349f8c6c4f3b66] Merge tag 'locking-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip > git bisect bad 9eef02334505411667a7b51a8f349f8c6c4f3b66 > # bad: [9820b4dca0f9c6b7ab8b4307286cdace171b724d] Merge tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-block > git bisect bad 9820b4dca0f9c6b7ab8b4307286cdace171b724d > # good: [bd018bbaa58640da786d4289563e71c5ef3938c7] Merge tag 'for-5.12/libata-2021-02-17' of git://git.kernel.dk/linux-block > git bisect good bd018bbaa58640da786d4289563e71c5ef3938c7 > # skip: [203c018079e13510f913fd0fd426370f4de0fd05] Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.12/drivers > git bisect skip 203c018079e13510f913fd0fd426370f4de0fd05 > # skip: [49d1ec8573f74ff1e23df1d5092211de46baa236] block: manage bio slab cache by xarray > git bisect skip 49d1ec8573f74ff1e23df1d5092211de46baa236 > # bad: [73d90386b559d6f4c3c5db5e6bb1b68aae8fd3e7] nvme: cleanup zone information initialization > git bisect bad 73d90386b559d6f4c3c5db5e6bb1b68aae8fd3e7 > # skip: [71217df39dc67a0aeed83352b0d712b7892036a2] block, bfq: make waker-queue detection more robust > git bisect skip 71217df39dc67a0aeed83352b0d712b7892036a2 > # bad: [8358c28a5d44bf0223a55a2334086c3707bb4185] block: fix memory leak of bvec > git bisect bad 8358c28a5d44bf0223a55a2334086c3707bb4185 > # skip: [3a905c37c3510ea6d7cfcdfd0f272ba731286560] block: skip bio_check_eod for partition-remapped bios > git bisect skip 3a905c37c3510ea6d7cfcdfd0f272ba731286560 > # skip: [3c337690d2ebb7a01fa13bfa59ce4911f358df42] block, bfq: avoid spurious switches to soft_rt of interactive queues > git bisect skip 3c337690d2ebb7a01fa13bfa59ce4911f358df42 > # skip: [3e1a88ec96259282b9a8b45c3f1fda7a3ff4f6ea] bio: add a helper calculating nr segments to alloc > git bisect skip 
3e1a88ec96259282b9a8b45c3f1fda7a3ff4f6ea > # skip: [4eb1d689045552eb966ebf25efbc3ce648797d96] blk-crypto: use bio_kmalloc in blk_crypto_clone_bio > git bisect skip 4eb1d689045552eb966ebf25efbc3ce648797d96 Thanks for the report. To be sure below issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced v5.11..v5.12-rc1 #regzbot ignore-activity This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply -- ideally with also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/ Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained for in the Linux kernel's documentation; above webpage explains why this is important for tracked regressions. Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight. ^ permalink raw reply [flat|nested] 58+ messages in thread