linux-block.vger.kernel.org archive mirror
From: Paolo Valente <paolo.valente@linaro.org>
To: Yi Zhang <yi.zhang@redhat.com>
Cc: linux-block <linux-block@vger.kernel.org>
Subject: Re: [bisected] bfq regression on latest linux-block/for-next
Date: Fri, 2 Apr 2021 15:39:51 +0200
Message-ID: <4F41414B-05F8-4E7D-A312-8A47B8468C78@linaro.org>
In-Reply-To: <1366443410.1203736.1617240454513.JavaMail.zimbra@redhat.com>



> On 1 Apr 2021, at 03:27, Yi Zhang <yi.zhang@redhat.com> wrote:
> 
> Hi

Hi

> We reproduced this bfq regression [3] on ppc64le with blktests [2] on the latest linux-block/for-next branch. According to my bisection, it was introduced by [1]. Please help check it, and let me know if you need any testing for it. Thanks.
> 

Thanks for reporting this bug and finding the candidate offending commit. Could you try this test with my dev kernel, which might provide more information? The kernel is here:
https://github.com/Algodev-github/bfq-mq

Alternatively, I could try to provide you with patches to instrument your kernel.

The first option may be quicker.

In both cases, having KASAN active could be rather helpful too.
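
Before rerunning the test, it may be worth confirming that the kernel under test really has KASAN enabled. The following is only a minimal sketch: it assumes the kernel configuration is available as text in the usual `.config` format, and the helper names (`parse_kconfig`, `kasan_status`, `kasan_enabled`) are hypothetical, not part of any kernel tooling. Whether KASAN can be enabled at all depends on the architecture and kernel version.

```python
import re

# Options inspected by this sketch. CONFIG_KASAN is the main switch;
# the other two select the KASAN mode and instrumentation style.
KASAN_OPTIONS = ("CONFIG_KASAN", "CONFIG_KASAN_GENERIC", "CONFIG_KASAN_INLINE")


def parse_kconfig(text):
    """Map each option in .config-style text to its value.

    Lines of the form '# CONFIG_X is not set' are recorded as 'n'.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if match:
            options[match.group(1)] = "n"
    return options


def kasan_status(text):
    """Return the value of each KASAN-related option ('n' if absent)."""
    options = parse_kconfig(text)
    return {name: options.get(name, "n") for name in KASAN_OPTIONS}


def kasan_enabled(text):
    """True if the main KASAN switch is built in."""
    return kasan_status(text)["CONFIG_KASAN"] == "y"
```

The text passed in could come from the `.config` file in the build tree, or from wherever the distribution installs the running kernel's configuration; that location is an assumption and varies between systems.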

Looking forward to your feedback,
Paolo

> [1]
> commit 430a67f9d6169a7b3e328bceb2ef9542e4153c7c (HEAD, refs/bisect/bad)
> Author: Paolo Valente <paolo.valente@linaro.org>
> Date:   Thu Mar 4 18:46:27 2021 +0100
> 
>    block, bfq: merge bursts of newly-created queues
> 
> [2] blktests: nvme_trtype=tcp ./check nvme/011
> 
> [3]
> [  109.342525] run blktests nvme/011 at 2021-03-31 20:58:58
> [  109.497429] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> [  109.512868] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
> [  109.584932] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:9b73e8531afa4320963cc96571e0acb1.
> [  109.585809] nvme nvme0: creating 128 I/O queues.
> [  109.596570] nvme nvme0: mapped 128/0/0 default/read/poll queues.
> [  109.654170] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420
> [  155.366535] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [  155.366568] nvmet: ctrl 1 fatal error occurred!
> [  155.366608] nvme nvme0: starting error recovery
> [  155.374861] block nvme0n1: no usable path - requeuing I/O
> [  155.374891] block nvme0n1: no usable path - requeuing I/O
> [  155.374911] block nvme0n1: no usable path - requeuing I/O
> [  155.374923] block nvme0n1: no usable path - requeuing I/O
> [  155.374934] block nvme0n1: no usable path - requeuing I/O
> [  155.374954] block nvme0n1: no usable path - requeuing I/O
> [  155.374973] block nvme0n1: no usable path - requeuing I/O
> [  155.374984] block nvme0n1: no usable path - requeuing I/O
> [  155.375004] block nvme0n1: no usable path - requeuing I/O
> [  155.375024] block nvme0n1: no usable path - requeuing I/O
> [  155.375674] nvme nvme0: Reconnecting in 10 seconds...
> [  180.967462] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
> [  180.967486] nvmet: ctrl 2 fatal error occurred!
> [  193.427906] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  193.427934] rcu: 	40-...0: (1 GPs behind) idle=f3e/1/0x4000000000000000 softirq=535/535 fqs=2839 
> [  193.427966] rcu: 	42-...0: (1 GPs behind) idle=792/1/0x4000000000000000 softirq=160/160 fqs=2839 
> [  193.427995] 	(detected by 5, t=6002 jiffies, g=7961, q=7219)
> [  193.428030] Sending NMI from CPU 5 to CPUs 40:
> [  199.235530] CPU 40 didn't respond to backtrace IPI, inspecting paca.
> [  199.235551] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 217 (kworker/40:0H)
> [  199.235579] Back trace of paca->saved_r1 (0xc000000012973380) (possibly stale):
> [  199.235594] Call Trace:
> [  199.235601] [c000000012973380] [c0000000129733e0] 0xc0000000129733e0 (unreliable)
> [  199.235633] [c000000012973420] [c000000000933008] bfq_allow_bio_merge+0x78/0x110
> [  199.235673] [c000000012973460] [c0000000008d7470] elv_bio_merge_ok+0x90/0xb0
> [  199.235711] [c000000012973490] [c0000000008d811c] elv_merge+0x6c/0x1b0
> [  199.235747] [c0000000129734e0] [c0000000008ea1d8] blk_mq_sched_try_merge+0x48/0x250
> [  199.235785] [c000000012973540] [c00000000092d29c] bfq_bio_merge+0xfc/0x230
> [  199.235820] [c0000000129735c0] [c0000000008f9e24] __blk_mq_sched_bio_merge+0xa4/0x210
> [  199.235848] [c000000012973620] [c0000000008f288c] blk_mq_submit_bio+0xec/0x710
> [  199.235877] [c0000000129736c0] [c0000000008dfad4] submit_bio_noacct+0x534/0x680
> [  199.235915] [c000000012973760] [c0000000005e0308] iomap_dio_submit_bio+0xd8/0x100
> [  199.235953] [c000000012973790] [c0000000005e0ed4] iomap_dio_bio_actor+0x3c4/0x570
> [  199.235991] [c000000012973830] [c0000000005db284] iomap_apply+0x1f4/0x3e0
> [  199.236017] [c000000012973920] [c0000000005e06e4] __iomap_dio_rw+0x204/0x5b0
> [  199.236054] [c0000000129739f0] [c0000000005e0ab0] iomap_dio_rw+0x20/0x80
> [  199.236090] [c000000012973a10] [c00800000e4ea660] xfs_file_dio_write_aligned+0xb8/0x1b0 [xfs]
> [  199.236230] [c000000012973a60] [c00800000e4eabfc] xfs_file_write_iter+0x104/0x190 [xfs]
> [  199.236368] [c000000012973a90] [c008000010440724] nvmet_file_submit_bvec+0xfc/0x160 [nvmet]
> [  199.236412] [c000000012973b10] [c008000010440b54] nvmet_file_execute_io+0x2cc/0x3b0 [nvmet]
> [  199.236455] [c000000012973b90] [c00800000fa142d8] nvmet_tcp_io_work+0xce0/0xd10 [nvmet_tcp]
> [  199.236493] [c000000012973c70] [c000000000179534] process_one_work+0x294/0x580
> [  199.236533] [c000000012973d10] [c0000000001798c8] worker_thread+0xa8/0x650
> [  199.236568] [c000000012973da0] [c000000000184180] kthread+0x190/0x1a0
> [  199.236595] [c000000012973e10] [c00000000000d4ec] ret_from_kernel_thread+0x5c/0x70
> [  199.236710] Sending NMI from CPU 5 to CPUs 42:
> [  205.044364] CPU 42 didn't respond to backtrace IPI, inspecting paca.
> [  205.044381] irq_soft_mask: 0x01 in_mce: 0 in_nmi: 0 current: 3004 (kworker/42:2H)
> [  205.044408] Back trace of paca->saved_r1 (0xc00000085194b520) (possibly stale):
> [  205.044423] Call Trace:
> [  205.044440] [c00000085194b520] [c00000079ac7f700] 0xc00000079ac7f700 (unreliable)
> [  205.044468] [c00000085194b540] [c00000000092d248] bfq_bio_merge+0xa8/0x230
> [  205.044494] [c00000085194b5c0] [c0000000008f9e24] __blk_mq_sched_bio_merge+0xa4/0x210
> [  205.044531] [c00000085194b620] [c0000000008f288c] blk_mq_submit_bio+0xec/0x710
> [  205.044569] [c00000085194b6c0] [c0000000008dfad4] submit_bio_noacct+0x534/0x680
> [  205.044607] [c00000085194b760] [c0000000005e0308] iomap_dio_submit_bio+0xd8/0x100
> [  205.044645] [c00000085194b790] [c0000000005e0ed4] iomap_dio_bio_actor+0x3c4/0x570
> [  205.044672] [c00000085194b830] [c0000000005db284] iomap_apply+0x1f4/0x3e0
> [  205.044698] [c00000085194b920] [c0000000005e06e4] __iomap_dio_rw+0x204/0x5b0
> [  205.044735] [c00000085194b9f0] [c0000000005e0ab0] iomap_dio_rw+0x20/0x80
> [  205.044771] [c00000085194ba10] [c00800000e4ea660] xfs_file_dio_write_aligned+0xb8/0x1b0 [xfs]
> [  205.044897] [c00000085194ba60] [c00800000e4eabfc] xfs_file_write_iter+0x104/0x190 [xfs]
> [  205.045021] [c00000085194ba90] [c008000010440724] nvmet_file_submit_bvec+0xfc/0x160 [nvmet]
> [  205.045064] [c00000085194bb10] [c008000010440b54] nvmet_file_execute_io+0x2cc/0x3b0 [nvmet]
> [  205.045106] [c00000085194bb90] [c00800000fa142d8] nvmet_tcp_io_work+0xce0/0xd10 [nvmet_tcp]
> [  205.045144] [c00000085194bc70] [c000000000179534] process_one_work+0x294/0x580
> [  205.045181] [c00000085194bd10] [c0000000001798c8] worker_thread+0xa8/0x650
> [  205.045216] [c00000085194bda0] [c000000000184180] kthread+0x190/0x1a0
> [  205.045242] [c00000085194be10] [c00000000000d4ec] ret_from_kernel_thread+0x5c/0x70
> 
> 
> 
> 
> Best Regards,
>  Yi Zhang
> 
> 


Thread overview: 9+ messages
     [not found] <841664844.1202812.1617239260115.JavaMail.zimbra@redhat.com>
2021-04-01  1:27 ` [bisected] bfq regression on latest linux-block/for-next Yi Zhang
2021-04-01  1:55   ` Chaitanya Kulkarni
2021-04-01  2:05     ` Yi Zhang
2021-04-02 13:39   ` Paolo Valente [this message]
2021-04-07 15:15     ` Yi Zhang
     [not found]       ` <CAHj4cs9+q-vH9qar+MTP-aECb2whT7O8J5OmR240yss1y=kWKw@mail.gmail.com>
2021-04-20  7:33         ` Paolo Valente
2021-05-03  9:37           ` Paolo Valente
2021-05-07  8:30             ` Yi Zhang
2021-05-11 17:10               ` Paolo Valente
