linux-next.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Steffen Maier <maier@linux.ibm.com>,
	Yi Zhang <yi.zhang@redhat.com>,
	linux-block <linux-block@vger.kernel.org>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [bug report] WARNING: CPU: 1 PID: 1386 at block/blk-mq-sched.c:432 blk_mq_sched_insert_request+0x54/0x178
Date: Tue, 2 Nov 2021 14:03:37 -0600
Message-ID: <4f3811f6-88d9-c0c6-055f-1a3220357e22@kernel.dk>
In-Reply-To: <e9965a7c-faba-496e-8110-dbe8f7065080@kernel.dk>

On 11/2/21 1:02 PM, Jens Axboe wrote:
> On 11/2/21 1:00 PM, Steffen Maier wrote:
>> On 11/2/21 07:42, Yi Zhang wrote:
>>> Below WARNING was triggered with blktests srp/001 on the latest
>>> linux-block/for-next, and it cannot be reproduced with v5.15, pls help
>>> check it, thanks.
>>>
>>> 88d2c6ab15f7 (origin/for-next) Merge branch 'for-5.16/block' into for-next
>>
>> Same warning here with a slightly different stack trace.
>> It breaks root-fs on zfcp-attached SCSI disks for us, because we run our CI 
>> intentionally with panic_on_warn.
>>
>>> [    9.031740] ------------[ cut here ]------------
>>> [    9.031743] WARNING: CPU: 13 PID: 196 at block/blk-mq-sched.c:432 blk_mq_sched_insert_request+0x54/0x178
>>> [    9.031751] Modules linked in: nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) dm_service_time(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) ip_set(E) nf_tables(E) nfnetlink(E) sunrpc(E) zfcp(E) scsi_transport_fc(E) dm_multipath(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) s390_trng(E) vfio_ccw(E) mdev(E) vfio_iommu_type1(E) zcrypt_cex4(E) vfio(E) eadm_sch(E) sch_fq_codel(E) configfs(E) ip_tables(E) x_tables(E) ghash_s390(E) prng(E) aes_s390(E) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha256_s390(E) sha1_s390(E) sha_common(E) pkey(E) zcrypt(E) rng_core(E) autofs4(E)
>>> [    9.031785] CPU: 13 PID: 196 Comm: kworker/13:2 Tainted: G            E     5.16.0-20211102.rc0.git0.9febf1194306.300.fc34.s390x+next #1
>>> [    9.031789] Hardware name: IBM 3906 M04 704 (LPAR)
>>> [    9.031791] Workqueue: kaluad alua_rtpg_work [scsi_dh_alua]
>>> [    9.031795] Krnl PSW : 0704e00180000000 000000006558e948 (blk_mq_sched_insert_request+0x58/0x178)
>>> [    9.031800]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
>>> [    9.031803] Krnl GPRS: 0000000000000080 00000000000004c6 00000000ade56000 0000000000000001
>>> [    9.031806]            0000000000000001 0000000000000000 00000000a2d6a400 0000000084003c00
>>> [    9.031808]            0000000000000000 0000000000000001 0000000000000001 00000000ade56000
>>> [    9.031810]            000000008aef0000 000003ff7af59400 000003800e3d7b00 000003800e3d7a90
>>> [    9.031817] Krnl Code: 000000006558e93c: a71effff		chi	%r1,-1
>>>                           000000006558e940: a7840004		brc	8,000000006558e948
>>>                          #000000006558e944: af000000		mc	0,0
>>>                          >000000006558e948: 5810b01c		l	%r1,28(%r11)
>>>                           000000006558e94c: ec213bbb0055	risbg	%r2,%r1,59,187,0
>>>                           000000006558e952: a7740057		brc	7,000000006558ea00
>>>                           000000006558e956: 5810b018		l	%r1,24(%r11)
>>>                           000000006558e95a: c01b000000ff	nilf	%r1,255
>>> [    9.031833] Call Trace:
>>> [    9.031835]  [<000000006558e948>] blk_mq_sched_insert_request+0x58/0x178 
>>> [    9.031838]  [<000000006557effe>] blk_execute_rq+0x56/0xd8 
>>> [    9.031841]  [<0000000065768708>] __scsi_execute+0x118/0x240 
>>> [    9.031847]  [<000003ff803c3788>] alua_rtpg+0x120/0x8f8 [scsi_dh_alua] 
>>> [    9.031849]  [<000003ff803c402c>] alua_rtpg_work+0xcc/0x648 [scsi_dh_alua] 
>>> [    9.031852]  [<0000000064f024d2>] process_one_work+0x1fa/0x470 
>>> [    9.031856]  [<0000000064f02c74>] worker_thread+0x64/0x498 
>>> [    9.031859]  [<0000000064f0a894>] kthread+0x17c/0x188 
>>> [    9.031861]  [<0000000064e933c4>] __ret_from_fork+0x3c/0x58 
>>> [    9.031864]  [<0000000065a71cea>] ret_from_fork+0xa/0x40 
>>> [    9.031868] Last Breaking-Event-Address:
>>> [    9.031869]  [<000000006557ef72>] blk_execute_rq_nowait+0x82/0x98
>>> [    9.031871] Kernel panic - not syncing: panic_on_warn set ...
> 
> I'm looking into this one, it's a bit puzzling. The WARN is:
> 
> WARN_ON(e && (rq->tag != BLK_MQ_NO_TAG));
> 
> which is "we have an elevator", yet the tag isn't initialized to BLK_MQ_NO_TAG.
> That seems to hint at the initialization changes there, but nothing sticks out
> there for me.
> 
> I'll keep looking.

Can either of you try this patch? It won't fix anything, but it should
hopefully shed a bit of light on the issue.


diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 4a6789e4398b..1b7647722ec0 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -429,7 +429,8 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head,
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
-	WARN_ON(e && (rq->tag != BLK_MQ_NO_TAG));
+	if (e && (rq->tag != BLK_MQ_NO_TAG))
+		printk("tag=%d/%d, e=%lx, rq cmd_flags %x, rq_flags %x\n", rq->tag, rq->internal_tag, (long) e, rq->cmd_flags, rq->rq_flags);
 
 	if (blk_mq_sched_bypass_insert(hctx, rq)) {
 		/*

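For anyone who wants to reason about the condition outside the kernel, here
is a minimal user-space sketch of the check the debug print keys on. It is
not kernel code and assumes nothing beyond what is quoted above:
MODEL_NO_TAG, struct model_request, sched_tag_unexpected() and
report_if_unexpected() are hypothetical stand-ins for BLK_MQ_NO_TAG, the
relevant struct request fields, and the patched check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for BLK_MQ_NO_TAG; not the kernel's definition. */
#define MODEL_NO_TAG (-1)

/* Hypothetical model holding only the fields the debug print reads. */
struct model_request {
	int tag;
	int internal_tag;
	unsigned int cmd_flags;
	unsigned int rq_flags;
};

/*
 * True when an elevator is present but the request's tag is not the
 * "no tag" sentinel, i.e. the condition the original WARN_ON tested.
 */
static bool sched_tag_unexpected(const void *elevator,
				 const struct model_request *rq)
{
	return elevator != NULL && rq->tag != MODEL_NO_TAG;
}

/*
 * Mirrors the shape of the debug print in the patch: report the request
 * state only when the condition holds. Returns whether it printed.
 */
static bool report_if_unexpected(const void *elevator,
				 const struct model_request *rq)
{
	if (!sched_tag_unexpected(elevator, rq))
		return false;

	printf("tag=%d/%d, e=%lx, rq cmd_flags %x, rq_flags %x\n",
	       rq->tag, rq->internal_tag,
	       (unsigned long)(uintptr_t)elevator,
	       rq->cmd_flags, rq->rq_flags);
	return true;
}
```

Calling report_if_unexpected() with a non-NULL elevator pointer and a request
whose tag is anything other than MODEL_NO_TAG prints one line in the same
layout as the patched printk; every other combination stays silent.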
-- 
Jens Axboe


