From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from userp2130.oracle.com ([156.151.31.86]:48220 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733169AbeHAJEt (ORCPT ); Wed, 1 Aug 2018 05:04:49 -0400
Subject: Re: [next-20180727][qla2xxx][BUG] WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
To: Abdul Haleem, linuxppc-dev, "Madhani, Himanshu"
Cc: linux-block, linux-fsdevel, linux-ext4, linux-scsi, linux-next, Stephen Rothwell, linux-kernel, jejb@linux.vnet.ibm.com, Jens Axboe, dgilbert@interlog.com, "bart.vanassche", rosattig@br.ibm.com, kyle.mahlkuch@ibm.com
References: <1533105183.23332.15.camel@abdul>
From: "jianchao.wang"
Message-ID:
Date: Wed, 1 Aug 2018 15:19:58 +0800
MIME-Version: 1.0
In-Reply-To: <1533105183.23332.15.camel@abdul>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Hi Abdul

On 08/01/2018 02:33 PM, Abdul Haleem wrote:
> # mkfs -t ext4 /dev/mapper/mpatha
> mke2fs 1.43.1 (08-Jun-2016)
> Found a dos partition table in /dev/mapper/mpatha
> Proceed anyway? (y,n) y
> Discarding device blocks:
> qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 -- 1 2002.
> qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 -- 1 2002.
> qla2xxx [0106:a0:00.1]-801c:2: Abort command issued nexus=2:1:0 -- 1 2002.
> qla2xxx [0106:a0:00.0]-801c:0: Abort command issued nexus=0:1:0 -- 1 2002.
> WARNING: CPU: 12 PID: 511 at drivers/scsi/scsi_lib.c:691 scsi_end_request+0x250/0x280
...
> NIP [c000000000690080] scsi_end_request+0x250/0x280
> LR [c00000000068fe80] scsi_end_request+0x50/0x280
> Call Trace:
> [c00000027d39b600] [c00000000068fe80] scsi_end_request+0x50/0x280 (unreliable)
> [c00000027d39b660] [c0000000006904ac] scsi_io_completion+0x29c/0x7d0
> [c00000027d39b710] [c0000000006848e4] scsi_finish_command+0x104/0x1c0
> [c00000027d39b790] [c00000000068f148] scsi_softirq_done+0x198/0x1f0
> [c00000027d39b820] [c0000000004f2b80] blk_mq_complete_request+0x130/0x1d0
> [c00000027d39b860] [c00000000068d27c] scsi_mq_done+0x2c/0xe0
> [c00000027d39b890] [d000000004291080] qla2xxx_qpair_sp_compl+0xa8/0x140 [qla2xxx]
> [c00000027d39b900] [d0000000042cc9d0] qla2x00_process_completed_request+0x68/0x140 [qla2xxx]
> ------------[ cut here ]------------
> kernel BUG at block/blk-core.c:3196!

blk_finish_request
BUG_ON(blk_queued_rq(req))

We are also seeing a similar issue on qla2xxx: the BUG_ON in blk_finish_request is triggered when many commands are aborted.

The root cause appears to be that the qla2xxx driver still invokes scsi_done for an aborted command, which causes a race between the requeue path and the normal completion path.

Adding Himanshu Madhani from the QLogic team. It seems that they are working on this.

Thanks
Jianchao
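
To make the race described above concrete, here is a minimal user-space sketch. It is not kernel or qla2xxx code and not the actual fix; every name in it (demo_request, demo_claim, demo_abort_and_requeue, demo_complete) is hypothetical. It only illustrates the general idea that the abort/requeue path and the completion path should agree on a single owner, so that a late completion for an already aborted command is dropped instead of reaching a check like BUG_ON(blk_queued_rq(req)).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request; NOT the kernel's struct request. */
struct demo_request {
    atomic_bool claimed;  /* set by whichever path takes ownership first */
    bool queued;          /* true while the request sits on a requeue list */
    int completions;      /* number of times the completion handler ran */
};

static void demo_init(struct demo_request *rq)
{
    atomic_init(&rq->claimed, false);
    rq->queued = false;
    rq->completions = 0;
}

/* Atomically take ownership; returns true for exactly one caller. */
static bool demo_claim(struct demo_request *rq)
{
    return !atomic_exchange(&rq->claimed, true);
}

/* Abort path: on success the request is put back on the queue. */
static bool demo_abort_and_requeue(struct demo_request *rq)
{
    if (!demo_claim(rq))
        return false;     /* completion already owns the request */
    rq->queued = true;
    return true;
}

/* Completion path: models a driver calling its "done" callback. */
static bool demo_complete(struct demo_request *rq)
{
    if (!demo_claim(rq))
        return false;     /* already aborted: drop the late completion */
    assert(!rq->queued);  /* analogue of BUG_ON(blk_queued_rq(req)) */
    rq->completions++;
    return true;
}
```

With this arrangement, calling demo_complete() after a successful demo_abort_and_requeue() returns false and leaves the requeued request untouched. Without the ownership check, the same sequence would run the completion on a queued request, which is the situation the BUG_ON above reports.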