From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Matthew R. Ochs"
Subject: Re: [PATCH 1/7] cxlflash: Yield to active send threads
Date: Wed, 16 May 2018 10:09:28 -0500
Message-ID: <20180516150928.GA8754@p8tul1-build.aus.stglabs.ibm.com>
References: <1526065440-38806-1-git-send-email-ukrishn@linux.vnet.ibm.com>
 <1526065486-38860-1-git-send-email-ukrishn@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1526065486-38860-1-git-send-email-ukrishn@linux.vnet.ibm.com>
Errors-To: linuxppc-dev-bounces+glppe-linuxppc-embedded-2=m.gmane.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
To: Uma Krishnan
Cc: James Bottomley, linux-scsi@vger.kernel.org, "Martin K. Petersen",
 Frederic Barrat, "Manoj N. Kumar", Andrew Donnellan,
 linuxppc-dev@lists.ozlabs.org, Christophe Lombard
List-Id: linux-scsi@vger.kernel.org

On Fri, May 11, 2018 at 02:04:46PM -0500, Uma Krishnan wrote:
> The following Oops may be encountered if the device is reset, i.e. EEH
> recovery, while there is heavy I/O traffic:
>
> 59:mon> t
> [c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500
> [cxlflash]
> [c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0
> [c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0
> [c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0
> [c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0
> [c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0
> [c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0
> [c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520
> [c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610
> [c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280
> [c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30
> [c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0
> [c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00
> [c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130
> [c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c
>
> When there is no room to send the I/O request, the cached room is refreshed
> by reading the memory mapped command room value from the AFU. The AFU
> register mapping is refreshed during a reset, creating a race condition
> that can lead to the Oops above.
>
> During a device reset, the AFU should not be unmapped until all the active
> send threads quiesce. An atomic counter, cmds_active, is currently used to
> track internal AFU commands and quiesce during reset. This same counter can
> also be used for the active send threads.
>
> Signed-off-by: Uma Krishnan

Acked-by: Matthew R. Ochs
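
For readers following the thread, the quiesce scheme described in the commit message can be sketched roughly as below. This is a userspace model built on C11 atomics, not the driver code from the patch: only the counter name cmds_active comes from the commit message, while struct afu_model, the resetting flag, cmd_room_reg, and every function name are hypothetical and exist only for illustration. A send thread bumps the counter before it touches the register mapping and drops it afterwards; the reset path refuses to tear down the mapping while the counter is nonzero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the AFU state. Only cmds_active is named in the
 * commit message; the other fields are assumptions made for this sketch. */
struct afu_model {
	atomic_int cmds_active;   /* internal commands plus active send threads */
	atomic_bool resetting;    /* set once a reset has started */
	uint64_t *cmd_room_reg;   /* stands in for the memory mapped room register */
};

static void afu_model_init(struct afu_model *afu, uint64_t *reg)
{
	atomic_init(&afu->cmds_active, 0);
	atomic_init(&afu->resetting, false);
	afu->cmd_room_reg = reg;
}

/* Send path entry: announce the sender first, then check for a reset.
 * Returns false (with the counter restored) if the sender must back off. */
static bool send_enter(struct afu_model *afu)
{
	atomic_fetch_add(&afu->cmds_active, 1);
	if (atomic_load(&afu->resetting)) {
		atomic_fetch_sub(&afu->cmds_active, 1);
		return false;
	}
	return true;
}

/* Refresh the command room; only valid between send_enter() and send_exit(). */
static uint64_t read_room(const struct afu_model *afu)
{
	return *afu->cmd_room_reg;
}

/* Send path exit: the sender no longer touches the mapping. */
static void send_exit(struct afu_model *afu)
{
	int prev = atomic_fetch_sub(&afu->cmds_active, 1);

	assert(prev > 0);
	(void)prev;
}

/* Reset path: block new senders, then unmap only once the counter drains.
 * A real reset would sleep and retry until this returns true. */
static bool reset_try_unmap(struct afu_model *afu)
{
	atomic_store(&afu->resetting, true);
	if (atomic_load(&afu->cmds_active) != 0)
		return false;             /* senders still active: yield to them */
	afu->cmd_room_reg = NULL;     /* safe: nobody can be reading the register */
	return true;
}
```

The ordering matters in this model: the sender increments before checking the flag, and the reset path sets the flag before reading the counter, so at least one side always observes the other. A caller on the reset side would be expected to loop on reset_try_unmap() with a sleep between attempts, which is the "yield to active send threads" behaviour the subject line refers to.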