From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752167AbcKGTyH (ORCPT ); Mon, 7 Nov 2016 14:54:07 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:42286 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751736AbcKGTyE (ORCPT ); Mon, 7 Nov 2016 14:54:04 -0500 From: Mauricio Faria de Oliveira To: qla2xxx-upstream@qlogic.com Cc: jejb@linux.vnet.ibm.com, martin.petersen@oracle.com, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 2/2] qla2xxx: fix invalid DMA access after command aborts in PCI device remove Date: Mon, 7 Nov 2016 17:53:31 -0200 X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1478548411-17932-1-git-send-email-mauricfo@linux.vnet.ibm.com> References: <1478548411-17932-1-git-send-email-mauricfo@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16110719-0032-0000-0000-000002A722AC X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16110719-0033-0000-0000-00000F251FB1 Message-Id: <1478548411-17932-3-git-send-email-mauricfo@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-11-07_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1609300000 definitions=main-1611070365 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org If a command is aborted in the kernel but not in the adapter, it might be considered complete and its DMA memory released, but it is still alive in the adapter, which will trigger an invalid DMA access upon its completion (in the DMA operations to deliver the command response to the driver). On powerpc platforms with IOMMU/EEH capabilities, the problem is observed during PCI device removal with ongoing IO requests -- which might trigger an EEH event very often, pointing to a 'TCE Request Page Access Error'. In that path, which is qla2x00_remove_one(), the commands are aborted in qla2x00_abort_all_cmds(), which does not perform an abort in the adapter as is done in qla2xxx_eh_abort() for example. So, this patch changes qla2x00_abort_all_cmds() to abort commands in the adapter too, with a call to qla2xxx_eh_abort(), which already implements all the logic to submit abort requests and handle responses. Reported-by: Naresh Bannoth Signed-off-by: Mauricio Faria de Oliveira --- drivers/scsi/qla2xxx/qla_os.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index fdb135b..c50dd22 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1456,6 +1456,15 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { sp = req->outstanding_cmds[cnt]; if (sp) { + /* Get a reference to the sp and drop the lock. + * The reference ensures this sp->done() call + * - and not the call in qla2xxx_eh_abort() - + * ends the SCSI command (with result 'res'). + */ + sp_get(sp); + spin_unlock_irqrestore(&ha->hardware_lock, flags); + qla2xxx_eh_abort(GET_CMD_SP(sp)); + spin_lock_irqsave(&ha->hardware_lock, flags); req->outstanding_cmds[cnt] = NULL; sp->done(vha, sp, res); } -- 1.8.3.1