From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Suganath Prabu S, "Martin K . Petersen", Sasha Levin, MPT-FusionLinux.pdl@broadcom.com, linux-scsi@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 30/93] scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
Date: Fri, 9 Jul 2021 22:23:24 -0400
Message-Id: <20210710022428.3169839-30-sashal@kernel.org>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210710022428.3169839-1-sashal@kernel.org>
References: <20210710022428.3169839-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Suganath Prabu S

[ Upstream commit e2fac6c44ae06e58ac02181b048af31195883c31 ]

Do not cancel current running firmware event work if the event type is
different from MPT3SAS_REMOVE_UNRESPONDING_DEVICES. Otherwise a deadlock can
be observed while cancelling the current firmware event work if a hard reset
operation is called as part of processing the current event.

Link: https://lore.kernel.org/r/20210518051625.1596742-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5f845d7094fc..738b71653e9c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3526,6 +3526,28 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
 	ioc->fw_events_cleanup = 1;
 	while ((fw_event = dequeue_next_fw_event(ioc)) ||
 	     (fw_event = ioc->current_event)) {
+
+		/*
+		 * Don't call cancel_work_sync() for current_event
+		 * other than MPT3SAS_REMOVE_UNRESPONDING_DEVICES;
+		 * otherwise we may observe deadlock if current
+		 * hard reset issued as part of processing the current_event.
+		 *
+		 * Original logic of cleaning the current_event is added
+		 * for handling the back to back host reset issued by the user.
+		 * i.e. during back to back host reset, driver used to process
+		 * the two instances of MPT3SAS_REMOVE_UNRESPONDING_DEVICES
+		 * event back to back and this made the drives to unregister
+		 * the devices from SML.
+		 */
+
+		if (fw_event == ioc->current_event &&
+		    ioc->current_event->event !=
+		    MPT3SAS_REMOVE_UNRESPONDING_DEVICES) {
+			ioc->current_event = NULL;
+			continue;
+		}
+
 		/*
 		 * Wait on the fw_event to complete. If this returns 1, then
 		 * the event was never executed, and we need a put for the
-- 
2.30.2
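
The skip logic in the hunk above can be illustrated outside the kernel with a small user-space model. This is only a sketch under stated assumptions: `struct adapter`, `struct fw_event`, `enum event_type`, `dequeue_next()`, `cancel_sync()` and `cleanup_queue()` are hypothetical stand-ins invented for illustration, not the mpt3sas definitions, and `cancel_sync()` merely records the call instead of blocking the way cancel_work_sync() does. Clearing `current_event` after cancelling it is also a simplification of this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real driver types. */
enum event_type { EVT_OTHER, EVT_REMOVE_UNRESPONDING_DEVICES };

struct fw_event {
	enum event_type event;
	int cancelled;			/* set when cancel_sync() was called */
};

struct adapter {
	struct fw_event *queue[4];	/* pending (not yet running) events */
	size_t head, count;
	struct fw_event *current_event;	/* event being processed right now */
};

/* Pop the next pending event, or NULL when the queue is empty. */
static struct fw_event *dequeue_next(struct adapter *a)
{
	return a->head < a->count ? a->queue[a->head++] : NULL;
}

/* Models a synchronous cancel; the real call would wait for the work. */
static void cancel_sync(struct fw_event *e)
{
	e->cancelled = 1;
}

/* Returns how many events were cancelled synchronously. */
static int cleanup_queue(struct adapter *a)
{
	struct fw_event *e;
	int cancelled = 0;

	assert(a != NULL);
	while ((e = dequeue_next(a)) || (e = a->current_event)) {
		/*
		 * Never wait on the running event unless it is the
		 * remove-unresponding-devices event: if the running
		 * event itself triggered this cleanup, waiting for it
		 * would mean waiting for ourselves.
		 */
		if (e == a->current_event &&
		    e->event != EVT_REMOVE_UNRESPONDING_DEVICES) {
			a->current_event = NULL;
			continue;
		}
		cancel_sync(e);
		cancelled++;
		if (e == a->current_event)
			a->current_event = NULL;	/* model-only simplification */
	}
	return cancelled;
}
```

In this model, a running event of any other type is dropped from `current_event` without a synchronous cancel, while queued events and a running EVT_REMOVE_UNRESPONDING_DEVICES event are still cancelled.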