From: Daniel Wagner
To: linux-nvme@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, James Smart, Keith Busch, Ming Lei, Sagi Grimberg, Hannes Reinecke, Wen Xiong, James Smart, Daniel Wagner
Subject: [PATCH v4 6/8] nvme-fc: fix controller reset hang during traffic
Date: Mon, 2 Aug 2021 13:26:56 +0200
Message-Id: <20210802112658.75875-7-dwagner@suse.de>
In-Reply-To: <20210802112658.75875-1-dwagner@suse.de>
References: <20210802112658.75875-1-dwagner@suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

From: James Smart

commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
exposed an issue where we may hang trying to wait for queue freeze
during I/O. We call blk_mq_update_nr_hw_queues(), which may attempt to
freeze the queue. However, we never started a queue freeze when starting
the reset, which means that we have in-flight pending requests that
entered the queue and that we will not complete once the queue is
quiesced.

So start a freeze before we quiesce the queue, and unfreeze the queue
after we have successfully connected the I/O queues (the unfreeze is
already present in the code). blk_mq_update_nr_hw_queues() will be
called only after we are sure that the queue was already frozen.

This follows how the PCI driver handles resets.
This patch adds the logic introduced in commit 9f98772ba307
("nvme-rdma: fix controller reset hang during traffic") to nvme-fc.

Signed-off-by: James Smart
CC: Sagi Grimberg
[dwagner: call nvme_unfreeze() unconditionally in
          nvme_fc_recreate_io_queues() to match the nvme_start_freeze()]
Tested-by: Daniel Wagner
Reviewed-by: Daniel Wagner
---
 drivers/nvme/host/fc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 133b87db4f1d..b292af0fd655 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2486,6 +2486,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 	 * (but with error status).
 	 */
 	if (ctrl->ctrl.queue_count > 1) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_stop_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		blk_mq_tagset_busy_iter(&ctrl->tag_set,
@@ -2966,8 +2967,8 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
 			return -ENODEV;
 		}
 		blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
-		nvme_unfreeze(&ctrl->ctrl);
 	}
 
+	nvme_unfreeze(&ctrl->ctrl);
 	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
 	if (ret)
-- 
2.29.2
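
For readers who want to see why the ordering matters, here is a small userspace sketch of the reasoning in the commit message. It is a toy model under stated assumptions, not kernel code: every `toy_*` name is hypothetical, the queue is reduced to three fields, and the only behaviour modelled is that a freeze refuses new requests at entry and that a quiesced queue never completes requests that have already entered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
struct toy_queue {
	int freeze_depth;	/* > 0: new requests may not enter */
	int entered;		/* requests that entered and have not completed */
	bool quiesced;		/* true: entered requests are not dispatched */
};

/* A submitter tries to enter the queue; refused while a freeze is active. */
static bool toy_enter(struct toy_queue *q)
{
	if (q->freeze_depth > 0)
		return false;
	q->entered++;
	return true;
}

static void toy_start_freeze(struct toy_queue *q)
{
	q->freeze_depth++;
}

/* Every unfreeze must pair with an earlier start_freeze. */
static void toy_unfreeze(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Waiting for a freeze never finishes if stuck requests cannot complete. */
static bool toy_freeze_wait_hangs(const struct toy_queue *q)
{
	return q->quiesced && q->entered > 0;
}

/* Returns true if the modelled reset would hang while updating the queues. */
bool toy_reset(bool freeze_first)
{
	struct toy_queue q = { 0 };
	bool hangs;

	if (freeze_first)
		toy_start_freeze(&q);	/* the ordering this patch introduces */
	q.quiesced = true;		/* stop dispatching */
	q.entered = 0;			/* outstanding I/O is aborted */

	toy_enter(&q);			/* I/O submitted while the reset runs */

	/* Stand-in for the freeze wait inside the hw queue update. */
	hangs = toy_freeze_wait_hangs(&q);

	if (freeze_first)
		toy_unfreeze(&q);	/* unconditional, pairs with the freeze */
	return hangs;
}
```

In this model `toy_reset(false)` reports a hang, because a request slips into the quiesced queue and can never complete, while `toy_reset(true)` does not, because the early freeze turns that request away at entry. The real driver has more states than this, so treat it only as an illustration of the argument.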