From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D99C4C433E1 for ; Mon, 17 Aug 2020 18:31:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A89F12063A for ; Mon, 17 Aug 2020 18:31:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1597689091; bh=obDiQ1N67GFkAj7InPl86bhjnnFnZMxhTP60Tjy/xc0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=lHruHhqIXlOmS3huJsFvkMYjFSjRtZkgPFg1fMQbzwjdYjyupT7yHf8Nqk1U5IRrH 46XdiwYVU9qQqnu013upUGhezc+7fpzEZpmOLh9DCEl/IpfNqGJEdmtueJNicJ9Icj Ffz+65BeOZuF4jahx7nHd3eeqHThDPg9lv+67u+Y= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389570AbgHQSba (ORCPT ); Mon, 17 Aug 2020 14:31:30 -0400 Received: from mail.kernel.org ([198.145.29.99]:48624 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387979AbgHQQBS (ORCPT ); Mon, 17 Aug 2020 12:01:18 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 11E9720829; Mon, 17 Aug 2020 16:01:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1597680072; bh=obDiQ1N67GFkAj7InPl86bhjnnFnZMxhTP60Tjy/xc0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pKz3KtkXzMxKhp9L6VZI66s6s+lLbHWlN83vdbEEAlTY+iJh8modbpU1sa982ItBM U4lkJpr6BywO6aoMtFfSvxbdcdQok417JB16FZF7j8zFWG3OXWJu7Ff7rDyuMeTbme pqMdw94hlPVrs8P/1LpDJ4AlzhvxLJVXM6zKN4oQ= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Sagi Grimberg , Christoph Hellwig , Sasha Levin Subject: [PATCH 5.4 042/270] nvme-tcp: fix controller reset hang during traffic Date: Mon, 17 Aug 2020 17:14:03 +0200 Message-Id: <20200817143757.899851628@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200817143755.807583758@linuxfoundation.org> References: <20200817143755.807583758@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Sagi Grimberg [ Upstream commit 2875b0aecabe2f081a8432e2bc85b85df0529490 ] commit fe35ec58f0d3 ("block: update hctx map when use multiple maps") exposed an issue where we may hang trying to wait for queue freeze during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple queue maps (which we have now for default/read/poll) is attempting to freeze the queue. However we never started queue freeze when starting the reset, which means that we have inflight pending requests that entered the queue that we will not complete once the queue is quiesced. So start a freeze before we quiesce the queue, and unfreeze the queue after we successfully connected the I/O queues (and make sure to call blk_mq_update_nr_hw_queues only after we are sure that the queue was already frozen). This follows to how the pci driver handles resets. Fixes: fe35ec58f0d3 ("block: update hctx map when use multiple maps") Signed-off-by: Sagi Grimberg Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin --- drivers/nvme/host/tcp.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 53e113a18a549..0166ff0e4738e 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1684,15 +1684,20 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new) ret = PTR_ERR(ctrl->connect_q); goto out_free_tag_set; } - } else { - blk_mq_update_nr_hw_queues(ctrl->tagset, - ctrl->queue_count - 1); } ret = nvme_tcp_start_io_queues(ctrl); if (ret) goto out_cleanup_connect_q; + if (!new) { + nvme_start_queues(ctrl); + nvme_wait_freeze(ctrl); + blk_mq_update_nr_hw_queues(ctrl->tagset, + ctrl->queue_count - 1); + nvme_unfreeze(ctrl); + } + return 0; out_cleanup_connect_q: @@ -1797,6 +1802,7 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, { if (ctrl->queue_count <= 1) return; + nvme_start_freeze(ctrl); nvme_stop_queues(ctrl); nvme_tcp_stop_io_queues(ctrl); if (ctrl->tagset) { -- 2.25.1