From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77F39EB64DC for ; Tue, 11 Jul 2023 09:41:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229612AbjGKJlr (ORCPT ); Tue, 11 Jul 2023 05:41:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60416 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229468AbjGKJlq (ORCPT ); Tue, 11 Jul 2023 05:41:46 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 88D2E10DF for ; Tue, 11 Jul 2023 02:41:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1689068461; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XsHk3s5WKMPNtRSOR2hho7ctM2lgfEHLxGIuRz1PzpE=; b=NYK/7FSooZ0d7DTHOl06DdOMattU6fVdZkh9ZY2r9GXU9h1KB3NiQCBL79KMDxMpX9XhvO t2wR1Hx96PAUBKZv4Zvr92UH1R4xBfMu7M1tI5V7mDbNs+s2SrL/Auz3Hz1Q+4APJ5FP6E zZ+uDXyRQy8Mlmh84i1tUIrwPFRURrE= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-52-xcXtflm3M6Cx6BYMoIu5Bg-1; Tue, 11 Jul 2023 05:40:58 -0400 X-MC-Unique: xcXtflm3M6Cx6BYMoIu5Bg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A24571C4FDAC; Tue, 11 Jul 2023 09:40:56 +0000 (UTC) Received: from localhost (ovpn-8-22.pek2.redhat.com [10.72.8.22]) by smtp.corp.redhat.com (Postfix) with ESMTP id C3F31207B356; Tue, 11 Jul 2023 09:40:55 +0000 (UTC) From: Ming Lei To: Christoph Hellwig , Sagi Grimberg , Keith Busch , linux-nvme@lists.infradead.org Cc: Yi Zhang , Chunguang Xu , Ming Lei , stable@vger.kernel.org Subject: [PATCH V2 1/3] nvme: fix possible hang when removing a controller during error recovery Date: Tue, 11 Jul 2023 17:40:39 +0800 Message-Id: <20230711094041.1819102-2-ming.lei@redhat.com> In-Reply-To: <20230711094041.1819102-1-ming.lei@redhat.com> References: <20230711094041.1819102-1-ming.lei@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org Error recovery can be interrupted by controller removal, then the controller is left as quiesced, and IO hang can be caused. Fix the issue by unquiescing controller unconditionally when removing namespaces. This way is reasonable and safe given forward progress can be made when removing namespaces. Reviewed-by: Keith Busch Reviewed-by: Sagi Grimberg Reported-by: Chunguang Xu Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/ Cc: stable@vger.kernel.org Signed-off-by: Ming Lei --- drivers/nvme/host/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 47d7ba2827ff..98fa8315bc65 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3903,6 +3903,12 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl) */ nvme_mpath_clear_ctrl_paths(ctrl); + /* + * Unquiesce io queues so any pending IO won't hang, especially + * those submitted from scan work + */ + nvme_unquiesce_io_queues(ctrl); + /* prevent racing with ns scanning */ flush_work(&ctrl->scan_work); @@ -3912,10 +3918,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl) * removing the namespaces' disks; fail all the queues now to avoid * potentially having to clean up the failed sync later. */ - if (ctrl->state == NVME_CTRL_DEAD) { + if (ctrl->state == NVME_CTRL_DEAD) nvme_mark_namespaces_dead(ctrl); - nvme_unquiesce_io_queues(ctrl); - } /* this is a no-op when called from the controller reset handler */ nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO); -- 2.40.1