From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1479C433EF for ; Fri, 10 Dec 2021 17:55:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244856AbhLJR6k (ORCPT ); Fri, 10 Dec 2021 12:58:40 -0500 Received: from foss.arm.com ([217.140.110.172]:45312 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241538AbhLJR6i (ORCPT ); Fri, 10 Dec 2021 12:58:38 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3916613A1; Fri, 10 Dec 2021 09:55:03 -0800 (PST) Received: from e121345-lin.cambridge.arm.com (e121345-lin.cambridge.arm.com [10.1.196.40]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DE04F3F73B; Fri, 10 Dec 2021 09:55:01 -0800 (PST) From: Robin Murphy To: joro@8bytes.org, will@kernel.org Cc: iommu@lists.linux-foundation.org, suravee.suthikulpanit@amd.com, baolu.lu@linux.intel.com, willy@infradead.org, linux-kernel@vger.kernel.org, john.garry@huawei.com, linux-mm@kvack.org, Xiongfeng Wang Subject: [PATCH v2 01/11] iommu/iova: Fix race between FQ timeout and teardown Date: Fri, 10 Dec 2021 17:54:42 +0000 Message-Id: X-Mailer: git-send-email 2.28.0.dirty In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiongfeng Wang It turns out to be possible for hotplugging out a device to reach the stage of tearing down the device's group and default domain before the domain's flush queue has drained naturally. At this point, it is then possible for the timeout to expire just *before* the del_timer() call from free_iova_flush_queue(), such that we then proceed to free the FQ resources while fq_flush_timeout() is still accessing them on another CPU. Crashes due to this have been observed in the wild while removing NVMe devices. Close the race window by using del_timer_sync() to safely wait for any active timeout handler to finish before we start to free things. We already avoid any locking in free_iova_flush_queue() since the FQ is supposed to be inactive anyway, so the potential deadlock scenario does not apply. Fixes: 9a005a800ae8 ("iommu/iova: Add flush timer") Signed-off-by: Xiongfeng Wang [ rm: rewrite commit message ] Signed-off-by: Robin Murphy --- v2: New to this series, picked up from: https://lore.kernel.org/linux-iommu/1564219269-14346-1-git-send-email-wangxiongfeng2@huawei.com/ drivers/iommu/iova.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 9e8bc802ac05..920fcc27c9a1 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -83,8 +83,7 @@ static void free_iova_flush_queue(struct iova_domain *iovad) if (!has_iova_flush_queue(iovad)) return; - if (timer_pending(&iovad->fq_timer)) - del_timer(&iovad->fq_timer); + del_timer_sync(&iovad->fq_timer); fq_destroy_all_entries(iovad); -- 2.28.0.dirty From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3423C4332F for ; Fri, 10 Dec 2021 17:55:08 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 5EA9942393; Fri, 10 Dec 2021 17:55:08 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uvDIYZqqGjwE; Fri, 10 Dec 2021 17:55:07 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp4.osuosl.org (Postfix) with ESMTPS id 58A074238D; Fri, 10 Dec 2021 17:55:07 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2DE1EC002F; Fri, 10 Dec 2021 17:55:07 +0000 (UTC) Received: from smtp2.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 45672C0012 for ; Fri, 10 Dec 2021 17:55:05 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 23C7840385 for ; Fri, 10 Dec 2021 17:55:05 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id adjtQagvwpXU for ; Fri, 10 Dec 2021 17:55:04 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp2.osuosl.org (Postfix) with ESMTP id 0CAFB400E8 for ; Fri, 10 Dec 2021 17:55:03 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3916613A1; Fri, 10 Dec 2021 09:55:03 -0800 (PST) Received: from e121345-lin.cambridge.arm.com (e121345-lin.cambridge.arm.com [10.1.196.40]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DE04F3F73B; Fri, 10 Dec 2021 09:55:01 -0800 (PST) From: Robin Murphy To: joro@8bytes.org, will@kernel.org Subject: [PATCH v2 01/11] iommu/iova: Fix race between FQ timeout and teardown Date: Fri, 10 Dec 2021 17:54:42 +0000 Message-Id: X-Mailer: git-send-email 2.28.0.dirty In-Reply-To: References: MIME-Version: 1.0 Cc: linux-kernel@vger.kernel.org, willy@infradead.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, Xiongfeng Wang X-BeenThere: iommu@lists.linux-foundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Development issues for Linux IOMMU support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: iommu-bounces@lists.linux-foundation.org Sender: "iommu" From: Xiongfeng Wang It turns out to be possible for hotplugging out a device to reach the stage of tearing down the device's group and default domain before the domain's flush queue has drained naturally. At this point, it is then possible for the timeout to expire just *before* the del_timer() call from free_iova_flush_queue(), such that we then proceed to free the FQ resources while fq_flush_timeout() is still accessing them on another CPU. Crashes due to this have been observed in the wild while removing NVMe devices. Close the race window by using del_timer_sync() to safely wait for any active timeout handler to finish before we start to free things. We already avoid any locking in free_iova_flush_queue() since the FQ is supposed to be inactive anyway, so the potential deadlock scenario does not apply. Fixes: 9a005a800ae8 ("iommu/iova: Add flush timer") Signed-off-by: Xiongfeng Wang [ rm: rewrite commit message ] Signed-off-by: Robin Murphy --- v2: New to this series, picked up from: https://lore.kernel.org/linux-iommu/1564219269-14346-1-git-send-email-wangxiongfeng2@huawei.com/ drivers/iommu/iova.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index 9e8bc802ac05..920fcc27c9a1 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -83,8 +83,7 @@ static void free_iova_flush_queue(struct iova_domain *iovad) if (!has_iova_flush_queue(iovad)) return; - if (timer_pending(&iovad->fq_timer)) - del_timer(&iovad->fq_timer); + del_timer_sync(&iovad->fq_timer); fq_destroy_all_entries(iovad); -- 2.28.0.dirty _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu