From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 681D1C48BE0 for ; Fri, 11 Jun 2021 12:49:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CD9C613B8 for ; Fri, 11 Jun 2021 12:49:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231230AbhFKMvV (ORCPT ); Fri, 11 Jun 2021 08:51:21 -0400 Received: from foss.arm.com ([217.140.110.172]:57510 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230233AbhFKMvR (ORCPT ); Fri, 11 Jun 2021 08:51:17 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 35FF5D6E; Fri, 11 Jun 2021 05:49:19 -0700 (PDT) Received: from [10.57.6.115] (unknown [10.57.6.115]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5EEB23F73D; Fri, 11 Jun 2021 05:49:18 -0700 (PDT) Subject: Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation To: Will Deacon , Ashish Mhetre Cc: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org References: <1623298614-31755-1-git-send-email-amhetre@nvidia.com> <1623298614-31755-2-git-send-email-amhetre@nvidia.com> <20210611104524.GD15274@willie-the-truck> From: Robin Murphy Message-ID: Date: Fri, 11 Jun 2021 13:49:13 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <20210611104524.GD15274@willie-the-truck> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021-06-11 11:45, Will Deacon wrote: > On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote: >> Domain is getting created more than once during asynchronous multiple >> display heads(devices) probe. All the display heads share same SID and >> are expected to be in same domain. As iommu_alloc_default_domain() call >> is not protected, the group->default_domain and group->domain are ending >> up with different domains and leading to subsequent IOMMU faults. >> Fix this by protecting iommu_alloc_default_domain() call with group->mutex. > > Can you provide some more information about exactly what the h/w > configuration is, and the callstack which exhibits the race, please? It'll be basically the same as the issue reported long ago with PCI groups in the absence of ACS not being constructed correctly. Triggering the iommu_probe_device() replay in of_iommu_configure() off the back of driver probe is way too late and allows calls to happen in the wrong order, or indeed race in parallel as here. Fixing that is still on my radar, but will not be simple, and will probably go hand-in-hand with phasing out the bus ops (for the multiple-driver-coexistence problem). >> Signed-off-by: Ashish Mhetre >> --- >> drivers/iommu/iommu.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >> index 808ab70..2700500 100644 >> --- a/drivers/iommu/iommu.c >> +++ b/drivers/iommu/iommu.c >> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) >> * support default domains, so the return value is not yet >> * checked. >> */ >> + mutex_lock(&group->mutex); >> iommu_alloc_default_domain(group, dev); >> + mutex_unlock(&group->mutex); > > It feels wrong to serialise this for everybody just to cater for systems > with aliasing SIDs between devices. If two or more devices are racing at this point then they're already going to be serialised by at least iommu_group_add_device(), so I doubt there would be much impact - only the first device through here will hold the mutex for any appreciable length of time. Every other path which modifies group->domain does so with the mutex held (note the "expected" default domain allocation flow in bus_iommu_probe() in particular), so not holding it here does seem like a straightforward oversight. Robin. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_RED, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71F87C48BE5 for ; Fri, 11 Jun 2021 12:49:26 +0000 (UTC) Received: from smtp2.osuosl.org (smtp2.osuosl.org [140.211.166.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 21034613B8 for ; Fri, 11 Jun 2021 12:49:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 21034613B8 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=iommu-bounces@lists.linux-foundation.org Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id E18BC404C1; Fri, 11 Jun 2021 12:49:25 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id GRSNKKqNw85C; Fri, 11 Jun 2021 12:49:25 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp2.osuosl.org (Postfix) with ESMTPS id E19B4404BA; Fri, 11 Jun 2021 12:49:24 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id A0E96C000D; Fri, 11 Jun 2021 12:49:24 +0000 (UTC) Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id F3053C000B for ; Fri, 11 Jun 2021 12:49:22 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id E18B083D4A for ; Fri, 11 Jun 2021 12:49:22 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tvC8tzVv43fB for ; Fri, 11 Jun 2021 12:49:20 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp1.osuosl.org (Postfix) with ESMTP id 37B9A83D12 for ; Fri, 11 Jun 2021 12:49:20 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 35FF5D6E; Fri, 11 Jun 2021 05:49:19 -0700 (PDT) Received: from [10.57.6.115] (unknown [10.57.6.115]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5EEB23F73D; Fri, 11 Jun 2021 05:49:18 -0700 (PDT) Subject: Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation To: Will Deacon , Ashish Mhetre References: <1623298614-31755-1-git-send-email-amhetre@nvidia.com> <1623298614-31755-2-git-send-email-amhetre@nvidia.com> <20210611104524.GD15274@willie-the-truck> From: Robin Murphy Message-ID: Date: Fri, 11 Jun 2021 13:49:13 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <20210611104524.GD15274@willie-the-truck> Content-Language: en-GB Cc: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org X-BeenThere: iommu@lists.linux-foundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Development issues for Linux IOMMU support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Errors-To: iommu-bounces@lists.linux-foundation.org Sender: "iommu" On 2021-06-11 11:45, Will Deacon wrote: > On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote: >> Domain is getting created more than once during asynchronous multiple >> display heads(devices) probe. All the display heads share same SID and >> are expected to be in same domain. As iommu_alloc_default_domain() call >> is not protected, the group->default_domain and group->domain are ending >> up with different domains and leading to subsequent IOMMU faults. >> Fix this by protecting iommu_alloc_default_domain() call with group->mutex. > > Can you provide some more information about exactly what the h/w > configuration is, and the callstack which exhibits the race, please? It'll be basically the same as the issue reported long ago with PCI groups in the absence of ACS not being constructed correctly. Triggering the iommu_probe_device() replay in of_iommu_configure() off the back of driver probe is way too late and allows calls to happen in the wrong order, or indeed race in parallel as here. Fixing that is still on my radar, but will not be simple, and will probably go hand-in-hand with phasing out the bus ops (for the multiple-driver-coexistence problem). >> Signed-off-by: Ashish Mhetre >> --- >> drivers/iommu/iommu.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >> index 808ab70..2700500 100644 >> --- a/drivers/iommu/iommu.c >> +++ b/drivers/iommu/iommu.c >> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) >> * support default domains, so the return value is not yet >> * checked. >> */ >> + mutex_lock(&group->mutex); >> iommu_alloc_default_domain(group, dev); >> + mutex_unlock(&group->mutex); > > It feels wrong to serialise this for everybody just to cater for systems > with aliasing SIDs between devices. If two or more devices are racing at this point then they're already going to be serialised by at least iommu_group_add_device(), so I doubt there would be much impact - only the first device through here will hold the mutex for any appreciable length of time. Every other path which modifies group->domain does so with the mutex held (note the "expected" default domain allocation flow in bus_iommu_probe() in particular), so not holding it here does seem like a straightforward oversight. Robin. _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9490EC48BE0 for ; Fri, 11 Jun 2021 12:51:08 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5C37D61249 for ; Fri, 11 Jun 2021 12:51:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5C37D61249 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:Content-Type: Content-Transfer-Encoding:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:Date:Message-ID:From: References:Cc:To:Subject:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=1+w2KLyjx+BWymvvnL3ha9wEh6oezuhPTE3a9lSqWac=; b=WXO4Sa8hf11YqsDzmWnRR77lCe Np2YxTMLSj+oFB3RcUI20HebUlwMU8kmTY75q5IxPpaK3taUiAXILCYG3GP+JRSgSfTHmxDC89upj hZtVVgIBQDrMRWb1tFzRS42v8rGq/7qF8K591Ahij0ZaUHZqnhoYN/DnQ69ozKJp3tduAA+mPk+SQ jxbVpfk8Mv2Gir4SIqFDWlouClI8UIZkEd6Dm0O4h6KSh0MQTJBWOIK1/SPJoOAiQeiokPsETMRQ9 IJSq0ljlT12GoA2HKGGr7Z0s8lMPc3JGeORh5bhArvk4/7WLI9kxmm9zoOipCGpOlp35mnxzxQnEa 7IN20tng==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1lrgbF-005JOj-HQ; Fri, 11 Jun 2021 12:49:25 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1lrgbB-005JOD-Fx for linux-arm-kernel@lists.infradead.org; Fri, 11 Jun 2021 12:49:22 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 35FF5D6E; Fri, 11 Jun 2021 05:49:19 -0700 (PDT) Received: from [10.57.6.115] (unknown [10.57.6.115]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5EEB23F73D; Fri, 11 Jun 2021 05:49:18 -0700 (PDT) Subject: Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation To: Will Deacon , Ashish Mhetre Cc: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org References: <1623298614-31755-1-git-send-email-amhetre@nvidia.com> <1623298614-31755-2-git-send-email-amhetre@nvidia.com> <20210611104524.GD15274@willie-the-truck> From: Robin Murphy Message-ID: Date: Fri, 11 Jun 2021 13:49:13 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <20210611104524.GD15274@willie-the-truck> Content-Language: en-GB X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210611_054921_621190_A056DBB2 X-CRM114-Status: GOOD ( 20.58 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 2021-06-11 11:45, Will Deacon wrote: > On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote: >> Domain is getting created more than once during asynchronous multiple >> display heads(devices) probe. All the display heads share same SID and >> are expected to be in same domain. As iommu_alloc_default_domain() call >> is not protected, the group->default_domain and group->domain are ending >> up with different domains and leading to subsequent IOMMU faults. >> Fix this by protecting iommu_alloc_default_domain() call with group->mutex. > > Can you provide some more information about exactly what the h/w > configuration is, and the callstack which exhibits the race, please? It'll be basically the same as the issue reported long ago with PCI groups in the absence of ACS not being constructed correctly. Triggering the iommu_probe_device() replay in of_iommu_configure() off the back of driver probe is way too late and allows calls to happen in the wrong order, or indeed race in parallel as here. Fixing that is still on my radar, but will not be simple, and will probably go hand-in-hand with phasing out the bus ops (for the multiple-driver-coexistence problem). >> Signed-off-by: Ashish Mhetre >> --- >> drivers/iommu/iommu.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >> index 808ab70..2700500 100644 >> --- a/drivers/iommu/iommu.c >> +++ b/drivers/iommu/iommu.c >> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev) >> * support default domains, so the return value is not yet >> * checked. >> */ >> + mutex_lock(&group->mutex); >> iommu_alloc_default_domain(group, dev); >> + mutex_unlock(&group->mutex); > > It feels wrong to serialise this for everybody just to cater for systems > with aliasing SIDs between devices. If two or more devices are racing at this point then they're already going to be serialised by at least iommu_group_add_device(), so I doubt there would be much impact - only the first device through here will hold the mutex for any appreciable length of time. Every other path which modifies group->domain does so with the mutex held (note the "expected" default domain allocation flow in bus_iommu_probe() in particular), so not holding it here does seem like a straightforward oversight. Robin. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel