From mboxrd@z Thu Jan 1 00:00:00 1970
From: Claire Chang
Date: Fri, 8 Jan 2021 01:38:51 +0800
Subject: Re: [RFC PATCH v3 0/6] Restricted DMA
To: Florian Fainelli
Cc: heikki.krogerus@linux.intel.com, peterz@infradead.org,
 grant.likely@arm.com, paulus@samba.org, Frank Rowand, mingo@kernel.org,
 Marek Szyprowski, sstabellini@kernel.org, Saravana Kannan,
 "list@263.net:IOMMU DRIVERS , Joerg Roedel , " ,
 rafael.j.wysocki@intel.com, Christoph Hellwig, Bartosz Golaszewski,
 xen-devel@lists.xenproject.org, Thierry Reding, linux-devicetree,
 will@kernel.org, konrad.wilk@oracle.com, dan.j.williams@intel.com,
 linuxppc-dev@lists.ozlabs.org, Rob Herring, boris.ostrovsky@oracle.com,
 Andy Shevchenko, jgross@suse.com, Nicolas Boichat, Greg KH,
 rdunlap@infradead.org, lkml, Tomasz Figa,
 "list@263.net:IOMMU DRIVERS , Joerg Roedel , " ,
 Jim Quinlan, xypron.glpk@gmx.de, Robin Murphy, bauerman@linux.ibm.com
References: <20210106034124.30560-1-tientzu@chromium.org>
List-Id: Linux on PowerPC Developers Mail List
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000006a175905b852ea63"

--0000000000006a175905b852ea63
Content-Type: text/plain; charset="UTF-8"

On Thu, Jan 7, 2021 at 2:48 AM Florian Fainelli wrote:
>
> Hi,
>
> First of all let me say that I am glad that someone is working on an
> upstream solution for this issue, would appreciate if you could CC and
> Jim Quinlan on subsequent submissions.

Sure!

>
>
> On 1/5/21 7:41 PM, Claire Chang wrote:
> > This series implements mitigations for lack of DMA access control on
> > systems without an IOMMU, which could result in the DMA accessing the
> > system memory at unexpected times and/or unexpected addresses, possibly
> > leading to data leakage or corruption.
> >
> > For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> > not behind an IOMMU. As PCI-e, by design, gives the device full access to
> > system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> > to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> > full chain of exploits; [2], [3]).
> >
> > To mitigate the security concerns, we introduce restricted DMA. Restricted
> > DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> > specially allocated region and does memory allocation from the same region.
> > The feature on its own provides a basic level of protection against the DMA
> > overwriting buffer contents at unexpected times. However, to protect
> > against general data leakage and system memory corruption, the system needs
> > to provide a way to restrict the DMA to a predefined memory region (this is
> > usually done at firmware level, e.g. in ATF on some ARM platforms).
>
> Can you explain how ATF gets involved and to what extent it does help,
> besides enforcing a secure region from the ARM CPU's perspective? Does
> the PCIe root complex not have an IOMMU but can somehow be denied access
> to a region that is marked NS=0 in the ARM CPU's MMU? If so, that is
> still some sort of basic protection that the HW enforces, right?

We need ATF support for the memory protection unit (MPU).
Restricted DMA (with reserved-memory in dts) makes sure the predefined memory
region is for PCIe DMA only, but we still need the MPU to lock down PCIe
access to that specific region.

>
> On Broadcom STB SoCs we have had something similar for a while however
> and while we don't have an IOMMU for the PCIe bridge, we do have a
> basic protection mechanism whereby we can configure a region in DRAM to
> be PCIe read/write and CPU read/write which then gets used as the PCIe
> inbound region for the PCIe EP. By default the PCIe bridge is not
> allowed access to DRAM so we must call into a security agent to allow
> the PCIe bridge to access the designated DRAM region.
>
> We have done this using a private CMA area region assigned via Device
> Tree, assigned with a and requiring the PCIe EP driver to use
> dma_alloc_from_contiguous() in order to allocate from this device
> private CMA area. The only drawback with that approach is that it
> requires knowing how much memory you need up front for buffers and DMA
> descriptors that the PCIe EP will need to process. The problem is that
> it requires driver modifications and that does not scale over the number
> of PCIe EP drivers, some we absolutely do not control, but there is no
> need to bounce buffer. Your approach scales better across PCIe EP
> drivers however it does require bounce buffering which could be a
> performance hit.

Only the streaming DMA (map/unmap) needs bounce buffering.
I also added alloc/free support in this series
(https://lore.kernel.org/patchwork/patch/1360995/), so dma_direct_alloc() will
try to allocate memory from the predefined memory region.

As for the performance hit, it should be similar to the default swiotlb.
Here are my experiment results. Both SoCs lack an IOMMU for PCIe.

PCIe wifi vht80 throughput -

  MTK SoC               tcp_tx   tcp_rx   udp_tx   udp_rx
  w/o Restricted DMA    244.1    134.66   312.56   350.79
  w/ Restricted DMA     246.95   136.59   363.21   351.99

  Rockchip SoC          tcp_tx   tcp_rx   udp_tx   udp_rx
  w/o Restricted DMA    237.87   133.86   288.28   361.88
  w/ Restricted DMA     256.01   130.95   292.28   353.19

The CPU usage doesn't increase too much either.
Although I didn't measure the CPU usage very precisely, it's ~3% with a single
big core (Cortex-A72) and ~5% with a single small core (Cortex-A53).

Thanks!

>
> Thanks!
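
A rough way to read the throughput tables above is to turn them into relative
changes. The short Python sketch below does only that: the figures are copied
from the tables in this mail, while the names `results` and
`relative_change_pct` are made up for illustration and are not part of the
series. It says nothing about CPU usage, which was not measured precisely.

```python
# Throughput figures copied from the tables in this mail
# (same units as reported there; the mail does not state them).
results = {
    "MTK": {
        "without": {"tcp_tx": 244.1, "tcp_rx": 134.66, "udp_tx": 312.56, "udp_rx": 350.79},
        "with": {"tcp_tx": 246.95, "tcp_rx": 136.59, "udp_tx": 363.21, "udp_rx": 351.99},
    },
    "Rockchip": {
        "without": {"tcp_tx": 237.87, "tcp_rx": 133.86, "udp_tx": 288.28, "udp_rx": 361.88},
        "with": {"tcp_tx": 256.01, "tcp_rx": 130.95, "udp_tx": 292.28, "udp_rx": 353.19},
    },
}


def relative_change_pct(without, with_):
    """Percent change of each metric when restricted DMA is enabled."""
    return {
        metric: (with_[metric] - without[metric]) / without[metric] * 100.0
        for metric in without
    }


# Hypothetical helper output: one dict of percent changes per SoC.
changes = {
    soc: relative_change_pct(runs["without"], runs["with"])
    for soc, runs in results.items()
}

for soc, per_metric in changes.items():
    for metric, pct in per_metric.items():
        print(f"{soc:9s} {metric}: {pct:+.1f}%")
```

On these numbers the largest drop is under 3% (Rockchip tcp_rx and udp_rx),
and several cases come out higher with restricted DMA, which may well be
run-to-run variation rather than a real gain.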
--0000000000006a175905b852ea63--