From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Ram Pai <linuxram@us.ibm.com>
Cc: andmike@us.ibm.com, mst@redhat.com, aik@ozlabs.ru,
linux-kernel@vger.kernel.org, leonardo@linux.ibm.com,
ram.n.pai@gmail.com, cai@lca.pw, tglx@linutronix.de,
sukadev@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org,
hch@lst.de, bauerman@linux.ibm.com, david@gibson.dropbear.id.au
Subject: Re: [PATCH v5 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VM.
Date: Thu, 12 Dec 2019 18:19:14 -0600 [thread overview]
Message-ID: <157619635433.3810.2635705421787117448@sif> (raw)
In-Reply-To: <20191212064502.GC5709@oc0525413822.ibm.com>
Quoting Ram Pai (2019-12-12 00:45:02)
> On Tue, Dec 10, 2019 at 07:43:24PM -0600, Michael Roth wrote:
> > Quoting Ram Pai (2019-12-06 19:12:39)
> > > Commit edea902c1c1e ("powerpc/pseries/iommu: Don't use dma_iommu_ops on
> > > secure guests")
> > > disabled dma_iommu_ops path, for secure VMs. Disabling dma_iommu_ops
> > > path for secure VMs, helped enable dma_direct path. This enabled
> > > support for bounce-buffering through SWIOTLB. However it fails to
> > > operate when IOMMU is enabled, since I/O pages are not TCE mapped.
> > >
> > > Renable dma_iommu_ops path for pseries Secure VMs. It handles all
> > > cases including, TCE mapping I/O pages, in the presence of a
> > > IOMMU.
> >
> > Wasn't clear to me at first, but I guess the main gist of this series is
> > that we want to continue to use SWIOTLB, but also need to create mappings
> > of it's bounce buffers in the IOMMU, so we revert to using dma_iommu_ops
> > and rely on the various dma_iommu_{map,alloc}_bypass() hooks throughout
> > to call into dma_direct_* ops rather than relying on the dma_is_direct(ops)
> > checks in DMA API functions to do the same.
> >
> > That makes sense, but one issue I see with that is that
> > dma_iommu_map_bypass() only tests true if all the following are true:
> >
> > 1) the device requests a 64-bit DMA mask via
> > dma_set_mask/dma_set_coherent_mask
> > 2) DDW is enabled (i.e. we don't pass disable_ddw on command-line)
> >
> > dma_is_direct() checks don't have this limitation, so I think for
> > anything cases, such as devices that use a smaller DMA mask, we'll
> > end up falling back to the non-bypass functions in dma_iommu_ops, which
> > will likely break for things like dma_alloc_coherent/dma_map_single
> > since they won't use SWIOTLB pages and won't do the necessary calls to
> > set_memory_unencrypted() to share those non-SWIOTLB buffers with
> > hypervisor.
> >
> > Maybe that's ok, but I think we should be clearer about how to
> > fail/handle these cases.
>
> Yes. makes sense. Device that cannot handle 64bit dma mask will not work.
>
> >
> > Though I also agree with some concerns Alexey stated earlier: it seems
> > wasteful to map the entire DDW window just so these bounce buffers can be
> > mapped. Especially if you consider the lack of a mapping to be an additional
> > safe-guard against things like buggy device implementations on the QEMU
> > side. E.g. if we leaked pages to the hypervisor on accident, those pages
> > wouldn't be immediately accessible to a device, and would still require
> > additional work get past the IOMMU.
>
> Well, an accidental unintented page leak to the hypervisor, is a very
> bad thing, regardless of any DMA mapping. The device may not be able to
> access it, but the hypervisor still can access it.
Agreed, but if IOMMU can provide additional isolation we should make use
of it when reasonable.
>
> >
> > What would it look like if we try to make all this work with disable_ddw passed
> > to kernel command-line (or forced for is_secure_guest())?
> >
> > 1) dma_iommu_{alloc,map}_bypass() would no longer get us to dma_direct_* ops,
> > but an additional case or hook that considers is_secure_guest() might do
> > it.
> >
> > 2) We'd also need to set up an IOMMU mapping for the bounce buffers via
> > io_tlb_start/io_tlb_end. We could do it once, on-demand via
> > dma_iommu_bypass_supported() like we do for the 64-bit DDW window, or
> > maybe in some init function.
>
> Hmm... i not sure how to accomplish (2). we need use some DDW window
> to setup the mappings. right? If disable_ddw is set, there wont be any
> ddw. What am i missing?
We have 2 windows, the default 32-bit window that normally covers DMA addresses
0..1GB, and an additional DDW window that's created on demand for 64-bit
devices (since they can address a full linear mapping of all guest
memory at DMA address 1<<59). Your current patch uses the latter, but we
could potentially use the 32-bit window since we only need to map the
SWIOTLB pages.
Not saying that's necessarily better, but one upside is it only requires
devices to support 32-bit DMA addressing, which is a larger class of
devices than those that support 64-bit (since 64-bit device drivers
generally have a 32-bit fallback). Required changes are a bit more
pervasive though.
It might make sense to get both scenarios working at some point, but
maybe it's okay to handle 64-bit first. 64-bit does give us more legroom
if we anticipate changes in where the SWIOTLB memory is allocated from,
or expect it to become larger than 1GB.
>
> >
> > That also has the benefit of not requiring devices to support 64-bit DMA.
> >
> > Alternatively, we could continue to rely on the 64-bit DDW window, but
> > modify call to enable_ddw() to only map the io_tlb_start/end range in
> > the case of is_secure_guest(). This is a little cleaner implementation-wise
> > since we can rely on the existing dma_iommu_{alloc,map}_bypass() hooks.
>
> I have been experimenting with this. Trying to map only the memory
> range from io_tlb_start/io_tlb_end though the 64-bit ddw window. But
> due to some reason, it wants the io_tlb_start to be aligned to some
> boundary. It looks like a 2^28 boundary. Not sure what dictates that
> boundary.
Not sure, but that might be related to 256MB LMB size. Could also be the page
size of the DDW window, but seems large for that. In any case I think it would
be okay if we needed to truncate io_tlb_start to a lower 256MB-boundary and the
subsequent range. We have a few more mappings than strictly necessary, but it's
still better than all guest memory.
>
>
> > , but
> > devices that don't support 64-bit will fail back to not using dma_direct_* ops
> > and fail miserably. We'd probably want to handle that more gracefully.
>
> Yes i will put a warning message to indicate the failure.
>
> >
> > Or we handle both cases gracefully. To me it makes more sense to enable
> > non-DDW case, then consider adding DDW case later if there's some reason
> > why 64-bit DMA is needed. But would be good to hear if there are other
> > opinions.
>
> educate me a bit here. What is a non-DDW case? is it possible for a
> device to acccess memory, in the presence of a IOMMU, without a window-mapping?
It's not indeed, but we have the default 32-bit window for the non-DDW
case. See above.
>
> >
> > >
> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > ---
> > > arch/powerpc/platforms/pseries/iommu.c | 11 +----------
> > > 1 file changed, 1 insertion(+), 10 deletions(-)
> > >
> > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > index 67b5009..4e27d66 100644
> > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > @@ -36,7 +36,6 @@
> > > #include <asm/udbg.h>
> > > #include <asm/mmzone.h>
> > > #include <asm/plpar_wrappers.h>
> > > -#include <asm/svm.h>
> > > #include <asm/ultravisor.h>
> > >
> > > #include "pseries.h"
> > > @@ -1346,15 +1345,7 @@ void iommu_init_early_pSeries(void)
> > > of_reconfig_notifier_register(&iommu_reconfig_nb);
> > > register_memory_notifier(&iommu_mem_nb);
> > >
> > > - /*
> > > - * Secure guest memory is inacessible to devices so regular DMA isn't
> > > - * possible.
> > > - *
> > > - * In that case keep devices' dma_map_ops as NULL so that the generic
> > > - * DMA code path will use SWIOTLB to bounce buffers for DMA.
> > > - */
> > > - if (!is_secure_guest())
> > > - set_pci_dma_ops(&dma_iommu_ops);
> > > + set_pci_dma_ops(&dma_iommu_ops);
> > > }
> > >
> > > static int __init disable_multitce(char *str)
> > > --
> > > 1.8.3.1
> > >
>
> --
> Ram Pai
next prev parent reply other threads:[~2019-12-13 0:24 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-07 1:12 [PATCH v5 0/2] Enable IOMMU support for pseries Secure VMs Ram Pai
2019-12-07 1:12 ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Ram Pai
2019-12-07 1:12 ` [PATCH v5 2/2] powerpc/pseries/iommu: Use dma_iommu_ops for Secure VM Ram Pai
2019-12-10 3:08 ` Alexey Kardashevskiy
2019-12-10 22:09 ` Thiago Jung Bauermann
2019-12-11 1:43 ` Michael Roth
2019-12-11 8:36 ` Alexey Kardashevskiy
2019-12-11 18:07 ` Michael Roth
2019-12-11 18:20 ` Christoph Hellwig
2019-12-12 6:45 ` Ram Pai
2019-12-13 0:19 ` Michael Roth [this message]
2019-12-10 3:07 ` [PATCH v5 1/2] powerpc/pseries/iommu: Share the per-cpu TCE page with the hypervisor Alexey Kardashevskiy
2019-12-10 5:12 ` Ram Pai
2019-12-10 5:32 ` Alexey Kardashevskiy
2019-12-10 15:35 ` Ram Pai
2019-12-11 8:15 ` Alexey Kardashevskiy
2019-12-11 20:31 ` Michael Roth
2019-12-11 22:47 ` Alexey Kardashevskiy
2019-12-12 2:39 ` Alexey Kardashevskiy
2019-12-13 0:22 ` Michael Roth
2019-12-12 4:11 ` Ram Pai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=157619635433.3810.2635705421787117448@sif \
--to=mdroth@linux.vnet.ibm.com \
--cc=aik@ozlabs.ru \
--cc=andmike@us.ibm.com \
--cc=bauerman@linux.ibm.com \
--cc=cai@lca.pw \
--cc=david@gibson.dropbear.id.au \
--cc=hch@lst.de \
--cc=leonardo@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=linuxram@us.ibm.com \
--cc=mst@redhat.com \
--cc=ram.n.pai@gmail.com \
--cc=sukadev@linux.vnet.ibm.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).