linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Jason Wang <jasowang@redhat.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Robin Murphy <robin.murphy@arm.com>,
	linux-s390@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	Janosch Frank <frankja@linux.ibm.com>,
	Viktor Mihajlovski <mihajlov@linux.ibm.com>,
	Ram Pai <linuxram@us.ibm.com>,
	Thiago Jung Bauermann <bauerman@linux.ibm.com>,
	"Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	Michael Mueller <mimu@linux.ibm.com>
Subject: Re: [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h
Date: Tue, 25 Feb 2020 19:08:02 +0100	[thread overview]
Message-ID: <20200225190802.753cffef.cohuck@redhat.com> (raw)
In-Reply-To: <20200224194953.37c0d6b8.pasic@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 8297 bytes --]

On Mon, 24 Feb 2020 19:49:53 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Mon, 24 Feb 2020 14:33:14 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, Feb 21, 2020 at 07:07:02PM +0100, Halil Pasic wrote:  
> > > On Fri, 21 Feb 2020 10:48:15 -0500
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >   
> > > > On Fri, Feb 21, 2020 at 02:06:39PM +0100, Halil Pasic wrote:  
> > > > > On Fri, 21 Feb 2020 14:27:27 +1100
> > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > >   
> > > > > > On Thu, Feb 20, 2020 at 05:31:35PM +0100, Christoph Hellwig wrote:  
> > > > > > > On Thu, Feb 20, 2020 at 05:23:20PM +0100, Christian Borntraeger wrote:  
> > > > > > > > >From a users perspective it makes absolutely perfect sense to use the  
> > > > > > > > bounce buffers when they are NEEDED. 
> > > > > > > > Forcing the user to specify iommu_platform just because you need bounce buffers
> > > > > > > > really feels wrong. And obviously we have a severe performance issue
> > > > > > > > because of the indirections.  
> > > > > > > 
> > > > > > > The point is that the user should not have to specify iommu_platform.
> > > > > > > We need to make sure any new hypervisor (especially one that might require
> > > > > > > bounce buffering) always sets it,  
> > > > > > 
> > > > > > So, I have draft qemu patches which enable iommu_platform by default.
> > > > > > But that's really because of other problems with !iommu_platform, not
> > > > > > anything to do with bounce buffering or secure VMs.
> > > > > > 
> > > > > > The thing is that the hypervisor *doesn't* require bounce buffering.
> > > > > > In the POWER (and maybe s390 as well) models for Secure VMs, it's the
> > > > > > *guest*'s choice to enter secure mode, so the hypervisor has no reason
> > > > > > to know whether the guest needs bounce buffering.  As far as the
> > > > > > hypervisor and qemu are concerned that's a guest internal detail, it
> > > > > > just expects to get addresses it can access whether those are GPAs
> > > > > > (iommu_platform=off) or IOVAs (iommu_platform=on).  
> > > > > 
> > > > > I very much agree!
> > > > >   
> > > > > >   
> > > > > > > as was a rather bogus legacy hack  
> > > > > > 
> > > > > > It was certainly a bad idea, but it was a bad idea that went into a
> > > > > > public spec and has been widely deployed for many years.  We can't
> > > > > > just pretend it didn't happen and move on.
> > > > > > 
> > > > > > Turning iommu_platform=on by default breaks old guests, some of which
> > > > > > we still care about.  We can't (automatically) do it only for guests
> > > > > > that need bounce buffering, because the hypervisor doesn't know that
> > > > > > ahead of time.  

We could default to iommu_platform=on on s390 when the host has active
support for protected virtualization... but that's just another kind of
horrible, so let's just pretend I didn't suggest it.

> > > > > 
> > > > > Turning iommu_platform=on for virtio-ccw makes no sense whatsover,
> > > > > because for CCW I/O there is no such thing as IOMMU and the addresses
> > > > > are always physical addresses.  
> > > > 
> > > > Fix the name then. The spec calls is ACCESS_PLATFORM now, which
> > > > makes much more sense.  
> > > 
> > > I don't quite get it. Sorry. Maybe I will revisit this later.  
> > 
> > Halil, I think I can clarify this.
> > 
> > The "iommu_platform" flag doesn't necessarily have anything to do with
> > an iommu, although it often will.  Basically it means "access guest
> > memory via the bus's normal DMA mechanism" rather than "access guest
> > memory using GPA, because you're the hypervisor and you can do that".
> >   
> 
> Unfortunately, I don't think this is what is conveyed to the end users.
> Let's see what do we have documented:
> 
> Neither Qemu user documentation
> (https://www.qemu.org/docs/master/qemu-doc.html) nor online help:
> qemu-system-s390x -device virtio-net-ccw,?|grep iommu
>   iommu_platform=<bool>  - on/off (default: false)
> has any documentation on it.

Now, that's 'helpful' -- this certainly calls out for a bit of doc...

> 
> But libvirt does have have documenttion on the knob that contros
> iommu_platform for QEMU (when  QEMU is managed by libvirt):
> """
> Virtio-related options 
> 
> QEMU's virtio devices have some attributes related to the virtio
> transport under the driver element: The iommu attribute enables the use
> of emulated IOMMU by the device. The attribute ats controls the Address
> Translation Service support for PCIe devices. This is needed to make use
> of IOTLB support (see IOMMU device). Possible values are on or off.
> Since 3.5.0 
> """
> (https://libvirt.org/formatdomain.html#elementsVirtio)
> 
> Thus it seems the only available documentation says that it "enables the use
> of emulated IOMMU by the device".
> 
> And for vhost-user we have
> """
> When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has not been negotiated:
> 
> * Guest addresses map to the vhost memory region containing that guest
>   address.
> 
> When the ``VIRTIO_F_IOMMU_PLATFORM`` feature has been negotiated:
> 
> * Guest addresses are also called I/O virtual addresses (IOVAs).  They are
>   translated to user addresses via the IOTLB.
> """
> (docs/interop/vhost-user.rst)
> 
> > For the case of ccw, both mechanisms end up being the same thing,
> > since CCW's normal DMA *is* untranslated GPA access.
> >   
> 
> Nod.
> 
> > For this reason, the flag in the spec was renamed to ACCESS_PLATFORM,
> > but the flag in qemu still has the old name.
> >   
> 
> Yes, the name in the spec is more neutral.
> 
> > AIUI, Michael is saying you could trivially change the name in qemu
> > (obviously you'd need to alias the old name to the new one for
> > compatibility).
> >   
> 
> I could, and the I could also ask the libvirt guys to change <driver
> iommu='X'> to <driver access_platform='X'> or similar and to change  
> their documentation to something that is harder to comprehend. Although
> I'm not sure they would like the idea.

Hopefully, the documentation can be changed to something that is _not_
harder to comprehend :) (with a bit more text, I suppose.) Renaming to
something like access_platform seems like a good idea, even with the
required compat dance.

> 
> > 
> > Actually, the fact that ccw has no translation makes things easier for
> > you: you don't really have any impediment to turning ACCESS_PLATFORM
> > on by default, since it doesn't make any real change to how things
> > work.  
> 
> Yeah, it should not, in theory, but currently it does in practice.
> Currently vhost will try to do the IOTLB dance (Jason has a patch that
> should help with that), and we get the 'use dma api' side effects in the
> guest (e.g. virtqueue's data go <2G + some overhead).

Nod.

> 
> > 
> > The remaining difficulty is that the virtio driver - since it can sit
> > on multiple buses - won't know this, and will reject the
> > ACCESS_PLATFORM flag, even though it could just do what it normally
> > does on ccw and it would work.  
> 
> Right ACCESS_PLATFORM is a funny feature where the device refuses to
> work if the driver does not ack.
> 
> > 
> > For that case, we could consider a hack in qemu where for virtio-ccw
> > devices *only* we allow the guest to nack the ACCESS_PLATFORM flag and
> > carry on anyway.  Normally we insist that the guest accept the
> > ACCESS_PLATFORM flag if offered, because on most platforms they
> > *don't* amount to the same thing.  
> 
> Jason found a nice way to differentiate between needs translation and
> does not need translation. But that patch still requires the ack by the
> driver (and as Michael has pointed out we have to consider migration).
> 
> I'm afraid that  F_IOMMU_PLATFORM means different things in different
> contexts, and that this ain't sufficiently documented. I'm tempted to do
> a proper write-up on this (let's hope my motivation will and my time
> will allow). I would also very much like to have Conny's opinion on this.

More documentation is never a bad idea; but I'm afraid I don't have any
further insights at the moment.

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2020-02-25 18:08 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-20 16:06 [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Halil Pasic
2020-02-20 16:06 ` [PATCH 1/2] mm: move force_dma_unencrypted() to mem_encrypt.h Halil Pasic
2020-02-20 16:11   ` Christoph Hellwig
2020-02-20 16:23     ` Christian Borntraeger
2020-02-20 16:31       ` Christoph Hellwig
2020-02-20 17:00         ` Christian Borntraeger
2020-02-21  3:27         ` David Gibson
2020-02-21 13:06           ` Halil Pasic
2020-02-21 15:48             ` Michael S. Tsirkin
2020-02-21 18:07               ` Halil Pasic
2020-02-24  3:33                 ` David Gibson
2020-02-24 18:49                   ` Halil Pasic
2020-02-25 18:08                     ` Cornelia Huck [this message]
2020-02-28  0:23                       ` David Gibson
2020-02-20 16:06 ` [PATCH 2/2] virtio: let virtio use DMA API when guest RAM is protected Halil Pasic
2020-02-20 16:13   ` Christoph Hellwig
2020-02-21  2:59     ` David Gibson
2020-02-21  3:41       ` Jason Wang
2020-02-21 13:31         ` Halil Pasic
2020-02-21 13:27       ` Halil Pasic
2020-02-21 16:36       ` Christoph Hellwig
2020-02-24  6:50         ` David Gibson
2020-02-24 18:59         ` Halil Pasic
2020-02-21 14:33     ` Halil Pasic
2020-02-21 16:39       ` Christoph Hellwig
2020-02-21 18:16         ` Halil Pasic
2020-02-22 19:07       ` Michael S. Tsirkin
2020-02-24 17:16         ` Christoph Hellwig
     [not found]           ` <691d8c8e-665c-b05f-383f-78377fcf6741@amazon.com>
2020-10-28 18:01             ` Michael S. Tsirkin
2020-02-20 20:55   ` Michael S. Tsirkin
2020-02-21  1:17     ` Ram Pai
2020-02-21  3:29       ` David Gibson
2020-02-21 13:12     ` Halil Pasic
2020-02-21 15:39       ` Tom Lendacky
2020-02-24  6:40         ` David Gibson
2020-02-21 15:56       ` Michael S. Tsirkin
2020-02-21 16:35         ` Christoph Hellwig
2020-02-21 18:03         ` Halil Pasic
2020-02-20 20:48 ` [PATCH 0/2] virtio: decouple protected guest RAM form VIRTIO_F_IOMMU_PLATFORM Michael S. Tsirkin
2020-02-20 21:29 ` Michael S. Tsirkin
2020-02-21 13:37   ` Halil Pasic
2020-02-20 21:33 ` Michael S. Tsirkin
2020-02-21 13:49   ` Halil Pasic
2020-02-21 16:41   ` Christoph Hellwig
2020-02-24  5:44     ` David Gibson
2020-02-21  6:22 ` Jason Wang
2020-02-21 14:56   ` Halil Pasic
2020-02-24  3:38     ` David Gibson
2020-02-24  4:01     ` Jason Wang
2020-02-24  6:06       ` Michael S. Tsirkin
2020-02-24  6:45         ` Jason Wang
2020-02-24  7:48           ` Michael S. Tsirkin
2020-02-24  9:26             ` Jason Wang
2020-02-24 13:40               ` Michael S. Tsirkin
2020-02-25  3:38                 ` Jason Wang
2020-02-24 13:56               ` Halil Pasic
2020-02-25  3:30                 ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200225190802.753cffef.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=Thomas.Lendacky@amd.com \
    --cc=bauerman@linux.ibm.com \
    --cc=borntraeger@de.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=frankja@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxram@us.ibm.com \
    --cc=m.szyprowski@samsung.com \
    --cc=mihajlov@linux.ibm.com \
    --cc=mimu@linux.ibm.com \
    --cc=mst@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=robin.murphy@arm.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).