All of lore.kernel.org
 help / color / mirror / Atom feed
From: Knut Omang <knut.omang@oracle.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>,
	David Woodhouse <dwmw2@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	sparclinux@vger.kernel.org, Joerg Roedel <jroedel@suse.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christoph Hellwig <hch@lst.de>, KVM <kvm@vger.kernel.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v4 0/6] virtio core DMA API conversion
Date: Tue, 10 Nov 2015 10:45:14 +0100	[thread overview]
Message-ID: <1447148714.3005.133.camel@oracle.com> (raw)
In-Reply-To: <1447121076.31884.61.camel@kernel.crashing.org>

On Tue, 2015-11-10 at 13:04 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> > The problem here is that in some of the problematic cases the
> > virtio
> > driver may not even be loaded.  If someone runs an L1 guest with an
> > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> > *boom* L1 crashes.  (Same if, say, DPDK gets used, I think.)
> > 
> > > 
> > > The only way out of this while keeping the "platform" stuff would
> > > be to
> > > also bump some kind of version in the virtio config (or PCI
> > > header). I
> > > have no other way to differenciate between "this is an old qemu
> > > that
> > > doesn't do the 'bypass property' yet" from "this is a virtio
> > > device
> > > that doesn't bypass".
> > > 
> > > Any better idea ?
> > 
> > I'd suggest that, in the absence of the new DT binding, we assume
> > that
> > any PCI device with the virtio vendor ID is passthrough on powerpc.
> >   I
> > can do this in the virtio driver, but if it's in the platform code
> > then vfio gets it right too (i.e. fails to load).
> 
> The problem is there isn't *a* virtio vendor ID. It's the RedHat
> vendor
> ID which will be used by more than just virtio, so we need to
> specifically list the devices.
> 
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
> 
> The "in absence of the new DT binding" doesn't make that much sense.
> 
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of
> "bypass"
> in there.
> 
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve
> it:
> 
>   - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
> 
>   - On things like macs, the device-tree is generated by openbios, it
> would have to have some added logic to try to figure that out, which
> means it needs to know *via different means* that some or all virtio
> devices bypass the iommu.
> 
> I thus go back to my original statement, it's a LOT easier to handle
> if
> the device itself is self describing, indicating whether it is set to
> bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
> 
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
> 
> IE. We could use the revision or ProgIf field of the config space for
> example. Or something in virtio config. If it's an "old" device, we
> know it always bypass. If it's a new device, we know it only bypasses
> if the corresponding property is in. I still would have to sort out
> the
> openbios case for mac among others but it's at least a workable
> direction.
> 
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
> 
> Unless somebody can come up with a better idea...

Can something be done by means of PCIe capabilities?
ATS (Address Translation Support) seems like a natural choice?

Knut

> Cheers,
> Ben.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe sparclinux"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: Knut Omang <knut.omang@oracle.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>,
	David Woodhouse <dwmw2@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	sparclinux@vger.kernel.org, Joerg Roedel <jroedel@suse.de>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Cornelia Huck <cornelia.huck@de.ibm.com>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Christoph Hellwig <hch@lst.de>, KVM <kvm@vger.kernel.org>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux Virtualization <virtualization@lists.linux-foundation.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v4 0/6] virtio core DMA API conversion
Date: Tue, 10 Nov 2015 09:45:14 +0000	[thread overview]
Message-ID: <1447148714.3005.133.camel@oracle.com> (raw)
In-Reply-To: <1447121076.31884.61.camel@kernel.crashing.org>

On Tue, 2015-11-10 at 13:04 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> > The problem here is that in some of the problematic cases the
> > virtio
> > driver may not even be loaded.  If someone runs an L1 guest with an
> > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> > *boom* L1 crashes.  (Same if, say, DPDK gets used, I think.)
> > 
> > > 
> > > The only way out of this while keeping the "platform" stuff would
> > > be to
> > > also bump some kind of version in the virtio config (or PCI
> > > header). I
> > > have no other way to differenciate between "this is an old qemu
> > > that
> > > doesn't do the 'bypass property' yet" from "this is a virtio
> > > device
> > > that doesn't bypass".
> > > 
> > > Any better idea ?
> > 
> > I'd suggest that, in the absence of the new DT binding, we assume
> > that
> > any PCI device with the virtio vendor ID is passthrough on powerpc.
> >   I
> > can do this in the virtio driver, but if it's in the platform code
> > then vfio gets it right too (i.e. fails to load).
> 
> The problem is there isn't *a* virtio vendor ID. It's the RedHat
> vendor
> ID which will be used by more than just virtio, so we need to
> specifically list the devices.
> 
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
> 
> The "in absence of the new DT binding" doesn't make that much sense.
> 
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of
> "bypass"
> in there.
> 
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve
> it:
> 
>   - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
> 
>   - On things like macs, the device-tree is generated by openbios, it
> would have to have some added logic to try to figure that out, which
> means it needs to know *via different means* that some or all virtio
> devices bypass the iommu.
> 
> I thus go back to my original statement, it's a LOT easier to handle
> if
> the device itself is self describing, indicating whether it is set to
> bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
> 
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
> 
> IE. We could use the revision or ProgIf field of the config space for
> example. Or something in virtio config. If it's an "old" device, we
> know it always bypass. If it's a new device, we know it only bypasses
> if the corresponding property is in. I still would have to sort out
> the
> openbios case for mac among others but it's at least a workable
> direction.
> 
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
> 
> Unless somebody can come up with a better idea...

Can something be done by means of PCIe capabilities?
ATS (Address Translation Support) seems like a natural choice?

Knut

> Cheers,
> Ben.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe sparclinux"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-11-10  9:47 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-30  1:09 [PATCH v4 0/6] virtio core DMA API conversion Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09 ` [PATCH v4 1/6] virtio-net: Stop doing DMA from the stack Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30 13:55   ` Christian Borntraeger
2015-10-30 13:55     ` Christian Borntraeger
2015-10-31  5:02     ` Andy Lutomirski
2015-10-31  5:02       ` Andy Lutomirski
2015-10-31  5:02       ` Andy Lutomirski
2015-10-30 13:55   ` Christian Borntraeger
2015-10-30  1:09 ` [PATCH v4 2/6] virtio_ring: Support DMA APIs Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30 12:01   ` Cornelia Huck
2015-10-30 12:01     ` Cornelia Huck
2015-10-30 12:01     ` Cornelia Huck
2015-10-30 12:05     ` Christian Borntraeger
2015-10-30 12:05       ` Christian Borntraeger
2015-10-30 12:05       ` Christian Borntraeger
2015-10-30 18:51       ` Andy Lutomirski
2015-10-30 18:51       ` Andy Lutomirski
2015-10-30 18:51         ` Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09 ` [PATCH v4 3/6] virtio_pci: Use the DMA API Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30  1:09 ` [PATCH v4 4/6] virtio: Add improved queue allocation API Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09 ` [PATCH v4 5/6] virtio_mmio: Use the DMA API Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:09 ` [PATCH v4 6/6] virtio_pci: " Andy Lutomirski
2015-10-30  1:09   ` Andy Lutomirski
2015-10-30  1:09 ` Andy Lutomirski
2015-10-30  1:17 ` [PATCH v4 0/6] virtio core DMA API conversion Andy Lutomirski
2015-10-30  1:17   ` Andy Lutomirski
2015-10-30  1:17   ` Andy Lutomirski
2015-10-30  9:57 ` Christian Borntraeger
2015-10-30  9:57   ` Christian Borntraeger
2015-10-30  9:57 ` Christian Borntraeger
2015-11-09 12:15 ` Michael S. Tsirkin
2015-11-09 12:15   ` Michael S. Tsirkin
2015-11-09 12:15   ` Michael S. Tsirkin
2015-11-09 12:27   ` Paolo Bonzini
2015-11-09 12:27     ` Paolo Bonzini
2015-11-09 12:27   ` Paolo Bonzini
2015-11-09 22:58   ` Benjamin Herrenschmidt
2015-11-09 22:58     ` Benjamin Herrenschmidt
2015-11-09 22:58     ` Benjamin Herrenschmidt
2015-11-10  0:46     ` Andy Lutomirski
2015-11-10  0:46       ` Andy Lutomirski
2015-11-10  0:46       ` Andy Lutomirski
2015-11-10  2:04       ` Benjamin Herrenschmidt
2015-11-10  2:04         ` Benjamin Herrenschmidt
2015-11-10  2:04         ` Benjamin Herrenschmidt
2015-11-10  2:18         ` Andy Lutomirski
2015-11-10  2:18         ` Andy Lutomirski
2015-11-10  2:18           ` Andy Lutomirski
2015-11-10  5:26           ` Benjamin Herrenschmidt
2015-11-10  5:26             ` Benjamin Herrenschmidt
2015-11-10  5:26             ` Benjamin Herrenschmidt
2015-11-10  5:33             ` Andy Lutomirski
2015-11-10  5:33             ` Andy Lutomirski
2015-11-10  5:33               ` Andy Lutomirski
2015-11-10  5:28           ` Benjamin Herrenschmidt
2015-11-10  5:28             ` Benjamin Herrenschmidt
2015-11-10  5:28             ` Benjamin Herrenschmidt
2015-11-10  5:35             ` Andy Lutomirski
2015-11-10  5:35             ` Andy Lutomirski
2015-11-10  5:35               ` Andy Lutomirski
2015-11-10 10:37               ` Benjamin Herrenschmidt
2015-11-10 10:37                 ` Benjamin Herrenschmidt
2015-11-10 10:37                 ` Benjamin Herrenschmidt
2015-11-10 12:43                 ` Michael S. Tsirkin
2015-11-10 12:43                   ` Michael S. Tsirkin
2015-11-10 12:43                   ` Michael S. Tsirkin
2015-11-10 19:37                   ` Benjamin Herrenschmidt
2015-11-10 19:37                     ` Benjamin Herrenschmidt
2015-11-10 19:37                     ` Benjamin Herrenschmidt
2015-11-10 12:43                 ` Michael S. Tsirkin
2015-11-10 18:54                 ` Andy Lutomirski
2015-11-10 18:54                   ` Andy Lutomirski
2015-11-10 18:54                   ` Andy Lutomirski
2015-11-10 22:27                   ` Benjamin Herrenschmidt
2015-11-10 22:27                     ` Benjamin Herrenschmidt
2015-11-10 22:27                     ` Benjamin Herrenschmidt
2015-11-10 23:44                     ` Andy Lutomirski
2015-11-10 23:44                     ` Andy Lutomirski
2015-11-10 23:44                       ` Andy Lutomirski
2015-11-11  0:44                       ` Benjamin Herrenschmidt
2015-11-11  0:44                         ` Benjamin Herrenschmidt
2015-11-11  0:44                         ` Benjamin Herrenschmidt
2015-11-11  4:46                         ` Andy Lutomirski
2015-11-11  4:46                           ` Andy Lutomirski
2015-11-11  4:46                           ` Andy Lutomirski
2015-11-11  5:08                           ` Benjamin Herrenschmidt
2015-11-11  5:08                             ` Benjamin Herrenschmidt
2015-11-11  5:08                             ` Benjamin Herrenschmidt
2015-11-10  7:28           ` Jan Kiszka
2015-11-10  7:28             ` Jan Kiszka
2015-11-10  7:28             ` Jan Kiszka
2015-11-10  7:28             ` Jan Kiszka
2015-11-10  9:45         ` Knut Omang
2015-11-10  9:45         ` Knut Omang [this message]
2015-11-10  9:45           ` Knut Omang
2015-11-10 10:26           ` Benjamin Herrenschmidt
2015-11-10 10:26             ` Benjamin Herrenschmidt
2015-11-10 10:26             ` Benjamin Herrenschmidt
2015-11-10 10:27         ` Joerg Roedel
2015-11-10 10:27           ` Joerg Roedel
2015-11-10 10:27           ` Joerg Roedel
2015-11-10 19:36           ` Benjamin Herrenschmidt
2015-11-10 19:36             ` Benjamin Herrenschmidt
2015-11-10 19:36             ` Benjamin Herrenschmidt
  -- strict thread matches above, loose matches on Subject: below --
2015-10-30  1:09 Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1447148714.3005.133.camel@oracle.com \
    --to=knut.omang@oracle.com \
    --cc=benh@kernel.crashing.org \
    --cc=borntraeger@de.ibm.com \
    --cc=cornelia.huck@de.ibm.com \
    --cc=davem@davemloft.net \
    --cc=dwmw2@infradead.org \
    --cc=hch@lst.de \
    --cc=jroedel@suse.de \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=sebott@linux.vnet.ibm.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.