* Re: [PATCH v4 0/6] virtio core DMA API conversion
@ 2015-11-10 2:04 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 2:04 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Joerg Roedel, KVM, linux-s390, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Andy Lutomirski, sparclinux, Paolo Bonzini, Linux Virtualization,
David Woodhouse, David S. Miller, Martin Schwidefsky
On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> The problem here is that in some of the problematic cases the virtio
> driver may not even be loaded. If someone runs an L1 guest with an
> IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> *boom* L1 crashes. (Same if, say, DPDK gets used, I think.)
>
> >
> > The only way out of this while keeping the "platform" stuff would be to
> > also bump some kind of version in the virtio config (or PCI header). I
> > have no other way to differentiate between "this is an old qemu that
> > doesn't do the 'bypass property' yet" and "this is a virtio device
> > that doesn't bypass".
> >
> > Any better idea ?
>
> I'd suggest that, in the absence of the new DT binding, we assume that
> any PCI device with the virtio vendor ID is passthrough on powerpc. I
> can do this in the virtio driver, but if it's in the platform code
> then vfio gets it right too (i.e. fails to load).
The problem is there isn't *a* virtio vendor ID. It's the RedHat vendor
ID which will be used by more than just virtio, so we need to
specifically list the devices.
Additionally, that still means that once we have a virtio device that
actually uses the iommu, powerpc will not work since the "workaround"
above will kick in.
The "in absence of the new DT binding" doesn't make that much sense.
Those platforms use device-trees defined since the dawn of ages by
actual open firmware implementations, they either have no iommu
representation in there (Macs, the platform code hooks it all up) or
have various properties related to the iommu but no concept of "bypass"
in there.
We can *add* a new property under some circumstances that indicates a
bypass on a per-device basis, however that doesn't completely solve it:
- As I said above, what does the absence of that property mean ? An
old qemu that does bypass on all virtio or a new qemu trying to tell
you that the virtio device actually does use the iommu (or some other
environment that isn't qemu) ?
- On things like macs, the device-tree is generated by openbios, it
would have to have some added logic to try to figure that out, which
means it needs to know *via different means* that some or all virtio
devices bypass the iommu.
I thus go back to my original statement, it's a LOT easier to handle if
the device itself is self describing, indicating whether it is set to
bypass a host iommu or not. For L1->L2, well, that wouldn't be the
first time qemu/VFIO plays tricks with the passed through device
configuration space...
Note that the above can be solved via some kind of compromise: The
device self describes the ability to honor the iommu, along with the
property (or ACPI table entry) that indicates whether or not it does.
IE. We could use the revision or ProgIf field of the config space for
example. Or something in virtio config. If it's an "old" device, we
know it always bypasses. If it's a new device, we know it only bypasses
if the corresponding property is present. I still would have to sort out the
openbios case for mac among others but it's at least a workable
direction.
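A minimal sketch of the decision logic in that compromise, assuming
hypothetical names (`virtio_dev_desc`, `virtio_dma_mode`) that are not
taken from any patch in this series. How "new" is detected (revision,
ProgIf or virtio config) is deliberately left out:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dma_mode { DMA_BYPASS_IOMMU, DMA_USE_IOMMU };

/* Hypothetical summary of what the guest knows about one virtio device. */
struct virtio_dev_desc {
    bool self_describing;  /* "new" device: advertises it can honor the IOMMU */
    bool has_bypass_prop;  /* platform property (DT or ACPI entry) is present */
};

/* Decide whether DMA for this device goes around or through the IOMMU. */
static enum dma_mode virtio_dma_mode(const struct virtio_dev_desc *d)
{
    assert(d != NULL);

    /* An "old" device cannot say anything, so it is assumed to bypass. */
    if (!d->self_describing)
        return DMA_BYPASS_IOMMU;

    /* A "new" device bypasses only if the platform property says so. */
    return d->has_bypass_prop ? DMA_BYPASS_IOMMU : DMA_USE_IOMMU;
}
```

The point of the two inputs is that an absent property is only
meaningful once the device has identified itself as "new".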
BTW. Don't you have a similar problem on x86 that today qemu claims
that everything honors the iommu in ACPI ?
Unless somebody can come up with a better idea...
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:04 ` Benjamin Herrenschmidt
(?)
(?)
@ 2015-11-10 2:18 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-10 2:18 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Joerg Roedel, KVM, linux-s390, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Andy Lutomirski, sparclinux, Paolo Bonzini, Linux Virtualization,
David Woodhouse, David S. Miller, Martin Schwidefsky
On Mon, Nov 9, 2015 at 6:04 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
>> The problem here is that in some of the problematic cases the virtio
>> driver may not even be loaded. If someone runs an L1 guest with an
>> IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
>> *boom* L1 crashes. (Same if, say, DPDK gets used, I think.)
>>
>> >
>> > The only way out of this while keeping the "platform" stuff would be to
>> > also bump some kind of version in the virtio config (or PCI header). I
>> > have no other way to differentiate between "this is an old qemu that
>> > doesn't do the 'bypass property' yet" and "this is a virtio device
>> > that doesn't bypass".
>> >
>> > Any better idea ?
>>
>> I'd suggest that, in the absence of the new DT binding, we assume that
>> any PCI device with the virtio vendor ID is passthrough on powerpc. I
>> can do this in the virtio driver, but if it's in the platform code
>> then vfio gets it right too (i.e. fails to load).
>
> The problem is there isn't *a* virtio vendor ID. It's the RedHat vendor
> ID which will be used by more than just virtio, so we need to
> specifically list the devices.
Really?
/* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
static const struct pci_device_id virtio_pci_id_table[] = {
	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
	{ 0 }
};
Can we match on that range?
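A rough sketch of what matching on that range could look like.
`struct pci_ids` and `is_virtio_pci_id` are hypothetical names used for
illustration only, not existing kernel APIs; the constants come from
the table and comment quoted above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Vendor ID used in virtio_pci_id_table. */
#define VIRTIO_PCI_VENDOR_ID   0x1af4
/* Device ID range named in the comment above the table. */
#define VIRTIO_PCI_DEVICE_MIN  0x1000
#define VIRTIO_PCI_DEVICE_MAX  0x10ff

/* Hypothetical stand-in for the vendor/device fields of a PCI device. */
struct pci_ids {
    uint16_t vendor;
    uint16_t device;
};

/* True if the IDs fall inside the range donated for virtio devices. */
static bool is_virtio_pci_id(const struct pci_ids *ids)
{
    assert(ids != NULL);

    return ids->vendor == VIRTIO_PCI_VENDOR_ID &&
           ids->device >= VIRTIO_PCI_DEVICE_MIN &&
           ids->device <= VIRTIO_PCI_DEVICE_MAX;
}
```

Other devices sharing the same vendor ID but sitting outside the range
would not be treated as virtio by such a check.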
>
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
I don't know how to solve that problem, though, especially since the
vendor of such a device (especially if it's real hardware) might not
set any new bit.
>
> The "in absence of the new DT binding" doesn't make that much sense.
>
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of "bypass"
> in there.
>
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve it:
>
> - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
>
> - On things like macs, the device-tree is generated by openbios, it
> would have to have some added logic to try to figure that out, which
> means it needs to know *via different means* that some or all virtio
> devices bypass the iommu.
>
> I thus go back to my original statement, it's a LOT easier to handle if
> the device itself is self describing, indicating whether it is set to
> bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
Which leaves the special case of Xen, where even preexisting devices
don't bypass the IOMMU. Can we keep this specific to powerpc and
sparc? On x86, this problem is basically nonexistent, since the IOMMU
is properly self-describing.
IOW, I think that on x86 we should assume that all virtio devices
honor the IOMMU.
>
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
>
> IE. We could use the revision or ProgIf field of the config space for
> example. Or something in virtio config. If it's an "old" device, we
> know it always bypasses. If it's a new device, we know it only bypasses
> if the corresponding property is present. I still would have to sort out the
> openbios case for mac among others but it's at least a workable
> direction.
>
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
Only on a single experimental configuration, and that can apparently
just be fixed going forward without any real problems being caused.
--Andy
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:18 ` Andy Lutomirski
(?)
@ 2015-11-10 5:26 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 5:26 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Andy Lutomirski, David Woodhouse, linux-kernel, David S. Miller,
sparclinux, Joerg Roedel, Christian Borntraeger, Cornelia Huck,
Sebastian Ott, Paolo Bonzini, Christoph Hellwig, KVM,
Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote:
>
> Which leaves the special case of Xen, where even preexisting devices
> don't bypass the IOMMU. Can we keep this specific to powerpc and
> sparc? On x86, this problem is basically nonexistent, since the IOMMU
> is properly self-describing.
>
> IOW, I think that on x86 we should assume that all virtio devices
> honor the IOMMU.
You don't like performance? :-)
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 5:26 ` Benjamin Herrenschmidt
(?)
(?)
@ 2015-11-10 5:33 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-10 5:33 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Joerg Roedel, KVM, linux-s390, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Andy Lutomirski, sparclinux, Paolo Bonzini, Linux Virtualization,
David Woodhouse, David S. Miller, Martin Schwidefsky
On Mon, Nov 9, 2015 at 9:26 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote:
>>
>> Which leaves the special case of Xen, where even preexisting devices
>> don't bypass the IOMMU. Can we keep this specific to powerpc and
>> sparc? On x86, this problem is basically nonexistent, since the IOMMU
>> is properly self-describing.
>>
>> IOW, I think that on x86 we should assume that all virtio devices
>> honor the IOMMU.
>
> You don't like performance? :-)
This should have basically no effect. Every non-experimental x86
virtio setup in existence either doesn't work at all (Xen) or has DMA
ops that are no-ops.
--Andy
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:18 ` Andy Lutomirski
(?)
@ 2015-11-10 5:28 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 5:28 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Andy Lutomirski, David Woodhouse, linux-kernel, David S. Miller,
sparclinux, Joerg Roedel, Christian Borntraeger, Cornelia Huck,
Sebastian Ott, Paolo Bonzini, Christoph Hellwig, KVM,
Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote:
>
> /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF. */
> static const struct pci_device_id virtio_pci_id_table[] = {
> { PCI_DEVICE(0x1af4, PCI_ANY_ID) },
> { 0 }
> };
>
> Can we match on that range?
We can, but the problem remains: how do we differentiate an existing
device that does bypass from a newer one that needs the IOMMU and thus
doesn't have the new "bypass" property in the device-tree?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 5:28 ` Benjamin Herrenschmidt
(?)
(?)
@ 2015-11-10 5:35 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-10 5:35 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Joerg Roedel, KVM, linux-s390, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Andy Lutomirski, sparclinux, Paolo Bonzini, Linux Virtualization,
David Woodhouse, David S. Miller, Martin Schwidefsky
On Mon, Nov 9, 2015 at 9:28 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Mon, 2015-11-09 at 18:18 -0800, Andy Lutomirski wrote:
>>
>> /* Qumranet donated their vendor ID for devices 0x1000 thru 0x10FF.
>> */
>> static const struct pci_device_id virtio_pci_id_table[] = {
>> { PCI_DEVICE(0x1af4, PCI_ANY_ID) },
>> { 0 }
>> };
>>
>> Can we match on that range?
>
> We can, but the problem remains: how do we differentiate an existing
> device that does bypass vs. a newer one that needs the IOMMU and thus
> doesn't have the new "bypass" property in the device-tree?
>
We could do it the other way around: on powerpc, if a PCI device is in
that range and doesn't have the "bypass" property at all, then it's
assumed to bypass the IOMMU. This means that everything that
currently works continues working. If someone builds a physical
virtio device or uses another system in PCIe target mode speaking
virtio, then it won't work until they upgrade their firmware to set
bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
bypass=0 and no ambiguity.
vfio will presumably notice the bypass and correctly refuse to map any
current virtio devices.
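As a rough sketch of that rule (a userspace model only, not kernel code:
the enum, the helper names and the explicit bypass=1 case are assumptions
made up for illustration; only the 0x1af4 vendor ID and the 0x1000-0x10FF
device range come from the table quoted above):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the proposed quirk; none of these names are kernel API. */

#define VIRTIO_PCI_VENDOR   0x1af4  /* vendor ID from the quoted id table */
#define VIRTIO_PCI_DEV_MIN  0x1000  /* device range from the quoted comment */
#define VIRTIO_PCI_DEV_MAX  0x10ff

/* State of the proposed "bypass" device-tree property for one device. */
enum bypass_prop {
	BYPASS_PROP_ABSENT,	/* old firmware or old QEMU: property missing */
	BYPASS_PROP_0,		/* bypass=0: device goes through the IOMMU */
	BYPASS_PROP_1,		/* bypass=1 (assumed value): device bypasses it */
};

/* True if the vendor/device pair falls in the virtio PCI range. */
static bool in_virtio_range(uint16_t vendor, uint16_t device)
{
	return vendor == VIRTIO_PCI_VENDOR &&
	       device >= VIRTIO_PCI_DEV_MIN && device <= VIRTIO_PCI_DEV_MAX;
}

/* True if the platform should treat the device as bypassing the IOMMU. */
static bool assume_iommu_bypass(uint16_t vendor, uint16_t device,
				enum bypass_prop prop)
{
	/* An explicit property always wins. */
	if (prop == BYPASS_PROP_1)
		return true;
	if (prop == BYPASS_PROP_0)
		return false;
	/* Property missing: only devices in the virtio range default to bypass. */
	return in_virtio_range(vendor, device);
}
```

With this, a device in the range with no property is treated as
bypassing, bypass=0 turns that off, and a device outside the range is
not assumed to bypass unless the property says so.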
Would that work?
--Andy
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 5:35 ` Andy Lutomirski
(?)
@ 2015-11-10 10:37 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 10:37 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Andy Lutomirski, David Woodhouse, linux-kernel, David S. Miller,
sparclinux, Joerg Roedel, Christian Borntraeger, Cornelia Huck,
Sebastian Ott, Paolo Bonzini, Christoph Hellwig, KVM,
Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
>
> We could do it the other way around: on powerpc, if a PCI device is in
> that range and doesn't have the "bypass" property at all, then it's
> assumed to bypass the IOMMU. This means that everything that
> currently works continues working. If someone builds a physical
> virtio device or uses another system in PCIe target mode speaking
> virtio, then it won't work until they upgrade their firmware to set
> bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> bypass=0 and no ambiguity.
>
> vfio will presumably notice the bypass and correctly refuse to map any
> current virtio devices.
>
> Would that work?
That would be extremely strange from a platform perspective. Any device
in that vendor/device range would bypass the iommu unless some new
property "actually-works-like-a-real-pci-device" happens to exist in
the device-tree, which we would then need to define somewhere and
handle across at least 3 different platforms that get their device-tree
from wildly different places.
Also if tomorrow I create a PCI device that implements virtio-net and
put it in a machine running IBM proprietary firmware (or Apple's or
Sun's), it won't have that property...
This is not hypothetical. People are using virtio to do point-to-point
communication between machines via PCIe today.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 10:37 ` Benjamin Herrenschmidt
(?)
@ 2015-11-10 12:43 ` Michael S. Tsirkin
-1 siblings, 0 replies; 83+ messages in thread
From: Michael S. Tsirkin @ 2015-11-10 12:43 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Andy Lutomirski, Andy Lutomirski, David Woodhouse, linux-kernel,
David S. Miller, sparclinux, Joerg Roedel, Christian Borntraeger,
Cornelia Huck, Sebastian Ott, Paolo Bonzini, Christoph Hellwig,
KVM, Martin Schwidefsky, linux-s390, Linux Virtualization
On Tue, Nov 10, 2015 at 09:37:54PM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
> >
> > We could do it the other way around: on powerpc, if a PCI device is in
> > that range and doesn't have the "bypass" property at all, then it's
> > assumed to bypass the IOMMU. This means that everything that
> > currently works continues working. If someone builds a physical
> > virtio device or uses another system in PCIe target mode speaking
> > virtio, then it won't work until they upgrade their firmware to set
> > bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> > bypass=0 and no ambiguity.
> >
> > vfio will presumably notice the bypass and correctly refuse to map any
> > current virtio devices.
> >
> > Would that work?
>
> That would be extremely strange from a platform perspective. Any device
> in that vendor/device range would bypass the iommu unless some new
> property "actually-works-like-a-real-pci-device" happens to exist in
> the device-tree, which we would then need to define somewhere and
> handle across at least 3 different platforms that get their device-tree
> from wildly different places.
Then we are back to the virtio driver telling the DMA core
whether it wants a 1:1 mapping in the iommu?
If that's acceptable to others, I don't think that's too bad.
> Also if tomorrow I create a PCI device that implements virtio-net and
> put it in a machine running IBM proprietary firmware (or Apple's or
> Sun's), it won't have that property...
>
> This is not hypothetical. People are using virtio to do point-to-point
> communication between machines via PCIe today.
>
> Cheers,
> Ben.
But not virtio-pci I think - that's broken for that use case since we use
weaker barriers than required for real IO, as these have measurable
overhead. We could have a feature "is a real PCI device",
that's completely reasonable.
--
MST
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 12:43 ` Michael S. Tsirkin
(?)
@ 2015-11-10 19:37 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 19:37 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Andy Lutomirski, Andy Lutomirski, David Woodhouse, linux-kernel,
David S. Miller, sparclinux, Joerg Roedel, Christian Borntraeger,
Cornelia Huck, Sebastian Ott, Paolo Bonzini, Christoph Hellwig,
KVM, Martin Schwidefsky, linux-s390, Linux Virtualization
On Tue, 2015-11-10 at 14:43 +0200, Michael S. Tsirkin wrote:
> But not virtio-pci I think - that's broken for that use case since we use
> weaker barriers than required for real IO, as these have measurable
> overhead. We could have a feature "is a real PCI device",
> that's completely reasonable.
Do we use weaker barriers on the Linux driver side? I didn't think so
...
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 10:37 ` Benjamin Herrenschmidt
(?)
@ 2015-11-10 18:54 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-10 18:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Christian Borntraeger, Paolo Bonzini, David Woodhouse,
Martin Schwidefsky, Michael S. Tsirkin, Sebastian Ott,
David S. Miller, linux-s390, Cornelia Huck, Joerg Roedel, KVM,
Christoph Hellwig, Linux Virtualization, linux-kernel,
sparclinux
On Nov 10, 2015 2:38 AM, "Benjamin Herrenschmidt"
<benh@kernel.crashing.org> wrote:
>
> On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
> >
> > We could do it the other way around: on powerpc, if a PCI device is in
> > that range and doesn't have the "bypass" property at all, then it's
> > assumed to bypass the IOMMU. This means that everything that
> > currently works continues working. If someone builds a physical
> > virtio device or uses another system in PCIe target mode speaking
> > virtio, then it won't work until they upgrade their firmware to set
> > bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> > bypass=0 and no ambiguity.
> >
> > vfio will presumably notice the bypass and correctly refuse to map any
> > current virtio devices.
> >
> > Would that work?
>
> That would be extremely strange from a platform perspective. Any device
> in that vendor/device range would bypass the iommu unless some new
> property "actually-works-like-a-real-pci-device" happens to exist in
> the device-tree, which we would then need to define somewhere and
> handle across at least 3 different platforms that get their device-tree
> from wildly different places.
>
> Also if tomorrow I create a PCI device that implements virtio-net and
> put it in a machine running IBM proprietary firmware (or Apple's or
> Sun's), it won't have that property...
>
> This is not hypothetical. People are using virtio to do point-to-point
> communication between machines via PCIe today.
Does that work on powerpc on existing kernels?
Anyway, here's another crazy idea: make the quirk assume that the
IOMMU is bypassed if and only if the weak barriers bit is set on
systems that are missing the new DT binding.
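Something like the following, as a sketch only (the struct and its field
names are hypothetical stand-ins for whatever the platform code and the
virtio core would really expose; this is a userspace model, not kernel
code):

```c
#include <stdbool.h>

/* Hypothetical inputs to the quirk; the field names are illustrative only. */
struct quirk_input {
	bool has_new_dt_binding;	/* firmware provides the new DT binding */
	bool dt_says_bypass;		/* only meaningful if the binding exists */
	bool weak_barriers;		/* device is using the weak barriers mode */
};

/* True if the quirk should assume the device bypasses the IOMMU. */
static bool quirk_iommu_bypassed(const struct quirk_input *in)
{
	/* With the new binding present, trust what the firmware says. */
	if (in->has_new_dt_binding)
		return in->dt_says_bypass;
	/* Binding missing: bypass if and only if weak barriers are in use. */
	return in->weak_barriers;
}
```

So on old firmware a weak-barriers device keeps bypassing as it does
today, and anything using real barriers is treated like a normal PCI
device.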
--Andy
>
> Cheers,
> Ben.
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 18:54 ` Andy Lutomirski
(?)
@ 2015-11-10 22:27 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 22:27 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Christian Borntraeger, Paolo Bonzini, David Woodhouse,
Martin Schwidefsky, Michael S. Tsirkin, Sebastian Ott,
David S. Miller, linux-s390, Cornelia Huck, Joerg Roedel, KVM,
Christoph Hellwig, Linux Virtualization, linux-kernel,
sparclinux
On Tue, 2015-11-10 at 10:54 -0800, Andy Lutomirski wrote:
>
> Does that work on powerpc on existing kernels?
>
> Anyway, here's another crazy idea: make the quirk assume that the
> IOMMU is bypassed if and only if the weak barriers bit is set on
> systems that are missing the new DT binding.
"New DT bindings" doesn't mean much ... how do we change DT bindings on
existing machines with a FW in flash ?
What about partition <-> partition virtio, such as what we could do on
PAPR systems? That would have the weak barrier bit.
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 22:27 ` Benjamin Herrenschmidt
(?)
(?)
@ 2015-11-10 23:44 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-10 23:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-s390, sparclinux, KVM, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Joerg Roedel, Martin Schwidefsky, Paolo Bonzini,
Linux Virtualization, David Woodhouse, David S. Miller
On Tue, Nov 10, 2015 at 2:27 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2015-11-10 at 10:54 -0800, Andy Lutomirski wrote:
>>
>> Does that work on powerpc on existing kernels?
>>
>> Anyway, here's another crazy idea: make the quirk assume that the
>> IOMMU is bypassed if and only if the weak barriers bit is set on
>> systems that are missing the new DT binding.
>
> "New DT bindings" doesn't mean much ... how do we change DT bindings on
> existing machines with a FW in flash ?
>
> What about partition <-> partition virtio such as what we could do on
> PAPR systems. That would have the weak barrier bit.
>
Is it partition <-> partition, bypassing IOMMU?
I think I'd settle for just something that doesn't regress
non-experimental setups that actually work today and that allows new
setups (x86 with fixed QEMU and maybe something more complicated on
powerpc and/or sparc) to work in all cases.
We could certainly just make powerpc and sparc continue bypassing the
IOMMU until someone comes up with a way to fix it. I'll send out some
patches that do that, and maybe that'll help this make progress.
--Andy
^ permalink raw reply [flat|nested] 83+ messages in thread
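The fallback described in the message above (powerpc and sparc keep bypassing the IOMMU, everything else goes through the DMA API) can be sketched as a per-architecture switch. This is an illustrative Python model under assumed names (`BYPASS_ARCHES`, `virtio_uses_dma_api`), not the patches that were actually sent.

```python
# Architectures that, in this sketch, keep the historical behaviour of
# handing guest-physical addresses straight to the device.
BYPASS_ARCHES = frozenset({"powerpc", "sparc"})


def virtio_uses_dma_api(arch: str) -> bool:
    """Return True if virtio should map buffers through the DMA API on `arch`."""
    return arch not in BYPASS_ARCHES
```

The switch is deliberately coarse: it keys on the architecture alone and has no per-device input, so it cannot tell a bypassing device from a non-bypassing one on the listed architectures.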
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 23:44 ` Andy Lutomirski
(?)
@ 2015-11-11 0:44 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-11 0:44 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Christian Borntraeger, Paolo Bonzini, David Woodhouse,
Martin Schwidefsky, Michael S. Tsirkin, Sebastian Ott,
David S. Miller, linux-s390, Cornelia Huck, Joerg Roedel, KVM,
Christoph Hellwig, Linux Virtualization, linux-kernel,
sparclinux
On Tue, 2015-11-10 at 15:44 -0800, Andy Lutomirski wrote:
>
> > What about partition <-> partition virtio such as what we could do on
> > PAPR systems. That would have the weak barrier bit.
> >
>
> Is it partition <-> partition, bypassing IOMMU?
No.
> I think I'd settle for just something that doesn't regress
> non-experimental setups that actually work today and that allows new
> setups (x86 with fixed QEMU and maybe something more complicated on
> powerpc and/or sparc) to work in all cases.
>
> We could certainly just make powerpc and sparc continue bypassing the
> IOMMU until someone comes up with a way to fix it. I'll send out some
> patches that do that, and maybe that'll help this make progress.
But we haven't found a solution that works. All we have come up with is
a quirk that will force bypass on virtio always and will not allow us
to operate non-bypassing devices on either of those architectures in
the future.
I'm not too happy about this.
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
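The objection in the message above can be made concrete with a small model: an architecture-wide forced-bypass quirk gives the right answer for today's emulated devices but the wrong one for any device that does go through the IOMMU. The device names and the `Device` type below are hypothetical; the only facts taken from the thread are that the quirk would force bypass on powerpc and sparc, and that partition <-> partition virtio on PAPR would not bypass the IOMMU.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    name: str
    arch: str
    # Ground truth about the device; the guest has no way to read this.
    really_bypasses_iommu: bool


FORCED_BYPASS_ARCHES = frozenset({"powerpc", "sparc"})


def quirk_assumes_bypass(dev: Device) -> bool:
    """Architecture-wide quirk: every virtio device on these arches bypasses."""
    return dev.arch in FORCED_BYPASS_ARCHES


def misclassified(devices: List[Device]) -> List[str]:
    """Names of devices for which the quirk's guess differs from reality."""
    return [d.name for d in devices
            if quirk_assumes_bypass(d) != d.really_bypasses_iommu]


EXAMPLE_DEVICES = [
    Device("qemu-virtio-net", "powerpc", True),         # emulated, bypasses today
    Device("papr-partition-virtio", "powerpc", False),  # goes through the IOMMU
]
```

With these two example devices, `misclassified(EXAMPLE_DEVICES)` reports only the PAPR one, which is the future case the quirk rules out.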
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-11 0:44 ` Benjamin Herrenschmidt
(?)
@ 2015-11-11 4:46 ` Andy Lutomirski
-1 siblings, 0 replies; 83+ messages in thread
From: Andy Lutomirski @ 2015-11-11 4:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Christian Borntraeger, David Woodhouse, Paolo Bonzini,
Martin Schwidefsky, Michael S. Tsirkin, David S. Miller,
Sebastian Ott, linux-s390, Joerg Roedel, Cornelia Huck, KVM,
Christoph Hellwig, linux-kernel, Linux Virtualization,
sparclinux
On Nov 10, 2015 4:44 PM, "Benjamin Herrenschmidt"
<benh@kernel.crashing.org> wrote:
>
> On Tue, 2015-11-10 at 15:44 -0800, Andy Lutomirski wrote:
> >
> > > What about partition <-> partition virtio such as what we could do on
> > > PAPR systems. That would have the weak barrier bit.
> > >
> >
> > Is it partition <-> partition, bypassing IOMMU?
>
> No.
>
> > I think I'd settle for just something that doesn't regress
> > non-experimental setups that actually work today and that allows new
> > setups (x86 with fixed QEMU and maybe something more complicated on
> > powerpc and/or sparc) to work in all cases.
> >
> > We could certainly just make powerpc and sparc continue bypassing the
> > IOMMU until someone comes up with a way to fix it. I'll send out some
> > patches that do that, and maybe that'll help this make progress.
>
> But we haven't found a solution that works. All we have come up with is
> a quirk that will force bypass on virtio always and will not allow us
> to operate non-bypassing devices on either of those architectures in
> the future.
>
> I'm not too happy about this.
Me neither. At least it wouldn't be a regression, but it's still crappy.
I think that arm is fine, at least. I was unable to find an arm QEMU
config that has any problems with my patches.
--Andy
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-11 4:46 ` Andy Lutomirski
(?)
@ 2015-11-11 5:08 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-11 5:08 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Christian Borntraeger, David Woodhouse, Paolo Bonzini,
Martin Schwidefsky, Michael S. Tsirkin, David S. Miller,
Sebastian Ott, linux-s390, Joerg Roedel, Cornelia Huck, KVM,
Christoph Hellwig, linux-kernel, Linux Virtualization,
sparclinux
On Tue, 2015-11-10 at 20:46 -0800, Andy Lutomirski wrote:
> Me neither. At least it wouldn't be a regression, but it's still
> crappy.
>
> I think that arm is fine, at least. I was unable to find an arm QEMU
> config that has any problems with my patches.
Ok, give me a few days for my headache & fever to subside, and I'll see if I
can find something better. David, any ideas from your side? :-)
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:18 ` Andy Lutomirski
(?)
(?)
@ 2015-11-10 7:28 ` Jan Kiszka
-1 siblings, 0 replies; 83+ messages in thread
From: Jan Kiszka @ 2015-11-10 7:28 UTC (permalink / raw)
To: Andy Lutomirski, Benjamin Herrenschmidt
Cc: Andy Lutomirski, David Woodhouse, linux-kernel, David S. Miller,
sparclinux, Joerg Roedel, Christian Borntraeger, Cornelia Huck,
Sebastian Ott, Paolo Bonzini, Christoph Hellwig, KVM,
Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On 2015-11-10 03:18, Andy Lutomirski wrote:
> On Mon, Nov 9, 2015 at 6:04 PM, Benjamin Herrenschmidt
>> I thus go back to my original statement, it's a LOT easier to handle if
>> the device itself is self describing, indicating whether it is set to
>> bypass a host iommu or not. For L1->L2, well, that wouldn't be the
>> first time qemu/VFIO plays tricks with the passed through device
>> configuration space...
>
> Which leaves the special case of Xen, where even preexisting devices
> don't bypass the IOMMU. Can we keep this specific to powerpc and
> sparc? On x86, this problem is basically nonexistent, since the IOMMU
> is properly self-describing.
>
> IOW, I think that on x86 we should assume that all virtio devices
> honor the IOMMU.
From the guest driver POV, that is OK because either there is no IOMMU
to program (the current situation with qemu), there can be one that
doesn't need it (the current situation with qemu and iommu=on) or there
is (Xen) or will be (future qemu) one that requires it.
>
>>
>> Note that the above can be solved via some kind of compromise: The
>> device self describes the ability to honor the iommu, along with the
>> property (or ACPI table entry) that indicates whether or not it does.
>>
>> IE. We could use the revision or ProgIf field of the config space for
>> example. Or something in virtio config. If it's an "old" device, we
>> know it always bypasses. If it's a new device, we know it only bypasses
>> if the corresponding property is in. I still would have to sort out the
>> openbios case for mac among others but it's at least a workable
>> direction.
>>
>> BTW. Don't you have a similar problem on x86 that today qemu claims
>> that everything honors the iommu in ACPI ?
>
> Only on a single experimental configuration, and that can apparently
> just be fixed going forward without any real problems being caused.
BTW, I once tried to describe the current situation on QEMU x86 with
IOMMU enabled via ACPI. While you can easily add IOMMU device exceptions
to the static tables, the fun starts when considering device hotplug for
virtio. Unless I missed some trick, ACPI doesn't seem to have been
designed for that level of flexibility.
You would have to reserve a complete PCI bus, declare that one as not
being IOMMU-governed, and then only add new virtio devices to that bus.
Possible, but a lot of restrictions that existing management software
would have to be aware of as well.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 83+ messages in thread
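The reserved-bus workaround described in the message above implies a placement rule that management software would have to enforce at hotplug time. A minimal sketch, assuming a hypothetical bus number and hypothetical names (`RESERVED_BYPASS_BUS`, `HotplugRequest`, `placement_ok`); the second half of the rule (nothing else may go on the reserved bus) is an inference from the bus being declared not IOMMU-governed, not something stated in the thread.

```python
from dataclasses import dataclass

# Hypothetical PCI bus number set aside as "not IOMMU-governed".
RESERVED_BYPASS_BUS = 0x10


@dataclass(frozen=True)
class HotplugRequest:
    bus: int
    # Treated here as "this device bypasses the IOMMU", matching the
    # current QEMU behaviour for virtio described in the message.
    is_virtio: bool


def placement_ok(req: HotplugRequest) -> bool:
    """A device belongs on the reserved bus if and only if it is virtio."""
    on_reserved_bus = req.bus == RESERVED_BYPASS_BUS
    return on_reserved_bus == req.is_virtio
```

Every tool that plugs devices into the guest would need an equivalent check, which is the kind of restriction on existing management software that the message warns about.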
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:04 ` Benjamin Herrenschmidt
` (3 preceding siblings ...)
(?)
@ 2015-11-10 9:45 ` Knut Omang
-1 siblings, 0 replies; 83+ messages in thread
From: Knut Omang @ 2015-11-10 9:45 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Andy Lutomirski
Cc: Joerg Roedel, KVM, linux-s390, Michael S. Tsirkin, Sebastian Ott,
linux-kernel, Christoph Hellwig, Christian Borntraeger,
Andy Lutomirski, sparclinux, Paolo Bonzini, Linux Virtualization,
David Woodhouse, David S. Miller, Martin Schwidefsky
On Tue, 2015-11-10 at 13:04 +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 16:46 -0800, Andy Lutomirski wrote:
> > The problem here is that in some of the problematic cases the
> > virtio
> > driver may not even be loaded. If someone runs an L1 guest with an
> > IOMMU-bypassing virtio device and assigns it to L2 using vfio, then
> > *boom* L1 crashes. (Same if, say, DPDK gets used, I think.)
> >
> > >
> > > The only way out of this while keeping the "platform" stuff would
> > > be to
> > > also bump some kind of version in the virtio config (or PCI
> > > header). I
> > > have no other way to differenciate between "this is an old qemu
> > > that
> > > doesn't do the 'bypass property' yet" from "this is a virtio
> > > device
> > > that doesn't bypass".
> > >
> > > Any better idea ?
> >
> > I'd suggest that, in the absence of the new DT binding, we assume
> > that
> > any PCI device with the virtio vendor ID is passthrough on powerpc.
> > I
> > can do this in the virtio driver, but if it's in the platform code
> > then vfio gets it right too (i.e. fails to load).
>
> The problem is there isn't *a* virtio vendor ID. It's the RedHat
> vendor
> ID which will be used by more than just virtio, so we need to
> specifically list the devices.
>
> Additionally, that still means that once we have a virtio device that
> actually uses the iommu, powerpc will not work since the "workaround"
> above will kick in.
>
> The "in absence of the new DT binding" doesn't make that much sense.
>
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of
> "bypass"
> in there.
>
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve
> it:
>
> - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
>
> - On things like macs, the device-tree is generated by openbios, it
> would have to have some added logic to try to figure that out, which
> means it needs to know *via different means* that some or all virtio
> devices bypass the iommu.
>
> I thus go back to my original statement, it's a LOT easier to handle
> if
> the device itself is self describing, indicating whether it is set to
> bypass a host iommu or not. For L1->L2, well, that wouldn't be the
> first time qemu/VFIO plays tricks with the passed through device
> configuration space...
>
> Note that the above can be solved via some kind of compromise: The
> device self describes the ability to honor the iommu, along with the
> property (or ACPI table entry) that indicates whether or not it does.
>
> IE. We could use the revision or ProgIf field of the config space for
> example. Or something in virtio config. If it's an "old" device, we
> know it always bypass. If it's a new device, we know it only bypasses
> if the corresponding property is in. I still would have to sort out
> the
> openbios case for mac among others but it's at least a workable
> direction.
>
> BTW. Don't you have a similar problem on x86 that today qemu claims
> that everything honors the iommu in ACPI ?
>
> Unless somebody can come up with a better idea...
Can something be done by means of PCIe capabilities?
ATS (Address Translation Services) seems like a natural choice?
Knut
> Cheers,
> Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 9:45 ` Knut Omang
(?)
@ 2015-11-10 10:26 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 10:26 UTC (permalink / raw)
To: Knut Omang, Andy Lutomirski
Cc: Andy Lutomirski, David Woodhouse, linux-kernel, David S. Miller,
sparclinux, Joerg Roedel, Christian Borntraeger, Cornelia Huck,
Sebastian Ott, Paolo Bonzini, Christoph Hellwig, KVM,
Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Tue, 2015-11-10 at 10:45 +0100, Knut Omang wrote:
> Can something be done by means of PCIe capabilities?
> ATS (Address Translation Services) seems like a natural choice?
Euh no... ATS is something else completely....
Cheers,
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 2:04 ` Benjamin Herrenschmidt
(?)
@ 2015-11-10 10:27 ` Joerg Roedel
-1 siblings, 0 replies; 83+ messages in thread
From: Joerg Roedel @ 2015-11-10 10:27 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Andy Lutomirski, Andy Lutomirski, David Woodhouse, linux-kernel,
David S. Miller, sparclinux, Christian Borntraeger,
Cornelia Huck, Sebastian Ott, Paolo Bonzini, Christoph Hellwig,
KVM, Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Tue, Nov 10, 2015 at 01:04:36PM +1100, Benjamin Herrenschmidt wrote:
> The "in absence of the new DT binding" doesn't make that much sense.
>
> Those platforms use device-trees defined since the dawn of ages by
> actual open firmware implementations, they either have no iommu
> representation in there (Macs, the platform code hooks it all up) or
> have various properties related to the iommu but no concept of "bypass"
> in there.
>
> We can *add* a new property under some circumstances that indicates a
> bypass on a per-device basis, however that doesn't completely solve it:
>
> - As I said above, what does the absence of that property mean ? An
> old qemu that does bypass on all virtio or a new qemu trying to tell
> you that the virtio device actually does use the iommu (or some other
> environment that isn't qemu) ?
You have the same problem when real PCIe devices appear that speak
virtio. I think the only real (still not very nice) solution is to add a
quirk to powerpc platform code that sets noop dma-ops for the existing
virtio vendor/device-ids and add a DT property to opt-out of that quirk.
New vendor/device-ids (as for real devices) would just not be covered by
the quirk and existing emulated devices continue to work.
The absence of the property just means that the quirk is in place and
the system assumes no translation for virtio devices.
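A minimal sketch of the quirk logic described above, written as standalone C rather than actual powerpc platform code. The vendor ID 0x1af4 and the device ID range 0x1000-0x107f are the IDs virtio-pci devices use; the function names and the dt_opt_out flag (standing in for the proposed DT property, which has no defined name yet) are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI vendor ID (Red Hat/Qumranet) under which virtio-pci devices appear. */
#define VIRTIO_PCI_VENDOR     0x1af4
/* Device ID range reserved for virtio-pci devices. */
#define VIRTIO_PCI_DEVICE_MIN 0x1000
#define VIRTIO_PCI_DEVICE_MAX 0x107f

/* True if the vendor/device pair is one of the existing virtio-pci IDs. */
static bool is_existing_virtio_id(uint16_t vendor, uint16_t device)
{
	return vendor == VIRTIO_PCI_VENDOR &&
	       device >= VIRTIO_PCI_DEVICE_MIN &&
	       device <= VIRTIO_PCI_DEVICE_MAX;
}

/*
 * Hypothetical helper: returns true when the platform should install
 * no-op (untranslated) DMA ops for the device.
 * dt_opt_out models the proposed DT property that disables the quirk.
 */
static bool quirk_wants_noop_dma(uint16_t vendor, uint16_t device,
				 bool dt_opt_out)
{
	if (!is_existing_virtio_id(vendor, device))
		return false;	/* new IDs are never covered by the quirk */
	return !dt_opt_out;	/* property absent: assume no translation */
}
```

With this shape, an old qemu that provides no property keeps working, while a device that honors the IOMMU has to either carry the opt-out property or use an ID outside the quirked range.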
Joerg
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH v4 0/6] virtio core DMA API conversion
2015-11-10 10:27 ` Joerg Roedel
(?)
@ 2015-11-10 19:36 ` Benjamin Herrenschmidt
-1 siblings, 0 replies; 83+ messages in thread
From: Benjamin Herrenschmidt @ 2015-11-10 19:36 UTC (permalink / raw)
To: Joerg Roedel
Cc: Andy Lutomirski, Andy Lutomirski, David Woodhouse, linux-kernel,
David S. Miller, sparclinux, Christian Borntraeger,
Cornelia Huck, Sebastian Ott, Paolo Bonzini, Christoph Hellwig,
KVM, Martin Schwidefsky, linux-s390, Linux Virtualization,
Michael S. Tsirkin
On Tue, 2015-11-10 at 11:27 +0100, Joerg Roedel wrote:
>
> You have the same problem when real PCIe devices appear that speak
> virtio. I think the only real (still not very nice) solution is to add a
> quirk to powerpc platform code that sets noop dma-ops for the existing
> virtio vendor/device-ids and add a DT property to opt-out of that quirk.
>
> New vendor/device-ids (as for real devices) would just not be covered by
> the quirk and existing emulated devices continue to work.
Why would real devices use new vendor/device IDs? Also there are other
cases such as using virtio between 2 partitions, which we could do
under PowerVM ... that would require proper iommu usage with existing
IDs.
> The absence of the property just means that the quirk is in place and
> the system assumes no translation for virtio devices.
The only way forward that works for me (and possibly sparc & others,
what about ARM?) is if we *change* something in virtio qemu at the
same time as we add some kind of property. For example the ProgIf field
or revision ID field.
That way I can key on that change.
It's still tricky because I would have to somehow tell my various firmwares
(SLOF, OpenBIOS, OPAL, ...) so they can create the appropriate property. It's
still hacky, but it would be workable.
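A rough sketch of what keying on such a change could look like, assuming the revision ID is the field that gets bumped. VIRTIO_REV_IOMMU_AWARE and its value, the function name, and the has_bypass_property flag are all hypothetical; nothing here is an agreed interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical: first PCI revision ID at which qemu is assumed to
 * describe IOMMU bypass explicitly via a firmware property.
 */
#define VIRTIO_REV_IOMMU_AWARE 2

/*
 * Returns true if the device should be treated as bypassing the IOMMU.
 * has_bypass_property models the (not yet defined) DT/ACPI property.
 */
static bool virtio_dev_bypasses_iommu(uint8_t revision,
				      bool has_bypass_property)
{
	if (revision < VIRTIO_REV_IOMMU_AWARE)
		return true;		/* "old" device: always bypasses */
	return has_bypass_property;	/* "new" device: only if told so */
}
```

The point of the revision check is that the absence of the property is no longer ambiguous: on an old device it means nothing, on a new one it means the device honors the IOMMU.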
Ben.
^ permalink raw reply [flat|nested] 83+ messages in thread