* VIRTIO - compatibility with different virtualization solutions
@ 2014-02-17 13:23 Daniel Kiper
2014-02-19 0:26 ` Rusty Russell
` (3 more replies)
0 siblings, 4 replies; 28+ messages in thread
From: Daniel Kiper @ 2014-02-17 13:23 UTC (permalink / raw)
To: xen-devel, virtio
Cc: wei.liu2, ian.campbell, rusty, stefano.stabellini, ian, anthony,
sasha.levin
Hi,
Below you can find a summary of work regarding VIRTIO compatibility with
different virtualization solutions. It was done mainly from the Xen point of view,
but the results are quite generic and can be applied to a wide spectrum
of virtualization platforms.
VIRTIO devices were designed as a set of generic devices for virtual environments.
They work without major issues on many currently existing virtualization solutions.
However, there is one VIRTIO specification and implementation issue which could hinder
VIRTIO device/driver implementation on new or even existing platforms (e.g. Xen).
The problem is that the specification uses guest physical addresses as pointers
to virtqueues, buffers and other structures. This means that the VIRTIO device controller
(hypervisor/host or a special device domain/process) knows the guest physical memory
layout and simply maps required regions as needed. However, this crude mapping
mechanism usually assumes that a guest does not impose any access restrictions on
its memory. That situation is not desirable, because guests often want to restrict
access to most of their memory and grant access only to the specific
memory regions needed for device operation. Fortunately, many hypervisors have
more or less advanced memory sharing mechanisms with relevant access control built in.
However, those mechanisms do not use guest physical addresses as shared memory
region addresses/references, but a unique identifier which could be called a "handle" here
(or anything else which clearly describes the idea). This means that the specification should
use the term "handle" instead of guest physical addresses (in a particular case a handle can
simply be a guest physical address). This way any virtualization environment could choose the best
way to access guest memory without compromising security if needed.
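To make the idea concrete, here is a minimal user-space sketch of the handle indirection described above. All names and the table layout are invented for illustration (a real mechanism such as Xen grant tables is considerably more involved): the guest shares a region and receives an opaque handle; the device controller can only resolve handles that were explicitly granted, and a revoked handle stops resolving.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the "handle" indirection: on an EPT-style
 * platform a handle could simply encode the guest physical address;
 * on Xen it could be a grant reference.  Either way, the device
 * controller only reaches memory the guest explicitly granted. */

#define MAX_GRANTS 16

struct grant {
    uint64_t gpa;    /* guest physical address of the shared region */
    uint64_t len;
    int      valid;  /* revoked entries no longer resolve */
};

static struct grant grant_table[MAX_GRANTS];

/* Guest side: share a region, receive an opaque handle. */
static int32_t grant_region(uint64_t gpa, uint64_t len)
{
    for (int i = 0; i < MAX_GRANTS; i++) {
        if (!grant_table[i].valid) {
            grant_table[i] = (struct grant){ gpa, len, 1 };
            return i;            /* here the handle is just a table index */
        }
    }
    return -1;
}

/* Guest side: withdraw access. */
static void revoke_region(int32_t handle)
{
    grant_table[handle].valid = 0;
}

/* Device-controller side: resolve a handle plus offset; access outside
 * granted regions is refused instead of being silently mapped. */
static int resolve_handle(int32_t handle, uint64_t off, uint64_t *gpa_out)
{
    if (handle < 0 || handle >= MAX_GRANTS || !grant_table[handle].valid)
        return -1;
    if (off >= grant_table[handle].len)
        return -1;
    *gpa_out = grant_table[handle].gpa + off;
    return 0;
}
```

The point of the sketch is only that generic code never dereferences a guest physical address directly; how a handle is minted and checked is entirely platform policy.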
The above-mentioned changes in the specification require some changes in VIRTIO
device and driver implementations.
From an implementation perspective of Linux VIRTIO drivers, the transition from the
old model to the new one should not be very difficult. The Linux Kernel itself provides a DMA API
which should ease work on the drivers; hence they should use this API instead of bypassing it.
Additionally, new IOMMU drivers should be created. Those IOMMU drivers should expose
handles to VIRTIO and hide hypervisor specific details. This way VIRTIO would not
depend so strongly on specific hypervisor behavior. Another part of VIRTIO are the devices,
which usually do not have access to the DMA API available in the Linux Kernel, and this may
present some challenges in the transition to the new implementation. Similar
problems may also appear when implementing drivers in systems which do not have
the Linux Kernel DMA API. However, even in that situation it should not be a very big issue
or prevent the transition to handles.
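To illustrate the kind of indirection meant here, a toy user-space sketch (the `platform_dma_ops` and `vq_handle_t` names are invented; in Linux this role would be played by the DMA API backed by a per-platform IOMMU/dma_ops implementation): generic virtio code asks the platform to map a buffer and places the returned opaque value on the ring, never interpreting it itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-platform mapping operation.  Generic virtio code
 * calls ops->map() instead of computing a guest physical address. */

typedef uint64_t vq_handle_t;

struct platform_dma_ops {
    vq_handle_t (*map)(uint64_t gpa);  /* returns what goes on the ring */
};

/* Default platform: the handle *is* the guest physical address. */
static vq_handle_t identity_map(uint64_t gpa)
{
    return gpa;
}

/* Xen-like platform: the handle is an opaque reference (e.g. a grant),
 * with no required relation to the guest physical address. */
static uint64_t next_grant = 100;
static vq_handle_t grant_map(uint64_t gpa)
{
    (void)gpa;           /* a real implementation would record the gpa */
    return next_grant++; /* opaque to the driver */
}

static const struct platform_dma_ops identity_ops = { identity_map };
static const struct platform_dma_ops grant_ops    = { grant_map };

/* Generic virtio code: never looks inside the handle. */
static vq_handle_t virtqueue_add_buf(const struct platform_dma_ops *ops,
                                     uint64_t gpa)
{
    return ops->map(gpa);
}
```

The driver-visible type stays fixed while each platform decides what the value means, which is exactly the property the handle proposal is after.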
The author does not know FreeBSD and Windows well enough to say how
to retool VIRTIO drivers there to use some mechanism for obtaining hypervisor handles,
but surely there must be some easy API to plumb this through.
As can be seen from the above description, the current VIRTIO specification could create
implementation challenges in some virtual environments. However, this issue could
be solved quite easily by migrating from guest physical addresses, which are used
as pointers to virtqueues, buffers and other structures, to handles. This change
should not be difficult to implement. Additionally, it makes VIRTIO less
tightly linked with a specific virtual environment. This in turn helps fulfill
the "Standard" assumption ("Virtio makes no assumptions about the environment in which
it operates, beyond supporting the bus attaching the device. Virtio devices are
implemented over PCI and other buses, and earlier drafts have been implemented on other
buses not included in this spec") made in the VIRTIO spec introduction.
Acknowledgments for comments and suggestions: Wei Liu, Ian Pratt, Konrad Rzeszutek Wilk
Daniel
* Re: VIRTIO - compatibility with different virtualization solutions
From: Rusty Russell @ 2014-02-19 0:26 UTC (permalink / raw)
To: Daniel Kiper, xen-devel, virtio-dev
Cc: wei.liu2, ian.campbell, stefano.stabellini, ian, anthony, sasha.levin

Daniel Kiper <daniel.kiper@oracle.com> writes:
> Hi,
>
> Below you could find a summary of work in regards to VIRTIO compatibility with
> different virtualization solutions. It was done mainly from Xen point of view
> but results are quite generic and can be applied to wide spectrum
> of virtualization platforms.

Hi Daniel,

Sorry for the delayed response, I was pondering... CC changed to virtio-dev.

From a standard POV: It's possible to abstract out where we use 'physical
address' for 'address handle'. It's also possible to define this per-platform
(ie. Xen-PV vs everyone else). This is sane, since Xen-PV is a distinct
platform from x86.

For platforms using EPT, I don't think you want anything but guest addresses,
do you?

From an implementation POV:

On IOMMU, start here for previous Linux discussion:
http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650

And this is the real problem. We don't want to use the PCI IOMMU for PCI
devices. So it's not just a matter of using existing Linux APIs.

Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Anthony Liguori @ 2014-02-19 4:42 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper, Stefano Stabellini, ian,
    sasha.levin, xen-devel

On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
>> Hi,
>>
>> Below you could find a summary of work in regards to VIRTIO compatibility with
>> different virtualization solutions. It was done mainly from Xen point of view
>> but results are quite generic and can be applied to wide spectrum
>> of virtualization platforms.
>
> Hi Daniel,
>
> Sorry for the delayed response, I was pondering... CC changed
> to virtio-dev.
>
> From a standard POV: It's possible to abstract out the where we use
> 'physical address' for 'address handle'. It's also possible to define
> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
> Xen-PV is a distinct platform from x86.

I'll go even further and say that "address handle" doesn't make sense either.

Just using grant table references is not enough to make virtio work well under
Xen. You really need to use bounce buffers ala persistent grants.

I think what you ultimately want is virtio using a DMA API (I know benh has
scoffed at this but I don't buy his argument at face value) and a DMA layer
that bounces requests to a pool of persistent grants.

> For platforms using EPT, I don't think you want anything but guest
> addresses, do you?
>
> From an implementation POV:
>
> On IOMMU, start here for previous Linux discussion:
> http://thread.gmane.org/gmane.linux.kernel.virtualization/14410/focus=14650
>
> And this is the real problem. We don't want to use the PCI IOMMU for
> PCI devices. So it's not just a matter of using existing Linux APIs.

Is there any data to back up that claim? Just because power currently does
hypercalls for anything that uses the PCI IOMMU layer doesn't mean this cannot
be changed. It's pretty hacky that virtio-pci just happens to work well by
accident on power today. Not all architectures have this limitation.

Regards,

Anthony Liguori

> Cheers,
> Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Rusty Russell @ 2014-02-20 1:31 UTC (permalink / raw)
To: Anthony Liguori
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper, Stefano Stabellini, ian,
    sasha.levin, xen-devel

Anthony Liguori <anthony@codemonkey.ws> writes:
> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>> From a standard POV: It's possible to abstract out the where we use
>> 'physical address' for 'address handle'. It's also possible to define
>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
>> Xen-PV is a distinct platform from x86.
>
> I'll go even further and say that "address handle" doesn't make sense too.

I was trying to come up with a unique term, I wasn't trying to define
semantics :)

There are three debates here now: (1) what should the standard say, (2) how
would Linux implement it, and (3) should we use each platform's PCI IOMMU.

> Just using grant table references is not enough to make virtio work
> well under Xen. You really need to use bounce buffers ala persistent
> grants.

Wait, if you're using bounce buffers, you didn't make it "work well"!

> I think what you ultimately want is virtio using a DMA API (I know
> benh has scoffed at this but I don't buy his argument at face value)
> and a DMA layer that bounces requests to a pool of persistent grants.

We can have a virtio DMA API, sure. It'd be a noop for non-Xen.

But emulating the programming of an IOMMU seems masochistic. PowerPC have
made it clear they don't want this. And noone else has come up with a
compelling reason to want this: virtio passthrough?

>> And this is the real problem. We don't want to use the PCI IOMMU for
>> PCI devices. So it's not just a matter of using existing Linux APIs.
>
> Is there any data to back up that claim?

Yes, for powerpc. Implementer gets to measure, as always. I suspect that if
you emulate an IOMMU on Intel, your performance will suck too.

> Just because power currently does hypercalls for anything that uses
> the PCI IOMMU layer doesn't mean this cannot be changed.

Does someone have an implementation of an IOMMU which doesn't use hypercalls,
or is this theoretical?

> It's pretty hacky that virtio-pci just happens to work well by accident
> on power today. Not all architectures have this limitation.

It's a fundamental assumption of virtio that the host can access all of guest
memory. That's paravirt, not a hack.

But tomayto tomatoh aside, it's unclear to me how you'd build an efficient
IOMMU today. And it's unclear what benefit you'd gain. But the cost for Power
is clear.

So if someone wants to do this for PCI, they need to implement it and
benchmark. But this is a little orthogonal to the Xen discussion.

Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Stefano Stabellini @ 2014-02-20 12:28 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper, Stefano Stabellini, ian,
    Anthony Liguori, sasha.levin, xen-devel

On Thu, 20 Feb 2014, Rusty Russell wrote:
> It's a fundamental assumption of virtio that the host can access all of
> guest memory.

I take it that by "host" you mean the virtio backends in this context.

Do you think that this fundamental assumption should be sustained going
forward? I am asking because Xen assumes that the backends are only allowed
to access the memory that the guest decides to share with them.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Daniel Kiper @ 2014-02-20 20:28 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini, ian,
    Anthony Liguori, sasha.levin, xen-devel

Hey,

On Thu, Feb 20, 2014 at 12:01:19PM +1030, Rusty Russell wrote:
> >> Sorry for the delayed response, I was pondering... CC changed
> >> to virtio-dev.

Do not worry. It is not a problem. It is not an easy issue.

> There are three debates here now: (1) what should the standard say, and

Yep.

> (2) how would Linux implement it,

It seems to me that we should think about other common OSes too.

> (3) should we use each platform's PCI IOMMU.

I do not want to emulate any hardware. It seems to me that we should think
about something which fits best in the VIRTIO environment. The DMA API with
relevant backends looks promising, but I also have some worries about
performance. Additionally, it is Linux Kernel specific stuff, so maybe we
should invent something more generic which will fit well in other guest OSes
too.

[...]

> It's a fundamental assumption of virtio that the host can access all of
> guest memory. That's paravert, not a hack.

Why? What if guests would like to limit access to their memory? I think that
it will happen sooner or later. Additionally, I think that your assumption is
not hypervisor agnostic, which limits implementations of the VIRTIO spec. At
least for Xen your idea will create difficulties and probably prevent a
VIRTIO implementation.

Daniel
* Re: VIRTIO - compatibility with different virtualization solutions
From: Anthony Liguori @ 2014-02-21 2:50 UTC (permalink / raw)
To: Rusty Russell
Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper, Stefano Stabellini, ian,
    sasha.levin, xen-devel

On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
>> I'll go even further and say that "address handle" doesn't make sense too.
>
> I was trying to come up with a unique term, I wasn't trying to define
> semantics :)

Understood, that wasn't really directed at you.

> There are three debates here now: (1) what should the standard say, and

The standard should say "physical address".

> (2) how would Linux implement it,

Linux should use the PCI DMA API.

> (3) should we use each platform's PCI IOMMU.

Just like any other PCI device :-)

>> Just using grant table references is not enough to make virtio work
>> well under Xen. You really need to use bounce buffers ala persistent
>> grants.
>
> Wait, if you're using bounce buffers, you didn't make it "work well"!

Preaching to the choir man... but bounce buffering is proven to be faster
than doing grant mappings on every request. xen-blk does bounce buffering by
default and I suspect netfront is heading that direction soon.

It would be a lot easier to simply have a global pool of grant tables that
effectively becomes the DMA pool. Then the DMA API can bounce into that pool
and those addresses can be placed on the ring.

It's a little different for Xen because now the backends have to deal with
physical addresses but the concept is still the same.

>> I think what you ultimately want is virtio using a DMA API (I know
>> benh has scoffed at this but I don't buy his argument at face value)
>> and a DMA layer that bounces requests to a pool of persistent grants.
>
> We can have a virtio DMA API, sure. It'd be a noop for non-Xen.
>
> But emulating the programming of an IOMMU seems masochistic. PowerPC
> have made it clear they don't want this.

I don't think the argument is all that clear. Wouldn't it be nice for other
PCI devices to be faster under Power KVM? Why not change the DMA API under
Power Linux to detect that it's under KVM and simply not make any hypercalls?

> And noone else has come up with a compelling reason to want this:
> virtio passthrough?

So I can run Xen under QEMU and use virtio-blk and virtio-net as the device
model. Xen PV uses the DMA API to do mfn -> pfn mapping and since virtio
doesn't use it, it's the only PCI device in the QEMU device model that
doesn't actually work when running Xen under QEMU.

Regards,

Anthony Liguori

>>> And this is the real problem. We don't want to use the PCI IOMMU for
>>> PCI devices. So it's not just a matter of using existing Linux APIs.
>>
>> Is there any data to back up that claim?
>
> Yes, for powerpc. Implementer gets to measure, as always. I suspect
> that if you emulate an IOMMU on Intel, your performance will suck too.
>
>> Just because power currently does hypercalls for anything that uses
>> the PCI IOMMU layer doesn't mean this cannot be changed.
>
> Does someone have an implementation of an IOMMU which doesn't use
> hypercalls, or is this theoretical?
>
> It's a fundamental assumption of virtio that the host can access all of
> guest memory. That's paravert, not a hack.
>
> So if someone wants do to this for PCI, they need to implement it and
> benchmark. But this is a little orthogonal to the Xen discussion.
>
> Cheers,
> Rusty.
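The "bounce into a global pool of persistent grants" idea discussed above can be sketched in a toy user-space model (pool geometry and all names are invented for illustration; this is not xen-blk's actual implementation): the pool is granted to the backend once at setup, and each request is copied into a free slot instead of granting and revoking its own pages.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical persistent-grant DMA pool: the whole pool is shared with
 * the backend once; per-request cost is a memcpy, not a grant hypercall. */

#define POOL_SLOTS 8
#define SLOT_SIZE  4096

static uint8_t pool[POOL_SLOTS][SLOT_SIZE]; /* granted to backend at setup */
static int slot_busy[POOL_SLOTS];

/* "Map" a request buffer: bounce it into a persistent slot and return
 * the slot index, which is what would be placed on the virtio ring. */
static int dma_pool_map(const void *buf, size_t len)
{
    if (len > SLOT_SIZE)
        return -1;
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!slot_busy[i]) {
            slot_busy[i] = 1;
            memcpy(pool[i], buf, len);   /* the bounce copy */
            return i;
        }
    }
    return -1;                           /* pool exhausted */
}

/* "Unmap": copy the (possibly backend-modified) data back, free the slot. */
static void dma_pool_unmap(int slot, void *buf, size_t len)
{
    memcpy(buf, pool[slot], len);
    slot_busy[slot] = 0;
}
```

The trade-off the thread is debating is visible here: the per-request hypercall disappears, but every request pays a copy in and a copy out.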
* Re: VIRTIO - compatibility with different virtualization solutions
From: Wei Liu @ 2014-02-21 10:05 UTC (permalink / raw)
To: Anthony Liguori
Cc: virtio-dev, Wei Liu, Ian Campbell, Rusty Russell, Daniel Kiper,
    Stefano Stabellini, ian, sasha.levin, xen-devel

On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
> On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> > Wait, if you're using bounce buffers, you didn't make it "work well"!
>
> Preaching to the choir man... but bounce buffering is proven to be
> faster than doing grant mappings on every request. xen-blk does
> bounce buffering by default and I suspect netfront is heading that
> direction soon.

FWIW Annie Li @ Oracle once implemented a persistent map prototype for
netfront and the result was not satisfying.

> It would be a lot easier to simply have a global pool of grant tables
> that effectively becomes the DMA pool. Then the DMA API can bounce
> into that pool and those addresses can be placed on the ring.
>
> It's a little different for Xen because now the backends have to deal
> with physical addresses but the concept is still the same.

How would you apply this to Xen's security model? How can the hypervisor
effectively enforce access control? "Handle" and "physical address" are
essentially not the same concept, otherwise you wouldn't have proposed this
change. Not saying I'm against this change, just that this description is too
vague for me to understand the bigger picture.

But a downside for sure is that if we go with this change we then have to
maintain two different paths in the backend. However small the difference is,
it is still a burden.

Wei.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Konrad Rzeszutek Wilk @ 2014-02-21 15:01 UTC (permalink / raw)
To: Wei Liu
Cc: virtio-dev, Ian Campbell, Stefano Stabellini, Rusty Russell, Daniel Kiper,
    ian, Anthony Liguori, sasha.levin, xen-devel

On Fri, Feb 21, 2014 at 10:05:06AM +0000, Wei Liu wrote:
> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
> > The standard should say, "physical address"

This conversation is heading towards "the implementation needs it, hence
let's make the design have it". Which I am OK with - but if we are going that
route we might as well call this thing 'my-pony-number' because I think each
hypervisor will have a different view of it.

Some of them might use a physical address with some flag bits on it. Some
might use just a physical address. And some might want a 32-bit value that
has no correlation to physical nor virtual addresses.

> > Linux should use the PCI DMA API.

Aye.

> > > (3) should we use each platform's PCI IOMMU.
> >
> > Just like any other PCI device :-)

Aye.

> > Preaching to the choir man... but bounce buffering is proven to be
> > faster than doing grant mappings on every request. xen-blk does
> > bounce buffering by default and I suspect netfront is heading that
> > direction soon.
>
> FWIW Annie Li @ Oracle once implemented a persistent map prototype for
> netfront and the result was not satisfying.

Which could be due to the traffic pattern. There is a lot of back/forth
traffic on a single ring in networking (TCP with ACK/SYN). With block the
issue was a bit different and we do more streaming workloads.

> > It would be a lot easier to simply have a global pool of grant tables
> > that effectively becomes the DMA pool. Then the DMA API can bounce
> > into that pool and those addresses can be placed on the ring.
> >
> > It's a little different for Xen because now the backends have to deal
> > with physical addresses but the concept is still the same.

Rusty, the part below is Xen specific - so you are welcome to gloss over it.

I presume you would also need some machinery for the hypervisor to give
access to this 64MB (or whatever size) pool (and we could make grant pages
have 2MB granularity - so just 32 grants) to the backend. But the backend
would have to know the grant entries to at least do the proper mapping and
unmapping (if it chooses to)? And for that it needs the grant value to make
the proper hypercall to map its (backend) memory to the frontend memory.

Or are you saying - instead of using grant entries, just use physical
addresses - and naturally the hypervisor would have to use that as well.
Since it is just a number, why not make it at least something so we won't
need to keep a 'grant->physical address' lookup machinery?

> How would you apply this to Xen's security model? How can hypervisor
> effectively enforce access control? "Handle" and "physical address" are
> essentially not the same concept, otherwise you wouldn't have proposed
> this change. Not saying I'm against this change, just this description
> is too vague for me to understand the bigger picture.
>
> But a downside for sure is that if we go with this change we then have
> to maintain two different paths in backend. However small the difference
> is it is still a burden.

Or just in the grant machinery. The backends just pluck this number into
their data structures and that is all they care about.

> Wei.
* Re: VIRTIO - compatibility with different virtualization solutions
From: Rusty Russell @ 2014-02-25 0:33 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk, Wei Liu
Cc: virtio-dev, Ian Campbell, Stefano Stabellini, Daniel Kiper, ian,
    Anthony Liguori, sasha.levin, xen-devel

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> This conversation is heading towards - implementation needs it - hence lets
> make the design have it. Which I am OK with - but if we are going that
> route we might as well call this thing 'my-pony-number' because I think
> each hypervisor will have a different view of it.
>
> Some of them might use a physical address with some flag bits on it.
> Some might use just physical address.
>
> And some might want an 32-bit value that has no correlation to to physical
> nor virtual addresses.

True, but if the standard doesn't define what it is, it's not a standard
worth anything. Xen is special because it's already requiring guest changes;
it's a platform in itself and so can be different from everything else. But
it still needs to be defined.

At the moment, anything but guest-phys would not be compliant. That's a Good
Thing if we simply don't know the best answer for Xen; we'll adjust the
standard when we do.

Cheers,
Rusty.
* Re: VIRTIO - compatibility with different virtualization solutions [not found] ` <87y51058vf.fsf@rustcorp.com.au> @ 2014-02-25 21:09 ` Konrad Rzeszutek Wilk 0 siblings, 0 replies; 28+ messages in thread From: Konrad Rzeszutek Wilk @ 2014-02-25 21:09 UTC (permalink / raw) To: Rusty Russell Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini, Daniel Kiper, ian, Anthony Liguori, sasha.levin, xen-devel On Tue, Feb 25, 2014 at 11:03:24AM +1030, Rusty Russell wrote: > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes: > > On Fri, Feb 21, 2014 at 10:05:06AM +0000, Wei Liu wrote: > >> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote: > >> > The standard should say, "physical address" > > > > This conversation is heading towards - implementation needs it - hence lets > > make the design have it. Which I am OK with - but if we are going that > > route we might as well call this thing 'my-pony-number' because I think > > each hypervisor will have a different view of it. > > > > Some of them might use a physical address with some flag bits on it. > > Some might use just physical address. > > > > And some might want an 32-bit value that has no correlation to to physical > > nor virtual addresses. > > True, but if the standard doesn't define what it is, it's not a standard > worth anything. Xen is special because it's already requiring guest > changes; it's a platform in itself and so can be different from > everything else. But it still needs to be defined. > > At the moment, anything but guest-phys would not be compliant. That's a > Good Thing if we simply don't know the best answer for Xen; we'll adjust > the standard when we do. I think Daniel's suggestion of a 'handle' should cover it, no? Or are you saying that the 'handle' should actually say what it is for every platform on which VirtIO will run? For Xen it would be whatever the DMA API gives back as 'dma_addr_t'. Which would require the VirtIO drivers to use the DMA (or PCI) APIs. 
>
> Cheers,
> Rusty.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
  2014-02-19  4:42   ` Anthony Liguori
@ 2014-02-19 10:09   ` Ian Campbell
  2014-02-20  7:48     ` Rusty Russell
  [not found]           ` <8761oab4y7.fsf@rustcorp.com.au>
  2014-02-19 10:11   ` Ian Campbell
  2 siblings, 2 replies; 28+ messages in thread
From: Ian Campbell @ 2014-02-19 10:09 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
      anthony, sasha.levin, xen-devel

On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> For platforms using EPT, I don't think you want anything but guest
> addresses, do you?

No, the arguments for preventing unfettered access by backends to
frontend RAM apply to EPT as well.

Ian.

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: VIRTIO - compatibility with different virtualization solutions
  2014-02-19 10:09 ` Ian Campbell
@ 2014-02-20  7:48   ` Rusty Russell
  [not found]         ` <8761oab4y7.fsf@rustcorp.com.au>
  1 sibling, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2014-02-20 7:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
      anthony, sasha.levin, xen-devel

Ian Campbell <Ian.Campbell@citrix.com> writes:
> On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
>> For platforms using EPT, I don't think you want anything but guest
>> addresses, do you?
>
> No, the arguments for preventing unfettered access by backends to
> frontend RAM apply to EPT as well.

I can see how you'd parse my sentence that way, I think, but the two
are orthogonal.

AFAICT your grant-table access restrictions are page granularity, though
you don't use page-aligned data (eg. in xen-netfront). This level of
access control is possible using the virtio ring too, but no one has
implemented such a thing AFAIK.

Hope that clarifies,
Rusty.

PS. Random aside: I greatly enjoyed your blog post on 'Xen on ARM and
    the Device Tree vs. ACPI debate'.

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <8761oab4y7.fsf@rustcorp.com.au>]
* Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <8761oab4y7.fsf@rustcorp.com.au>
@ 2014-02-20 20:37   ` Daniel Kiper
  [not found]         ` <20140220203704.GG3441@olila.local.net-space.pl>
  1 sibling, 0 replies; 28+ messages in thread
From: Daniel Kiper @ 2014-02-20 20:37 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Ian Campbell, stefano.stabellini, ian,
      anthony, sasha.levin, xen-devel

Hey,

On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
> Ian Campbell <Ian.Campbell@citrix.com> writes:
> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> >> For platforms using EPT, I don't think you want anything but guest
> >> addresses, do you?
> >
> > No, the arguments for preventing unfettered access by backends to
> > frontend RAM apply to EPT as well.
>
> I can see how you'd parse my sentence that way, I think, but the two
> are orthogonal.
>
> AFAICT your grant-table access restrictions are page granularity, though
> you don't use page-aligned data (eg. in xen-netfront). This level of
> access control is possible using the virtio ring too, but no one has
> implemented such a thing AFAIK.

Could you say in short how it should be done? DMA API is an option but
if there is a simpler mechanism available in VIRTIO itself we will be
happy to use it in Xen.

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <20140220203704.GG3441@olila.local.net-space.pl>]
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <20140220203704.GG3441@olila.local.net-space.pl>
@ 2014-02-21  0:54   ` Rusty Russell
  [not found]         ` <8761o99tft.fsf@rustcorp.com.au>
  1 sibling, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2014-02-21 0:54 UTC (permalink / raw)
  To: Daniel Kiper
  Cc: virtio-dev, wei.liu2, Ian Campbell, stefano.stabellini, ian,
      anthony, sasha.levin, xen-devel

Daniel Kiper <daniel.kiper@oracle.com> writes:
> Hey,
>
> On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
>> Ian Campbell <Ian.Campbell@citrix.com> writes:
>> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
>> >> For platforms using EPT, I don't think you want anything but guest
>> >> addresses, do you?
>> >
>> > No, the arguments for preventing unfettered access by backends to
>> > frontend RAM apply to EPT as well.
>>
>> I can see how you'd parse my sentence that way, I think, but the two
>> are orthogonal.
>>
>> AFAICT your grant-table access restrictions are page granularity, though
>> you don't use page-aligned data (eg. in xen-netfront). This level of
>> access control is possible using the virtio ring too, but no one has
>> implemented such a thing AFAIK.
>
> Could you say in short how it should be done? DMA API is an option but
> if there is a simpler mechanism available in VIRTIO itself we will be
> happy to use it in Xen.

OK, this challenged me to think harder.

The queue itself is effectively a grant table (as long as you don't give
the backend write access to it). The available ring tells you where the
buffers are and whether they are readable or writable. The used ring
tells you when they're used.

However, performance would suck due to no caching: you'd end up doing a
map and unmap on every packet. I'm assuming Xen currently avoids that
somehow? Seems likely...

On the other hand, if we wanted a more Xen-like setup, it would look
like this:

1) Abstract away the "physical addresses" to "handles" in the standard,
   and allow some platform-specific mapping setup and teardown.

2) In Linux, implement a virtio DMA ops which handles the grant table
   stuff for Xen (returning grant table ids + offset or something?),
   noop for others. This would be a runtime thing.

3) In Linux, change the drivers to use this API.

Now, Xen will not be able to use vhost to accelerate, but it doesn't now
anyway.

Am I missing anything?

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <8761o99tft.fsf@rustcorp.com.au>]
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <8761o99tft.fsf@rustcorp.com.au>
@ 2014-02-21  3:00   ` Anthony Liguori
  2014-02-25  0:40     ` Rusty Russell
  [not found]           ` <87vbw458jr.fsf@rustcorp.com.au>
  2014-02-21 10:21   ` Wei Liu
  2014-02-21 15:11   ` Konrad Rzeszutek Wilk
  2 siblings, 2 replies; 28+ messages in thread
From: Anthony Liguori @ 2014-02-21 3:00 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
      Stefano Stabellini, ian, sasha.levin, xen-devel

On Thu, Feb 20, 2014 at 4:54 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
>> Hey,
>>
>> On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
>>> Ian Campbell <Ian.Campbell@citrix.com> writes:
>>> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
>>> >> For platforms using EPT, I don't think you want anything but guest
>>> >> addresses, do you?
>>> >
>>> > No, the arguments for preventing unfettered access by backends to
>>> > frontend RAM apply to EPT as well.
>>>
>>> I can see how you'd parse my sentence that way, I think, but the two
>>> are orthogonal.
>>>
>>> AFAICT your grant-table access restrictions are page granularity, though
>>> you don't use page-aligned data (eg. in xen-netfront). This level of
>>> access control is possible using the virtio ring too, but no one has
>>> implemented such a thing AFAIK.
>>
>> Could you say in short how it should be done? DMA API is an option but
>> if there is a simpler mechanism available in VIRTIO itself we will be
>> happy to use it in Xen.
>
> OK, this challenged me to think harder.
>
> The queue itself is effectively a grant table (as long as you don't give
> the backend write access to it). The available ring tells you where the
> buffers are and whether they are readable or writable. The used ring
> tells you when they're used.
>
> However, performance would suck due to no caching: you'd end up doing a
> map and unmap on every packet. I'm assuming Xen currently avoids that
> somehow? Seems likely...
>
> On the other hand, if we wanted a more Xen-like setup, it would look
> like this:
>
> 1) Abstract away the "physical addresses" to "handles" in the standard,
>    and allow some platform-specific mapping setup and teardown.

At the risk of beating a dead horse, passing handles (grant references)
is going to be slow. virtio-blk would never be as fast as xen-blkif.

I don't want to see virtio adopt a bouncing mechanism like blkfront has
developed, especially in a way where every driver has to implement it on
its own.

I really think the best paths forward for virtio on Xen are either (1)
reject the memory isolation thing and leave things as is or (2) assume
bounce buffering at the transport layer (by using the PCI DMA API).

Regards,

Anthony Liguori

> 2) In Linux, implement a virtio DMA ops which handles the grant table
>    stuff for Xen (returning grant table ids + offset or something?),
>    noop for others. This would be a runtime thing.
>
> 3) In Linux, change the drivers to use this API.
>
> Now, Xen will not be able to use vhost to accelerate, but it doesn't now
> anyway.
>
> Am I missing anything?
>
> Cheers,
> Rusty.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  2014-02-21  3:00 ` Anthony Liguori
@ 2014-02-25  0:40   ` Rusty Russell
  [not found]         ` <87vbw458jr.fsf@rustcorp.com.au>
  1 sibling, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2014-02-25 0:40 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: virtio-dev, Wei Liu, Ian Campbell, Daniel Kiper,
      Stefano Stabellini, ian, sasha.levin, xen-devel

Anthony Liguori <anthony@codemonkey.ws> writes:
> On Thu, Feb 20, 2014 at 4:54 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>> On the other hand, if we wanted a more Xen-like setup, it would look
>> like this:
>>
>> 1) Abstract away the "physical addresses" to "handles" in the standard,
>>    and allow some platform-specific mapping setup and teardown.
>
> At the risk of beating a dead horse, passing handles (grant
> references) is going to be slow.
...
> I really think the best paths forward for virtio on Xen are either (1)
> reject the memory isolation thing and leave things as is or (2) assume
> bounce buffering at the transport layer (by using the PCI DMA API).

Xen can get memory isolation back by doing the copy in the hypervisor.
I've always liked that approach because it doesn't alter the guest
semantics, but it's very different from what Xen does now.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <87vbw458jr.fsf@rustcorp.com.au>]
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <87vbw458jr.fsf@rustcorp.com.au>
@ 2014-02-25 21:12   ` Konrad Rzeszutek Wilk
  2014-02-26  9:38   ` Ian Campbell
  1 sibling, 0 replies; 28+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-02-25 21:12 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini,
      Daniel Kiper, ian, Anthony Liguori, sasha.levin, xen-devel

On Tue, Feb 25, 2014 at 11:10:24AM +1030, Rusty Russell wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
> > On Thu, Feb 20, 2014 at 4:54 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> >> On the other hand, if we wanted a more Xen-like setup, it would look
> >> like this:
> >>
> >> 1) Abstract away the "physical addresses" to "handles" in the standard,
> >>    and allow some platform-specific mapping setup and teardown.
> >
> > At the risk of beating a dead horse, passing handles (grant
> > references) is going to be slow.
> ...
> > I really think the best paths forward for virtio on Xen are either (1)
> > reject the memory isolation thing and leave things as is or (2) assume
> > bounce buffering at the transport layer (by using the PCI DMA API).
>
> Xen can get memory isolation back by doing the copy in the hypervisor.
> I've always liked that approach because it doesn't alter the guest
> semantics, but it's very different from what Xen does now.

It could. But why do it? The backend can choose to do the copy as well,
and perhaps even do some translation of the payload as it sees fit.

Or it can map it - and if using DPDK for example - one has memory pages
shared between the domains all the time - where you just need to map once.

> Cheers,
> Rusty.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <87vbw458jr.fsf@rustcorp.com.au>
  2014-02-25 21:12 ` Konrad Rzeszutek Wilk
@ 2014-02-26  9:38 ` Ian Campbell
  1 sibling, 0 replies; 28+ messages in thread
From: Ian Campbell @ 2014-02-26 9:38 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, Wei Liu, Daniel Kiper, Stefano Stabellini, ian,
      Anthony Liguori, sasha.levin, xen-devel

On Tue, 2014-02-25 at 11:10 +1030, Rusty Russell wrote:
> Anthony Liguori <anthony@codemonkey.ws> writes:
> > On Thu, Feb 20, 2014 at 4:54 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
> >> On the other hand, if we wanted a more Xen-like setup, it would look
> >> like this:
> >>
> >> 1) Abstract away the "physical addresses" to "handles" in the standard,
> >>    and allow some platform-specific mapping setup and teardown.
> >
> > At the risk of beating a dead horse, passing handles (grant
> > references) is going to be slow.
> ...
> > I really think the best paths forward for virtio on Xen are either (1)
> > reject the memory isolation thing and leave things as is or (2) assume
> > bounce buffering at the transport layer (by using the PCI DMA API).
>
> Xen can get memory isolation back by doing the copy in the hypervisor.
> I've always liked that approach because it doesn't alter the guest
> semantics, but it's very different from what Xen does now.

Doing the copy in the hypervisor still uses grant references, since the
hypervisor needs to know what the source domain is permitting access to
for the target domain (or vice versa if you do the copy the other way),
and grant tables are the mechanism which achieves this.

See the already existing GNTTABOP_copy[0] for example; it is used in the
existing Xen PV driver pairs (e.g. network receive into domU).

Ian.

[0] http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,grant_table.h.html#EnumVal_GNTTABOP_copy

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <8761o99tft.fsf@rustcorp.com.au>
  2014-02-21  3:00 ` Anthony Liguori
@ 2014-02-21 10:21 ` Wei Liu
  2014-02-21 15:11 ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 28+ messages in thread
From: Wei Liu @ 2014-02-21 10:21 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Ian Campbell, Daniel Kiper,
      stefano.stabellini, ian, anthony, sasha.levin, xen-devel

On Fri, Feb 21, 2014 at 11:24:14AM +1030, Rusty Russell wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
> > Hey,
> >
> > On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
> >> Ian Campbell <Ian.Campbell@citrix.com> writes:
> >> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> >> >> For platforms using EPT, I don't think you want anything but guest
> >> >> addresses, do you?
> >> >
> >> > No, the arguments for preventing unfettered access by backends to
> >> > frontend RAM apply to EPT as well.
> >>
> >> I can see how you'd parse my sentence that way, I think, but the two
> >> are orthogonal.
> >>
> >> AFAICT your grant-table access restrictions are page granularity, though
> >> you don't use page-aligned data (eg. in xen-netfront). This level of
> >> access control is possible using the virtio ring too, but no one has
> >> implemented such a thing AFAIK.
> >
> > Could you say in short how it should be done? DMA API is an option but
> > if there is a simpler mechanism available in VIRTIO itself we will be
> > happy to use it in Xen.
>
> OK, this challenged me to think harder.
>
> The queue itself is effectively a grant table (as long as you don't give
> the backend write access to it). The available ring tells you where the
> buffers are and whether they are readable or writable. The used ring
> tells you when they're used.
>
> However, performance would suck due to no caching: you'd end up doing a
> map and unmap on every packet. I'm assuming Xen currently avoids that
> somehow? Seems likely...

If you're talking about Xen drivers in the Linux kernel... At least the
Xen network backend in mainline Linux uses copy instead of map.

Zoltan Kiss @ Citrix is working on a mapping network backend. He uses
batch unmap to avoid the performance penalty.

Wei.

> On the other hand, if we wanted a more Xen-like setup, it would look
> like this:
>
> 1) Abstract away the "physical addresses" to "handles" in the standard,
>    and allow some platform-specific mapping setup and teardown.
>
> 2) In Linux, implement a virtio DMA ops which handles the grant table
>    stuff for Xen (returning grant table ids + offset or something?),
>    noop for others. This would be a runtime thing.
>
> 3) In Linux, change the drivers to use this API.
>
> Now, Xen will not be able to use vhost to accelerate, but it doesn't now
> anyway.
>
> Am I missing anything?
>
> Cheers,
> Rusty.

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <8761o99tft.fsf@rustcorp.com.au>
  2014-02-21  3:00 ` Anthony Liguori
  2014-02-21 10:21 ` Wei Liu
@ 2014-02-21 15:11 ` Konrad Rzeszutek Wilk
  2014-03-03  5:52   ` Rusty Russell
  [not found]         ` <87ppm325i6.fsf@rustcorp.com.au>
  2 siblings, 2 replies; 28+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-02-21 15:11 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Ian Campbell, stefano.stabellini,
      Daniel Kiper, ian, anthony, sasha.levin, xen-devel

On Fri, Feb 21, 2014 at 11:24:14AM +1030, Rusty Russell wrote:
> Daniel Kiper <daniel.kiper@oracle.com> writes:
> > Hey,
> >
> > On Thu, Feb 20, 2014 at 06:18:00PM +1030, Rusty Russell wrote:
> >> Ian Campbell <Ian.Campbell@citrix.com> writes:
> >> > On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> >> >> For platforms using EPT, I don't think you want anything but guest
> >> >> addresses, do you?
> >> >
> >> > No, the arguments for preventing unfettered access by backends to
> >> > frontend RAM apply to EPT as well.
> >>
> >> I can see how you'd parse my sentence that way, I think, but the two
> >> are orthogonal.
> >>
> >> AFAICT your grant-table access restrictions are page granularity, though
> >> you don't use page-aligned data (eg. in xen-netfront). This level of
> >> access control is possible using the virtio ring too, but no one has
> >> implemented such a thing AFAIK.
> >
> > Could you say in short how it should be done? DMA API is an option but
> > if there is a simpler mechanism available in VIRTIO itself we will be
> > happy to use it in Xen.
>
> OK, this challenged me to think harder.
>
> The queue itself is effectively a grant table (as long as you don't give
> the backend write access to it). The available ring tells you where the
> buffers are and whether they are readable or writable. The used ring
> tells you when they're used.
>
> However, performance would suck due to no caching: you'd end up doing a
> map and unmap on every packet. I'm assuming Xen currently avoids that
> somehow? Seems likely...
>
> On the other hand, if we wanted a more Xen-like setup, it would look
> like this:
>
> 1) Abstract away the "physical addresses" to "handles" in the standard,
>    and allow some platform-specific mapping setup and teardown.

+1

> 2) In Linux, implement a virtio DMA ops which handles the grant table
>    stuff for Xen (returning grant table ids + offset or something?),
>    noop for others. This would be a runtime thing.

Or perhaps a KVM-specific DMA ops (which is a nop) and Xen ops.
Easy enough to implement.

> 3) In Linux, change the drivers to use this API.

+1

> Now, Xen will not be able to use vhost to accelerate, but it doesn't now
> anyway.

Correct. Though one could implement a ring-of-grant-entries system which
the frontend and backend share along with the hypervisor.

And when the backend tries to access said memory thinking it has mapped
the frontend (but it has not yet mapped this memory), it traps to the
hypervisor, which then does the mapping of the frontend pages for the
backend. Kind of a lazy-grant system.

Anyhow, all of that is just implementation details and hand-waving.

If we wanted, we could extend vhost so that when it plucks entries off
the virtq it calls a specific platform API. For KVM it would be all
nops. For Xen it would do a magic pony show or such <more hand-waving>.

> Am I missing anything?

On a bit different topic:

I am unclear about the asynchronous vs synchronous nature of virtio
configuration. Xen is all about XenBus, which is more of a callback
mechanism. Virtio does its stuff via MMIO and PCI, which are slow - but
do get you the values.

Can we somehow make it clear that the configuration setup can be
asynchronous? That would also mean that in the future configuration
changes (say, when migrating) can be conveyed to the virtio frontends
via an interrupt mechanism (or callback) if the new host has something
important to say?

> Cheers,
> Rusty.
>

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  2014-02-21 15:11 ` Konrad Rzeszutek Wilk
@ 2014-03-03  5:52   ` Rusty Russell
  [not found]         ` <87ppm325i6.fsf@rustcorp.com.au>
  1 sibling, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2014-03-03 5:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: virtio-dev, wei.liu2, Ian Campbell, stefano.stabellini,
      Daniel Kiper, ian, anthony, sasha.levin, xen-devel

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> On Fri, Feb 21, 2014 at 11:24:14AM +1030, Rusty Russell wrote:
>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>> 2) In Linux, implement a virtio DMA ops which handles the grant table
>>    stuff for Xen (returning grant table ids + offset or something?),
>>    noop for others. This would be a runtime thing.
>
> Or perhaps a KVM-specific DMA ops (which is a nop) and Xen ops.
> Easy enough to implement.

Indeed.

>> 3) In Linux, change the drivers to use this API.
>
> +1
>
>> Now, Xen will not be able to use vhost to accelerate, but it doesn't now
>> anyway.
>
> Correct. Though one could implement a ring-of-grant-entries system which
> the frontend and backend share along with the hypervisor.
>
> And when the backend tries to access said memory thinking it has mapped
> the frontend (but it has not yet mapped this memory), it traps to the
> hypervisor, which then does the mapping of the frontend pages for the
> backend. Kind of a lazy-grant system.
>
> Anyhow, all of that is just implementation details and hand-waving.

It's unmap which is hard. You can do partial protection with lazy
unmap, of course; it's a question of how strict you want your protection
to be.

> If we wanted, we could extend vhost so that when it plucks entries off
> the virtq it calls a specific platform API. For KVM it would be all
> nops. For Xen it would do a magic pony show or such <more hand-waving>.
>
>> Am I missing anything?
>
> On a bit different topic:
>
> I am unclear about the asynchronous vs synchronous nature of virtio
> configuration. Xen is all about XenBus, which is more of a callback
> mechanism. Virtio does its stuff via MMIO and PCI, which are slow - but
> do get you the values.
>
> Can we somehow make it clear that the configuration setup can be
> asynchronous? That would also mean that in the future configuration
> changes (say, when migrating) can be conveyed to the virtio frontends
> via an interrupt mechanism (or callback) if the new host has something
> important to say?

There are several options. One would be to add a XenBus transport for
virtio (we already have PCI, mmio and CCW).

But if you support PCI devices, you already deal with their synchronous
nature.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <87ppm325i6.fsf@rustcorp.com.au>]
* Re: [virtio-dev] Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <87ppm325i6.fsf@rustcorp.com.au>
@ 2014-03-04 23:16   ` Michael S. Tsirkin
  0 siblings, 0 replies; 28+ messages in thread
From: Michael S. Tsirkin @ 2014-03-04 23:16 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Ian Campbell, Daniel Kiper,
      stefano.stabellini, ian, anthony, sasha.levin, xen-devel

On Mon, Mar 03, 2014 at 04:22:33PM +1030, Rusty Russell wrote:
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:
> > On Fri, Feb 21, 2014 at 11:24:14AM +1030, Rusty Russell wrote:
> >> Daniel Kiper <daniel.kiper@oracle.com> writes:
> >> 2) In Linux, implement a virtio DMA ops which handles the grant table
> >>    stuff for Xen (returning grant table ids + offset or something?),
> >>    noop for others. This would be a runtime thing.
> >
> > Or perhaps a KVM-specific DMA ops (which is a nop) and Xen ops.
> > Easy enough to implement.
>
> Indeed.
>
> >> 3) In Linux, change the drivers to use this API.
> >
> > +1
> >
> >> Now, Xen will not be able to use vhost to accelerate, but it doesn't now
> >> anyway.
> >
> > Correct. Though one could implement a ring-of-grant-entries system which
> > the frontend and backend share along with the hypervisor.
> >
> > And when the backend tries to access said memory thinking it has mapped
> > the frontend (but it has not yet mapped this memory), it traps to the
> > hypervisor, which then does the mapping of the frontend pages for the
> > backend. Kind of a lazy-grant system.
> >
> > Anyhow, all of that is just implementation details and hand-waving.
>
> It's unmap which is hard. You can do partial protection with lazy
> unmap, of course; it's a question of how strict you want your protection
> to be.
>
> > If we wanted, we could extend vhost so that when it plucks entries off
> > the virtq it calls a specific platform API. For KVM it would be all
> > nops. For Xen it would do a magic pony show or such <more hand-waving>.
> >
> >> Am I missing anything?
> >
> > On a bit different topic:
> >
> > I am unclear about the asynchronous vs synchronous nature of virtio
> > configuration. Xen is all about XenBus, which is more of a callback
> > mechanism. Virtio does its stuff via MMIO and PCI, which are slow - but
> > do get you the values.
> >
> > Can we somehow make it clear that the configuration setup can be
> > asynchronous? That would also mean that in the future configuration
> > changes (say, when migrating) can be conveyed to the virtio frontends
> > via an interrupt mechanism (or callback) if the new host has something
> > important to say?
>
> There are several options. One would be to add a XenBus transport for
> virtio (we already have PCI, mmio and CCW).
>
> But if you support PCI devices, you already deal with their synchronous
> nature.
>
> Cheers,
> Rusty.
>

It might be possible to allow adding new features with an asynchronous
callback, e.g. after migration. But what happens if you migrate to a
different host that has fewer features? The device likely won't be able
to proceed until features are negotiated, so this makes it synchronous
again.

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: VIRTIO - compatibility with different virtualization solutions
  [not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
  2014-02-19  4:42 ` Anthony Liguori
  2014-02-19 10:09 ` Ian Campbell
@ 2014-02-19 10:11 ` Ian Campbell
  2 siblings, 0 replies; 28+ messages in thread
From: Ian Campbell @ 2014-02-19 10:11 UTC (permalink / raw)
  To: Rusty Russell
  Cc: virtio-dev, wei.liu2, Daniel Kiper, stefano.stabellini, ian,
      anthony, sasha.levin, xen-devel

On Wed, 2014-02-19 at 10:56 +1030, Rusty Russell wrote:
> Sorry for the delayed response, I was pondering... CC changed
> to virtio-dev.

Which apparently is subscribers only + discard as opposed to moderate,
so my previous post won't show up there.

Ian.

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Is: Wrap-up Was: VIRTIO - compatibility with different virtualization solutions
  2014-02-17 13:23 VIRTIO - compatibility with different virtualization solutions Daniel Kiper
  2014-02-19  0:26 ` Rusty Russell
  [not found]       ` <87vbwcaqxe.fsf@rustcorp.com.au>
@ 2014-03-10  7:54 ` Daniel Kiper
  [not found]       ` <20140310075423.GE31874@olila.local.net-space.pl>
  3 siblings, 0 replies; 28+ messages in thread
From: Daniel Kiper @ 2014-03-10 7:54 UTC (permalink / raw)
  To: advisory-board, virtio-dev, xen-devel
  Cc: wei.liu2, ian.campbell, rusty, andreslc, mst, ian, anthony,
      sasha.levin, stefano.stabellini

Hi,

After some email exchange it looks like there is no consensus on how
VIRTIO should work on Xen. There is some hand-waving and there are some
preliminary tests, but that is not sufficient to take a final decision
and put all the needed stuff into the upcoming VIRTIO specification
release. However, it is certain that any solution chosen in the near
future MUST not break the Xen security model, must have good performance,
and must integrate easily with currently existing solutions.
Additionally, it looks like some terms (e.g. physical address) in the
VIRTIO specification should be changed to something more platform
independent (e.g. handle), although such changes should not lead to
changes in currently existing VIRTIO implementations.

Taking the above into account, I believe more research and development
work is needed. I volunteer to lead this task. Hence I would like to
propose that the VIRTIO-46 task (Verify that VIRTIO devices could be
implemented in Xen environment) be opened and work on it be deferred
(from a formal point of view; that does not mean that work should
actually be on hold) until the next VIRTIO release. Additionally,
I would like to raise the VIRTIO on Xen issue during Xen Hackathon 2014
and in other venues, to discuss it with the other stakeholders of VIRTIO.

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <20140310075423.GE31874@olila.local.net-space.pl>]
* Re: Is: Wrap-up Was: VIRTIO - compatibility with different virtualization solutions [not found] ` <20140310075423.GE31874@olila.local.net-space.pl> @ 2014-03-10 11:19 ` Fabio Fantoni 2014-03-11 14:29 ` Ian Campbell 0 siblings, 1 reply; 28+ messages in thread From: Fabio Fantoni @ 2014-03-10 11:19 UTC (permalink / raw) To: Daniel Kiper, advisory-board, virtio-dev, xen-devel Cc: wei.liu2, ian.campbell, mst, rusty, andreslc, stefano.stabellini, ian, anthony, sasha.levin Il 10/03/2014 08:54, Daniel Kiper ha scritto: > Hi, > > After some email exchange it looks that there is no consensus how VRITIO > should work on Xen. There is some hand waiving and some preliminary tests > but it is not sufficient to take final decision and put all needed stuff > into upcoming VIRTIO specification release. However, it is certain that > any solution which will be chosen in upcoming future MUST not break Xen > security model, has a good performance and easily integrate with currently > existing solutions. Additionally, it looks that that some terms (e.g. > physical address) in VIRTIO specification shall be changed to something > which is more platform independent (e.g. handle). Although such changes > should not lead to changes in currently existing VIRTIO implementations. > > Taking into account above mentioned things I state that more research > and development work is needed. I volunteer to lead this task. Hence > I would like to propose that VIRTIO-46 task (Verify that VIRTIO devices > could be implemented in Xen environment) should be opened and work on > it should be deferred (from formal POV but that does not mean that work > should be on hold in real) until next VIRTIO release. Additionally, > I would like to rise VIRTIO on Xen issue during Xen Hackathon 2014 and > in other avenues to discuss this with other stake holders of VIRTIO. > > Daniel Thanks for your work about virtio. For now xen uses virtio only with spice vdagent on hvm domUs (that use virtio-serial). 
The xl implementation of virtio net and disks is experimental and old:
http://wiki.xen.org/wiki/Virtio_On_Xen

Last year I did some tests with virtio-net and I found one regression with
qemu 1.6. In case they are useful, some details from my old mails about it:

> Based on my tests of virtio:
> - virtio-serial seems to work out of the box with Windows domUs, and also
>   with the Xen PV drivers. On Linux domUs with an old kernel (tested
>   2.6.32) it also works out of the box, but with newer kernels (tested
>   >=3.2) it requires pci=nomsi to work correctly, and it also works with
>   the Xen PVHVM drivers. For now I have not found a solution for the MSI
>   problem; there are some posts about it.
> - virtio-net was working out of the box, but with recent qemu versions it
>   is broken due to a qemu regression. I have narrowed it down with bisect
>   (one commit between 4 Jul 2013 and 22 Jul 2013), but I was unable to
>   find the exact commit of the regression because there are other critical
>   problems with Xen in that range.
> - I have not tested virtio-disk and I do not know whether it works with
>   recent Xen and qemu versions.

About the MSI problem with virtio, Xen and recent Linux kernels, see here:
http://lists.xen.org/archives/html/xen-devel/2014-02/msg00192.html

Recently I have not had time to test with qemu 2. Certainly in the next few
months I will do further tests. Let me know when there are particular
changes and I will test and try to help if possible.

Thanks for any reply.

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: Is: Wrap-up Was: VIRTIO - compatibility with different virtualization solutions
  2014-03-10 11:19 ` Fabio Fantoni
@ 2014-03-11 14:29   ` Ian Campbell
  0 siblings, 0 replies; 28+ messages in thread
From: Ian Campbell @ 2014-03-11 14:29 UTC (permalink / raw)
  To: Fabio Fantoni
  Cc: virtio-dev, wei.liu2, mst, rusty, andreslc, Daniel Kiper,
	stefano.stabellini, ian, anthony, sasha.levin, xen-devel,
	advisory-board

On Mon, 2014-03-10 at 12:19 +0100, Fabio Fantoni wrote:
> For now xen uses virtio only with ...

I think it is important to separate the "virtio-pci happens to mostly work
for Xen HVM guests" case (which is what you are referring to here) from the
"virtio-xen using the Xen PV interfaces, preserving the useful security and
isolation properties of Xen" case (which is what is being discussed here and
is a topic for future standardisation).

Ian.

^ permalink raw reply	[flat|nested] 28+ messages in thread
[parent not found: <mailman.9276.1392977438.24322.xen-devel@lists.xen.org>]
* Re: VIRTIO - compatibility with different virtualization solutions
  [not found] <mailman.9276.1392977438.24322.xen-devel@lists.xen.org>
@ 2014-02-21 16:41 ` Andres Lagar-Cavilla
  0 siblings, 0 replies; 28+ messages in thread
From: Andres Lagar-Cavilla @ 2014-02-21 16:41 UTC (permalink / raw)
  To: xen-devel
  Cc: virtio-dev, Wei Liu, Ian Campbell, Stefano Stabellini,
	Rusty Russell, Daniel Kiper, ian, Anthony Liguori, sasha.levin

> On Thu, Feb 20, 2014 at 06:50:59PM -0800, Anthony Liguori wrote:
>> On Wed, Feb 19, 2014 at 5:31 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>>> Anthony Liguori <anthony@codemonkey.ws> writes:
>>>> On Tue, Feb 18, 2014 at 4:26 PM, Rusty Russell <rusty@au1.ibm.com> wrote:
>>>>> Daniel Kiper <daniel.kiper@oracle.com> writes:
>>>>>> Hi,
>>>>>>
>>>>>> Below you can find a summary of work regarding VIRTIO compatibility
>>>>>> with different virtualization solutions. It was done mainly from the
>>>>>> Xen point of view, but the results are quite generic and can be
>>>>>> applied to a wide spectrum of virtualization platforms.
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> Sorry for the delayed response, I was pondering... CC changed
>>>>> to virtio-dev.
>>>>>
>>>>> From a standard POV: It's possible to abstract out where we use
>>>>> 'physical address' for 'address handle'. It's also possible to define
>>>>> this per-platform (ie. Xen-PV vs everyone else). This is sane, since
>>>>> Xen-PV is a distinct platform from x86.
>>>>
>>>> I'll go even further and say that "address handle" doesn't make sense
>>>> either.
>>>
>>> I was trying to come up with a unique term, I wasn't trying to define
>>> semantics :)
>>
>> Understood, that wasn't really directed at you.
>>
>>> There are three debates here now: (1) what should the standard say,
>>
>> The standard should say "physical address".
>>
>>> (2) how would Linux implement it,
>>
>> Linux should use the PCI DMA API.
>>
>>> and (3) should we use each platform's PCI
>>> IOMMU.
>>
>> Just like any other PCI device :-)
>>
>>>> Just using grant table references is not enough to make virtio work
>>>> well under Xen. You really need to use bounce buffers ala persistent
>>>> grants.
>>>
>>> Wait, if you're using bounce buffers, you didn't make it "work well"!
>>
>> Preaching to the choir man... but bounce buffering is proven to be
>> faster than doing grant mappings on every request. xen-blk does
>> bounce buffering by default and I suspect netfront is heading in that
>> direction soon.
>
> FWIW Annie Li @ Oracle once implemented a persistent map prototype for
> netfront and the result was not satisfying.
>
>> It would be a lot easier to simply have a global pool of grant tables
>> that effectively becomes the DMA pool. Then the DMA API can bounce
>> into that pool and those addresses can be placed on the ring.
>>
>> It's a little different for Xen because now the backends have to deal
>> with physical addresses but the concept is still the same.
>
> How would you apply this to Xen's security model? How can the hypervisor
> effectively enforce access control? "Handle" and "physical address" are
> essentially not the same concept, otherwise you wouldn't have proposed
> this change. Not saying I'm against this change, just that this
> description is too vague for me to understand the bigger picture.

I might be missing something trivial, but the burden of enforcing
visibility of memory only for handles falls on the hypervisor.

Taking KVM for example, the whole RAM of a guest is a vma in the mm of the
faulting qemu process. That's KVM's way of doing things. "Handles" could be
pfns for all that model cares, and translation+mapping from handles to
actual guest RAM addresses is trivially O(1). And there is no guest control
over RAM visibility, and that suits KVM fine.

Xen, on the other hand, can encode a 64 bit grant handle in the "__u64
addr" field of a virtio descriptor.
The negotiation happens up front: the flags field is set to signal that the
guest is encoding handles in there. Once the Xen virtio backend gets that
descriptor out of the ring, what is left is not all that different from
what netback/blkback/gntdev do today with a ring request.

I'm obviously glossing over serious details (e.g. negotiation of what the
u64 addr means), but what I'm getting at is that I fail to understand why
whole-RAM visibility is a requirement for virtio. It seems to me to be a
requirement for KVM and other hypervisors, while virtio is a transport and
sync mechanism for high(er) level IO descriptors.

Can someone please clarify why "under Xen, you really need to use bounce
buffers ala persistent grants"? Is that a performance need, to avoid
repeated mapping and TLB flushing on the backend side? Granted. But why
would it be a correctness need? Guest-side grant table work requires no
hypercalls in the data path.

If I am rewinding the conversation, feel free to ignore, but I'm not
feeling a lot of clarity in the dialogue right now.

Thanks
Andres

> But a downside for sure is that if we go with this change we then have
> to maintain two different paths in the backend. However small the
> difference is, it is still a burden.
>
> Wei.

^ permalink raw reply	[flat|nested] 28+ messages in thread
end of thread, other threads:[~2014-03-11 14:29 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-17 13:23 VIRTIO - compatibility with different virtualization solutions Daniel Kiper
2014-02-19  0:26 ` Rusty Russell
     [not found] ` <87vbwcaqxe.fsf@rustcorp.com.au>
2014-02-19  4:42   ` Anthony Liguori
2014-02-20  1:31     ` Rusty Russell
     [not found]     ` <87ha7ubme0.fsf@rustcorp.com.au>
2014-02-20 12:28       ` Stefano Stabellini
2014-02-20 20:28         ` Daniel Kiper
2014-02-21  2:50         ` Anthony Liguori
2014-02-21 10:05           ` Wei Liu
2014-02-21 15:01             ` Konrad Rzeszutek Wilk
2014-02-25  0:33       ` Rusty Russell
     [not found]       ` <87y51058vf.fsf@rustcorp.com.au>
2014-02-25 21:09         ` Konrad Rzeszutek Wilk
2014-02-19 10:09 ` Ian Campbell
2014-02-20  7:48   ` Rusty Russell
     [not found]   ` <8761oab4y7.fsf@rustcorp.com.au>
2014-02-20 20:37     ` Daniel Kiper
     [not found]     ` <20140220203704.GG3441@olila.local.net-space.pl>
2014-02-21  0:54       ` [virtio-dev] " Rusty Russell
     [not found]       ` <8761o99tft.fsf@rustcorp.com.au>
2014-02-21  3:00         ` Anthony Liguori
2014-02-25  0:40           ` Rusty Russell
     [not found]           ` <87vbw458jr.fsf@rustcorp.com.au>
2014-02-25 21:12             ` Konrad Rzeszutek Wilk
2014-02-26  9:38               ` Ian Campbell
2014-02-21 10:21         ` Wei Liu
2014-02-21 15:11           ` Konrad Rzeszutek Wilk
2014-03-03  5:52             ` Rusty Russell
     [not found]             ` <87ppm325i6.fsf@rustcorp.com.au>
2014-03-04 23:16               ` Michael S. Tsirkin
2014-02-19 10:11 ` Ian Campbell
2014-03-10  7:54 ` Is: Wrap-up Was: VIRTIO - compatibility with different virtualization solutions Daniel Kiper
     [not found] ` <20140310075423.GE31874@olila.local.net-space.pl>
2014-03-10 11:19   ` Fabio Fantoni
2014-03-11 14:29     ` Ian Campbell
     [not found] <mailman.9276.1392977438.24322.xen-devel@lists.xen.org>
2014-02-21 16:41 ` Andres Lagar-Cavilla