* [Hackathon minutes] PV network improvements
@ 2013-05-20 14:08 Stefano Stabellini
  2013-05-20 14:49 ` George Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Stefano Stabellini @ 2013-05-20 14:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Stefano Stabellini

Hi all,
these are Konrad's and my notes (mostly Konrad's) on possible
improvements of the PV network protocol, taken at the Hackathon.


A) Network bandwidth: multipage rings
The maximum amount of outstanding data it can have is 896KB (64KB of
data uses 18 slots out of 256; 256 / 18 = 14 packets, 14 * 64KB =
896KB).  This can be expanded by using multiple pages for the ring.
This would benefit NFS and bulk data transfer (such as netperf data).
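
As a quick sanity check of those numbers, a small user-space sketch
(the 256-slot ring and 18 slots per 64KB packet are the figures from
above; the rest is illustrative):

    #include <stdio.h>

    /* Illustrative only: assumes 256 slots per ring page and 18 slots
     * per 64KB packet, as in the note above. */
    int main(void)
    {
        const unsigned slots_per_page = 256;
        const unsigned slots_per_pkt  = 18;
        const unsigned pkt_kb         = 64;
        unsigned pages;

        for (pages = 1; pages <= 4; pages++) {
            unsigned pkts = (slots_per_page * pages) / slots_per_pkt;
            printf("%u ring page(s): %u packets, %u KB outstanding\n",
                   pages, pkts, pkts * pkt_kb);
        }
        return 0;
    }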


B) Producer and consumer indexes are on the same cache line
On present hardware that means the reader and writer will compete for
the same cacheline, causing it to ping-pong between sockets.
This can be solved by having a feature-split-indexes (or a better name)
where the req_prod and req_event tuple is kept separate from the
rsp_prod and rsp_event tuple. This would entail using 128 bytes at the
start of the ring - one cacheline for each tuple.
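
A minimal sketch of such a layout (the struct and names are
illustrative, not an agreed interface; 64 bytes is the usual x86
cacheline size):

    #include <stdint.h>

    /* Hypothetical feature-split-indexes header: the request-side tuple
     * lives on one 64-byte cacheline and the response-side tuple on
     * another, instead of all four indexes sharing a single line. */
    struct split_idx_ring_hdr {
        struct {
            uint32_t req_prod;
            uint32_t req_event;
        } req __attribute__((aligned(64)));   /* request-side indexes  */
        struct {
            uint32_t rsp_prod;
            uint32_t rsp_event;
        } rsp __attribute__((aligned(64)));   /* response-side indexes */
        /* 128 bytes used before the request/response slots start. */
    };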


C) Cache alignment of requests
The fix is to make the request structures cache-aligned. For networking
that means making the request 16 bytes; for block, 64 bytes.
Since this does not shrink the structure but only expands it, it could
be called feature-align-slot.
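
For example, a padded TX request slot might look like the following
(field names follow the existing netif_tx_request; the 16-byte padding
is the illustrative part):

    #include <stdint.h>

    /* Hypothetical feature-align-slot variant of the 12-byte TX request,
     * padded to 16 bytes so slots divide a 64-byte cacheline evenly. */
    struct netif_tx_request_aligned {
        uint32_t gref;      /* grant reference for the packet page  */
        uint16_t offset;    /* offset within the granted page       */
        uint16_t flags;     /* NETTXF_* flags                       */
        uint16_t id;        /* echoed back in the response          */
        uint16_t size;      /* packet/fragment size in bytes        */
        uint32_t pad;       /* 12 -> 16 bytes                       */
    };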


E) Multiqueue (request-feature-multiqueue)
It means creating many TX and RX rings for each vif.
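
A rough sketch of the shape this could take (all names are
illustrative, not a proposed interface):

    /* Illustrative only: each vif carries several independent queues,
     * each with its own TX/RX ring and event channel(s), so several
     * vCPUs can push traffic without serialising on one ring. */
    struct vif_queue {
        void *tx_ring_page;      /* grant-shared TX ring for this queue */
        void *rx_ring_page;      /* grant-shared RX ring for this queue */
        unsigned int tx_evtchn;  /* per-queue TX event channel */
        unsigned int rx_evtchn;  /* per-queue RX event channel */
    };

    struct vif {
        struct vif_queue *queues;   /* one entry per negotiated queue */
        unsigned int num_queues;    /* agreed via xenstore at connect time */
    };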


F) Don't gnt_copy all of the requests
Instead, don't touch them and let the Xen IOMMU create the appropriate
entries. This would require the DMA API in dom0 to be aware of whether
the grant mapping has been done, and if not (i.e. the page is FOREIGN,
with no m2p_override), to issue a hypercall telling the hypervisor that
this grant is going to be used by a specific PCI device. This would
create the IOMMU entry in Xen.
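
A pseudo-code sketch of what the dom0 side might look like; the
structure and grant-table sub-op below are hypothetical, they do not
exist today:

    /* Hypothetical - neither this structure nor the GNTTABOP_iommu_map
     * sub-op exist; they are sketched here only to illustrate the idea. */
    struct gnttab_iommu_map {
        grant_ref_t ref;      /* grant handed to us by the frontend  */
        domid_t     domid;    /* domain owning the granted page      */
        uint32_t    sbdf;     /* PCI device that will DMA to/from it */
    };

    static int tell_xen_about_dma(grant_ref_t gref, domid_t granter,
                                  uint32_t pci_sbdf)
    {
        struct gnttab_iommu_map op = {
            .ref = gref, .domid = granter, .sbdf = pci_sbdf,
        };

        /* Ask Xen to install the IOMMU entry for this grant instead of
         * dom0 mapping or copying the page itself. */
        return HYPERVISOR_grant_table_op(GNTTABOP_iommu_map, &op, 1);
    }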


G) On TX side, do persistent grant mapping
This would only be done on the frontend -> backend path.  That means
that we could exhaust the initial domain's memory.
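
A backend-side sketch of such a persistent pool (the linear table and
explicit cap are illustrative assumptions):

    /* Sketch only: map each TX grant once, then reuse the mapping.  The
     * pool must be capped, since every entry pins a page of the initial
     * domain's address space/memory - the exhaustion risk noted above. */
    struct persistent_gnt {
        grant_ref_t    gref;     /* frontend's grant reference */
        grant_handle_t handle;   /* for the eventual unmap     */
        void          *vaddr;    /* backend-side mapping       */
    };

    static void *persistent_lookup(struct persistent_gnt *pool,
                                   unsigned int nr, grant_ref_t gref)
    {
        unsigned int i;

        for (i = 0; i < nr; i++)
            if (pool[i].gref == gref)
                return pool[i].vaddr;   /* already mapped: reuse it */
        return NULL;  /* caller maps it (gnttab_map_refs) and adds it,
                         unless the per-vif cap has been reached */
    }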


H) Affinity of the frontend and backend being on the same NUMA node
This touches upon the discussion about NUMA and having PV guests be
aware of memory layout. It also means that each backend kthread needs to
be on a different NUMA node.


I) Separate request and response rings for TX and RX


J) Map the whole physical memory of the machine in dom0
If mapping/unmapping or copying slows us down, could we just keep the
whole physical memory of the machine mapped in dom0 (with corresponding
IOMMU entries)?
At that point the frontend could just pass mfn numbers to the backend,
and the backend would already have them mapped.
From a security perspective it doesn't change anything when running
the backend in dom0, because dom0 is already capable of mapping random
pages of any guests. QEMU instances do that all the time.
But it would take away one of the benefits of deploying driver domains:
we wouldn't be able to run the backends at a lower privilege level.
However it might still be worth considering as an option? The backend is
still trusted and protected from the frontend, but the frontend wouldn't
be protected from the backend.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 14:08 [Hackathon minutes] PV network improvements Stefano Stabellini
@ 2013-05-20 14:49 ` George Dunlap
  2013-05-20 18:33   ` Wei Liu
  2013-05-20 18:31 ` Wei Liu
  2013-05-20 19:36 ` annie li
  2 siblings, 1 reply; 20+ messages in thread
From: George Dunlap @ 2013-05-20 14:49 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

On Mon, May 20, 2013 at 3:08 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> Hi all,
> these are Konrad's and my notes (mostly Konrad's) on possible
> improvements of the PV network protocol, taken at the Hackathon.
>
>
> A) Network bandwidth: multipage rings
> The max outstanding amount of data the it can have is 898kB (64K of
> data use 18 slot, out of 256. 256 / 18 = 14, 14 * 64KB).  This can be
> expanded by having multi-page to expand the ring. This would benefit NFS
> and bulk data transfer (such as netperf data).
>
>
> B) Producer and consumer index is on the same cache line
> In present hardware that means the reader and writer will compete for
> the same cacheline causing a ping-pong between sockets.
> This can be solved by having a feature-split-indexes (or better name)
> where the req_prod and req_event as a tuple are different from the
> rsp_prod and rsp_prod. This would entail using 128bytes of the ring at
> the start - each cacheline for each tuple.
>
>
> C)  Cache alignment of requests
> The fix is to make the request structures more cache-aligned. For
> networking that means making it 16 bytes and block 64 bytes.
> Since it does not shrink the structure but just expands it, could be
> called feature-align-slot.
>
>
> E) Multiqueue (request-feature-multiqueue)
> It means creating many TX and RX rings for each vif.
>
>
> F) don't gnt_copy all of the requests
> Instead don't touch them and let the Xen IOMMU create appropriate
> entries. This would require the DMA API in dom0 to be aware whether the
> grant has been done and if not (so FOREIGN, aka no m2p_override), then
> do the hypercall to tell the hypervisor that this grant is going to be
> used by a specific PCI device. This would create the IOMMU entry in Xen.
>
>
> G) On TX side, do persistent grant mapping
> This would only be done from frontend -> backend path.  That means that
> we could exhaust initial domains memory.
>
>
> H) Affinity of the frontend and backend being on the same NUMA node
> This touches upon the discussion about NUMA and having PV guests be
> aware of memory layout. It also means that each backend kthread needs to
> be on a different NUMA node.
>
>
> I) separate request and response rings for TX and RX
>
>
> J) Map the whole physical memory of the machine in dom0
> If mapping/unmapping or copying slows us down, could we just keep the
> whole physical memory of the machine mapped in dom0 (with corresponding
> IOMMU entries)?
> At that point the frontend could just pass mfn numbers to the backend,
> and the backend would already have them mapped.
> From a security perspective it doesn't change anything when running
> the backend in dom0, because dom0 is already capable of mapping random
> pages of any guests. QEMU instances do that all the time.
> But it would take away one of the benefits of deploying driver domains:
> we wouldn't be able to run the backends at a lower privilege level.
> However it might still be worth considering as an option? The backend is
> still trusted and protected from the frontend, but the frontend wouldn't
> be protected from the backend.

What's missing from this is my side of the discussion:

I was saying that if TLB flushes from grant-unmap are indeed the
problem, then maybe we could put the *front-end* in charge of
requesting a TLB flush for its pages.  The strict TLB flushing is to
protect a frontend from rogue back-ends reading sensitive data;
if the front-end were willing to just not use the pages for a short
amount of time, and issue a flush say every second or so, that would
reduce the TLB flushes greatly while maintaining the safety advantages
of driver domains.
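
A very rough sketch of what that could look like on the frontend side
(the flush hypercall and the helpers below are hypothetical, and the
one-second period is just the example figure above):

    /* Hypothetical sketch.  Pages the backend has finished with are
     * quarantined rather than reused; a periodic flush makes them all
     * safe again in one batch instead of flushing per unmap. */
    static LIST_HEAD(quarantine);   /* pages recently handed to the backend */

    static void netfront_retire_page(struct page *page)
    {
        list_add_tail(&page->lru, &quarantine);    /* don't reuse it yet */
    }

    /* Run from a timer/workqueue, e.g. once a second. */
    static void netfront_periodic_flush(void)
    {
        xen_flush_stale_grant_mappings();    /* hypothetical hypercall
                                                wrapper: "my old grants are
                                                unmapped everywhere now"  */
        netfront_release_pages(&quarantine); /* hypothetical helper: hand
                                                the quarantined pages back */
    }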

 -George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 14:08 [Hackathon minutes] PV network improvements Stefano Stabellini
  2013-05-20 14:49 ` George Dunlap
@ 2013-05-20 18:31 ` Wei Liu
  2013-05-21  8:31   ` Ian Campbell
  2013-05-21  9:26   ` Tim Deegan
  2013-05-20 19:36 ` annie li
  2 siblings, 2 replies; 20+ messages in thread
From: Wei Liu @ 2013-05-20 18:31 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, wei.liu2

On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> Hi all,
> these are Konrad's and my notes (mostly Konrad's) on possible
> improvements of the PV network protocol, taken at the Hackathon.
> 

Just for completeness, these items are future work items. I'm now
upstreaming my queues to lay a baseline for these items, which include:

1. split event channels support (generally useful)
2. netback global page pool (prerequisite for 1:1 model)
3. kthread + NAPI 1:1 model (prerequisite for multiqueue)

> 
> A) Network bandwidth: multipage rings
> The max outstanding amount of data the it can have is 898kB (64K of
> data use 18 slot, out of 256. 256 / 18 = 14, 14 * 64KB).  This can be
> expanded by having multi-page to expand the ring. This would benefit NFS
> and bulk data transfer (such as netperf data).
> 

This is in my queue as well. It's a generic change in the xenbus
interface which can benefit not only network but also block devices.

> 
[...]
> J) Map the whole physical memory of the machine in dom0
> If mapping/unmapping or copying slows us down, could we just keep the
> whole physical memory of the machine mapped in dom0 (with corresponding
> IOMMU entries)?
> At that point the frontend could just pass mfn numbers to the backend,
> and the backend would already have them mapped.
> From a security perspective it doesn't change anything when running
> the backend in dom0, because dom0 is already capable of mapping random
> pages of any guests. QEMU instances do that all the time.
> But it would take away one of the benefits of deploying driver domains:
> we wouldn't be able to run the backends at a lower privilege level.
> However it might still be worth considering as an option? The backend is
> still trusted and protected from the frontend, but the frontend wouldn't
> be protected from the backend.
> 

I think Dom0 mapping all machine memory is a good starting point. As for
the driver domain, can we not have a driver domain map all of its
target's machine memory? What's the security implication here?


Wei.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 14:49 ` George Dunlap
@ 2013-05-20 18:33   ` Wei Liu
  2013-05-21  8:22     ` Ian Campbell
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Liu @ 2013-05-20 18:33 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel, wei.liu2, Stefano Stabellini

On Mon, May 20, 2013 at 03:49:32PM +0100, George Dunlap wrote:
[...]
> > J) Map the whole physical memory of the machine in dom0
> > If mapping/unmapping or copying slows us down, could we just keep the
> > whole physical memory of the machine mapped in dom0 (with corresponding
> > IOMMU entries)?
> > At that point the frontend could just pass mfn numbers to the backend,
> > and the backend would already have them mapped.
> > From a security perspective it doesn't change anything when running
> > the backend in dom0, because dom0 is already capable of mapping random
> > pages of any guests. QEMU instances do that all the time.
> > But it would take away one of the benefits of deploying driver domains:
> > we wouldn't be able to run the backends at a lower privilege level.
> > However it might still be worth considering as an option? The backend is
> > still trusted and protected from the frontend, but the frontend wouldn't
> > be protected from the backend.
> 
> What's missing from this was my side of the discussion:
> 
> I was saying that if TLB flushes from grant-unmap is indeed the
> problem, then maybe we could have the *front-end* in charge of
> requesting a TLB flush for its pages.  The strict TLB flushing is to
> protect a frontend from rogue back-ends from reading sensitive data;
> if the front-end were willing to just not use the pages for a short
> amount of time, and issue a flush say every second or so, that would
> reduce the TLB flushes greatly while maintaining the safety advantages
> of driver domains.
> 

I'm not sure I get what you mean here. Are you saying DomU flushes
Dom0's TLB entries?


Wei.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 14:08 [Hackathon minutes] PV network improvements Stefano Stabellini
  2013-05-20 14:49 ` George Dunlap
  2013-05-20 18:31 ` Wei Liu
@ 2013-05-20 19:36 ` annie li
  2 siblings, 0 replies; 20+ messages in thread
From: annie li @ 2013-05-20 19:36 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel


On 2013-5-20 10:08, Stefano Stabellini wrote:
> Hi all,
> these are Konrad's and my notes (mostly Konrad's) on possible
> improvements of the PV network protocol, taken at the Hackathon.
>
> [...]
>
> G) On TX side, do persistent grant mapping
> This would only be done from frontend -> backend path.  That means that
> we could exhaust initial domains memory.

I did some persistent grant mapping patches on both TX and RX sides a
while ago, and could keep TX persistent and optimize it.

Thanks
Annie

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 18:33   ` Wei Liu
@ 2013-05-21  8:22     ` Ian Campbell
  2013-05-21  8:31       ` George Dunlap
  0 siblings, 1 reply; 20+ messages in thread
From: Ian Campbell @ 2013-05-21  8:22 UTC (permalink / raw)
  To: Wei Liu; +Cc: George Dunlap, xen-devel, Stefano Stabellini

On Mon, 2013-05-20 at 19:33 +0100, Wei Liu wrote:
> On Mon, May 20, 2013 at 03:49:32PM +0100, George Dunlap wrote:
> [...]
> > > J) Map the whole physical memory of the machine in dom0
> > > If mapping/unmapping or copying slows us down, could we just keep the
> > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > IOMMU entries)?
> > > At that point the frontend could just pass mfn numbers to the backend,
> > > and the backend would already have them mapped.
> > > From a security perspective it doesn't change anything when running
> > > the backend in dom0, because dom0 is already capable of mapping random
> > > pages of any guests. QEMU instances do that all the time.
> > > But it would take away one of the benefits of deploying driver domains:
> > > we wouldn't be able to run the backends at a lower privilege level.
> > > However it might still be worth considering as an option? The backend is
> > > still trusted and protected from the frontend, but the frontend wouldn't
> > > be protected from the backend.
> > 
> > What's missing from this was my side of the discussion:
> > 
> > I was saying that if TLB flushes from grant-unmap is indeed the
> > problem, then maybe we could have the *front-end* in charge of
> > requesting a TLB flush for its pages.  The strict TLB flushing is to
> > protect a frontend from rogue back-ends from reading sensitive data;
> > if the front-end were willing to just not use the pages for a short
> > amount of time, and issue a flush say every second or so, that would
> > reduce the TLB flushes greatly while maintaining the safety advantages
> > of driver domains.
> > 
> 
> I'm not sure I get what you mean here. Are you saying DomU flushes
> Dom0's TLB entries?

The gnt_unmap made by dom0 needs to flush the TLB of any physical
processor which may have seen the mapping, which means approximately all
dom0 vcpus.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21  8:22     ` Ian Campbell
@ 2013-05-21  8:31       ` George Dunlap
  0 siblings, 0 replies; 20+ messages in thread
From: George Dunlap @ 2013-05-21  8:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Wei Liu, Stefano Stabellini

On 05/21/2013 09:22 AM, Ian Campbell wrote:
> On Mon, 2013-05-20 at 19:33 +0100, Wei Liu wrote:
>> On Mon, May 20, 2013 at 03:49:32PM +0100, George Dunlap wrote:
>> [...]
>>>> J) Map the whole physical memory of the machine in dom0
>>>> If mapping/unmapping or copying slows us down, could we just keep the
>>>> whole physical memory of the machine mapped in dom0 (with corresponding
>>>> IOMMU entries)?
>>>> At that point the frontend could just pass mfn numbers to the backend,
>>>> and the backend would already have them mapped.
>>>> From a security perspective it doesn't change anything when running
>>>> the backend in dom0, because dom0 is already capable of mapping random
>>>> pages of any guests. QEMU instances do that all the time.
>>>> But it would take away one of the benefits of deploying driver domains:
>>>> we wouldn't be able to run the backends at a lower privilege level.
>>>> However it might still be worth considering as an option? The backend is
>>>> still trusted and protected from the frontend, but the frontend wouldn't
>>>> be protected from the backend.
>>>
>>> What's missing from this was my side of the discussion:
>>>
>>> I was saying that if TLB flushes from grant-unmap is indeed the
>>> problem, then maybe we could have the *front-end* in charge of
>>> requesting a TLB flush for its pages.  The strict TLB flushing is to
>>> protect a frontend from rogue back-ends from reading sensitive data;
>>> if the front-end were willing to just not use the pages for a short
>>> amount of time, and issue a flush say every second or so, that would
>>> reduce the TLB flushes greatly while maintaining the safety advantages
>>> of driver domains.
>>>
>>
>> I'm not sure I get what you mean here. Are you saying DomU flushes
>> Dom0's TLB entries?
>
> The gnt_unmap made by dom0 needs to flush the TLB of any physical
> processor which may have seen the mapping, which means approximately all
> dom0 vcpus.

That's what I was getting at.  It's Xen that does any actual TLB
flushes, and for now the "promise" to the front-end is that the page is
safe from the backend* after the transaction is done.  But it would be
nicer if we could batch these flushes to happen once every few hundred
milliseconds, or even once a second.  If we allowed the front-end to
opt into a new interface, which said "the page is safe from the backend
once you have made this hypercall", then guests could choose the
"window" size based on their own parameters.

* Remember that the point of grant maps isn't to allow *dom0* access to 
the guests; dom0 already has all the access it needs.  It's to allow 
driver domains access to the guests.

  -George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 18:31 ` Wei Liu
@ 2013-05-21  8:31   ` Ian Campbell
  2013-05-21  9:26   ` Tim Deegan
  1 sibling, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2013-05-21  8:31 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Stefano Stabellini

On Mon, 2013-05-20 at 19:31 +0100, Wei Liu wrote:
> On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> [...]
> > J) Map the whole physical memory of the machine in dom0
> > If mapping/unmapping or copying slows us down, could we just keep the
> > whole physical memory of the machine mapped in dom0 (with corresponding
> > IOMMU entries)?
> > At that point the frontend could just pass mfn numbers to the backend,
> > and the backend would already have them mapped.
> > From a security perspective it doesn't change anything when running
> > the backend in dom0, because dom0 is already capable of mapping random
> > pages of any guests. QEMU instances do that all the time.

Actually there are mechanisms in place to remove this privilege from
dom0, specifically there is an XSM class (terminology?) for
non-migratable domains which effectively equates to exactly this
restriction. Of course you need stub qemu too.

> > But it would take away one of the benefits of deploying driver domains:
> > we wouldn't be able to run the backends at a lower privilege level.
> > However it might still be worth considering as an option? The backend is
> > still trusted and protected from the frontend, but the frontend wouldn't
> > be protected from the backend.
> > 
> 
> I think Dom0 mapping all machine memory is a good starting point. As for
> the driver domain, can we not have a driver domain mapped all of its
> target's machine memory? What's the security implication here?

It gives the driver domain an enormous amount of privilege which it
doesn't require and which it could use to compromise the integrity of
the system (i.e. to snoop any guest's memory and extract "secrets"). It
reduces our security/isolation story to "effectively equivalent to
KVM", and this isolation is one of the big selling points
for Xen. I don't think we should go down this path either for dom0 or
driver domains and I am absolutely positive that there are other
approaches we should be investigating before we even start to consider
it.

George's idea of not flushing at unmap time, with co-operation from the
frontend to not reuse the pages until it has batched up a bigger flush,
seems like an interesting one to look into. By choosing the sizes and
times correctly it may even be that, by the time domU wants to reuse the
page, the TLB has already been flushed for some other reason (context
switch etc) and the hypervisor can elide the expense.

There are probably mechanisms in the guest kernels which allow us to
hold on to memory but still provide a memory pressure hook so we can
flush immediately instead of OOMing.
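
For example, the shrinker interface could provide that hook; a sketch
(the flush/release helpers are hypothetical, and whether a shrinker is
the right fit is an open question):

    /* Sketch (assumption): a shrinker that flushes and releases the
     * quarantined pages as soon as the kernel signals memory pressure,
     * so deferring the TLB flush cannot push the guest into OOM. */
    static int quarantine_shrink(struct shrinker *s,
                                 struct shrink_control *sc)
    {
        if (sc->nr_to_scan)                    /* asked to free memory  */
            netfront_flush_and_release_all();  /* hypothetical helper   */
        return quarantined_page_count();       /* hypothetical counter  */
    }

    static struct shrinker quarantine_shrinker = {
        .shrink = quarantine_shrink,
        .seeks  = DEFAULT_SEEKS,
    };

    /* register_shrinker(&quarantine_shrinker) at connect time. */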

Ian.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-20 18:31 ` Wei Liu
  2013-05-21  8:31   ` Ian Campbell
@ 2013-05-21  9:26   ` Tim Deegan
  2013-05-21  9:39     ` Wei Liu
                       ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Tim Deegan @ 2013-05-21  9:26 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Stefano Stabellini

At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > J) Map the whole physical memory of the machine in dom0
> > If mapping/unmapping or copying slows us down, could we just keep the
> > whole physical memory of the machine mapped in dom0 (with corresponding
> > IOMMU entries)?
> > At that point the frontend could just pass mfn numbers to the backend,
> > and the backend would already have them mapped.
> > From a security perspective it doesn't change anything when running
> > the backend in dom0, because dom0 is already capable of mapping random
> > pages of any guests. QEMU instances do that all the time.
> > But it would take away one of the benefits of deploying driver domains:
> > we wouldn't be able to run the backends at a lower privilege level.
> > However it might still be worth considering as an option? The backend is
> > still trusted and protected from the frontend, but the frontend wouldn't
> > be protected from the backend.
> > 
> 
> I think Dom0 mapping all machine memory is a good starting point.

I _strongly_ disagree.  The opportunity for disaggregation and reduction
of privilege in backends is probably Xen's biggest technical advantage
and we should not be taking any backward steps there.

> As for the driver domain, can we not have a driver domain mapped all
> of its target's machine memory? What's the security implication here?

If, say, a network driver domain is compromised it's the difference
between intercepting network traffic and total control of the OS.
It's probably worth reading some of the Xen papers about this stuff,
if you haven't already:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.6391
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.229.3708

Tim.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21  9:26   ` Tim Deegan
@ 2013-05-21  9:39     ` Wei Liu
  2013-05-21 10:11       ` Tim Deegan
  2013-05-21 10:01     ` George Dunlap
  2013-05-21 10:51     ` Stefano Stabellini
  2 siblings, 1 reply; 20+ messages in thread
From: Wei Liu @ 2013-05-21  9:39 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Wei Liu, Stefano Stabellini

On Tue, May 21, 2013 at 10:26:00AM +0100, Tim Deegan wrote:
> At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > J) Map the whole physical memory of the machine in dom0
> > > If mapping/unmapping or copying slows us down, could we just keep the
> > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > IOMMU entries)?
> > > At that point the frontend could just pass mfn numbers to the backend,
> > > and the backend would already have them mapped.
> > > From a security perspective it doesn't change anything when running
> > > the backend in dom0, because dom0 is already capable of mapping random
> > > pages of any guests. QEMU instances do that all the time.
> > > But it would take away one of the benefits of deploying driver domains:
> > > we wouldn't be able to run the backends at a lower privilege level.
> > > However it might still be worth considering as an option? The backend is
> > > still trusted and protected from the frontend, but the frontend wouldn't
> > > be protected from the backend.
> > > 
> > 
> > I think Dom0 mapping all machine memory is a good starting point.
> 
> I _strongly_ disagree.  The opportunity for disaggregation and reduction
> of privilege in backends is probably Xen's biggest techical advantage
> and we should not be taking any backward steps there.
> 

I agree with you that disaggregation and reduction of privilege is Xen's
biggest technical advantage.

Just to make clear, this idea was summarized from a discussion among
George, Stefano and me on the way back from the hackathon. We want to
see if things like mapping / unmapping incur a heavy performance
penalty. As it is really hard right now to identify the real
performance bottleneck, we would like to have a quick hack to see how
things work.

> > As for the driver domain, can we not have a driver domain mapped all
> > of its target's machine memory? What's the security implication here?
> 
> If, say, a network driver domain is compromised it's the difference
> between intercepting network traffic and total control of the OS.
> It's probably worth reading some of the Xen papers about this stuff,
> if you haven't already:
> 
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.6391
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.229.3708
> 

Thanks Tim. I read them before. :-)

We're just talking about some experimental things here, not something
that is set in stone and must be done in the future.


Wei.

> Tim.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21  9:26   ` Tim Deegan
  2013-05-21  9:39     ` Wei Liu
@ 2013-05-21 10:01     ` George Dunlap
  2013-05-21 10:06       ` Wei Liu
  2013-05-21 10:51     ` Stefano Stabellini
  2 siblings, 1 reply; 20+ messages in thread
From: George Dunlap @ 2013-05-21 10:01 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Wei Liu, Stefano Stabellini

On Tue, May 21, 2013 at 10:26 AM, Tim Deegan <tim@xen.org> wrote:
> At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
>> On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
>> > J) Map the whole physical memory of the machine in dom0
>> > If mapping/unmapping or copying slows us down, could we just keep the
>> > whole physical memory of the machine mapped in dom0 (with corresponding
>> > IOMMU entries)?
>> > At that point the frontend could just pass mfn numbers to the backend,
>> > and the backend would already have them mapped.
>> > From a security perspective it doesn't change anything when running
>> > the backend in dom0, because dom0 is already capable of mapping random
>> > pages of any guests. QEMU instances do that all the time.
>> > But it would take away one of the benefits of deploying driver domains:
>> > we wouldn't be able to run the backends at a lower privilege level.
>> > However it might still be worth considering as an option? The backend is
>> > still trusted and protected from the frontend, but the frontend wouldn't
>> > be protected from the backend.
>> >
>>
>> I think Dom0 mapping all machine memory is a good starting point.
>
> I _strongly_ disagree.  The opportunity for disaggregation and reduction
> of privilege in backends is probably Xen's biggest techical advantage
> and we should not be taking any backward steps there.

I think Wei meant, "A good point to start the investigation".  If
having all the memory mapped doesn't give any performance advantage,
then a more complicated interface to avoid TLB flushes is most likely a
waste of time.  If it does, then we can try to see if we can find a way
to get performance without giving up security.

 -George

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 10:01     ` George Dunlap
@ 2013-05-21 10:06       ` Wei Liu
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Liu @ 2013-05-21 10:06 UTC (permalink / raw)
  To: George Dunlap; +Cc: Wei Liu, xen-devel, Tim Deegan, Stefano Stabellini

On Tue, May 21, 2013 at 11:01:51AM +0100, George Dunlap wrote:
> On Tue, May 21, 2013 at 10:26 AM, Tim Deegan <tim@xen.org> wrote:
> > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> >> On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> >> > J) Map the whole physical memory of the machine in dom0
> >> > If mapping/unmapping or copying slows us down, could we just keep the
> >> > whole physical memory of the machine mapped in dom0 (with corresponding
> >> > IOMMU entries)?
> >> > At that point the frontend could just pass mfn numbers to the backend,
> >> > and the backend would already have them mapped.
> >> > From a security perspective it doesn't change anything when running
> >> > the backend in dom0, because dom0 is already capable of mapping random
> >> > pages of any guests. QEMU instances do that all the time.
> >> > But it would take away one of the benefits of deploying driver domains:
> >> > we wouldn't be able to run the backends at a lower privilege level.
> >> > However it might still be worth considering as an option? The backend is
> >> > still trusted and protected from the frontend, but the frontend wouldn't
> >> > be protected from the backend.
> >> >
> >>
> >> I think Dom0 mapping all machine memory is a good starting point.
> >
> > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > of privilege in backends is probably Xen's biggest techical advantage
> > and we should not be taking any backward steps there.
> 
> I think Wei meant, "A good point to start the investigation".  If

Yes that's what I meant. :-)


Wei.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21  9:39     ` Wei Liu
@ 2013-05-21 10:11       ` Tim Deegan
  0 siblings, 0 replies; 20+ messages in thread
From: Tim Deegan @ 2013-05-21 10:11 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Stefano Stabellini

At 10:39 +0100 on 21 May (1369132774), Wei Liu wrote:
> On Tue, May 21, 2013 at 10:26:00AM +0100, Tim Deegan wrote:
> > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > J) Map the whole physical memory of the machine in dom0
> > > > If mapping/unmapping or copying slows us down, could we just keep the
> > > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > > IOMMU entries)?
> > > > At that point the frontend could just pass mfn numbers to the backend,
> > > > and the backend would already have them mapped.
> > > > From a security perspective it doesn't change anything when running
> > > > the backend in dom0, because dom0 is already capable of mapping random
> > > > pages of any guests. QEMU instances do that all the time.
> > > > But it would take away one of the benefits of deploying driver domains:
> > > > we wouldn't be able to run the backends at a lower privilege level.
> > > > However it might still be worth considering as an option? The backend is
> > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > be protected from the backend.
> > > > 
> > > 
> > > I think Dom0 mapping all machine memory is a good starting point.
> > 
> > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > of privilege in backends is probably Xen's biggest techical advantage
> > and we should not be taking any backward steps there.
> > 
> 
> I agree with you that disaggregation and reduction of privilege is Xen's
> biggest technical advantage.
> 
> Just to make clear, this idea was summerized from a discussion among
> George, Stefano and I on the way back from hackathon. We want to see if
> things like mapping / unmapping incur heavy performance penalty.

Ah, I see. :)  As an experiment to measure the overheads it's obviously
a Good Thing.  I thought you were considering it as a _solution_ to the
perf problem!

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21  9:26   ` Tim Deegan
  2013-05-21  9:39     ` Wei Liu
  2013-05-21 10:01     ` George Dunlap
@ 2013-05-21 10:51     ` Stefano Stabellini
  2013-05-21 12:52       ` Konrad Rzeszutek Wilk
  2013-05-21 13:42       ` Tim Deegan
  2 siblings, 2 replies; 20+ messages in thread
From: Stefano Stabellini @ 2013-05-21 10:51 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Wei Liu, Stefano Stabellini

On Tue, 21 May 2013, Tim Deegan wrote:
> At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > J) Map the whole physical memory of the machine in dom0
> > > If mapping/unmapping or copying slows us down, could we just keep the
> > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > IOMMU entries)?
> > > At that point the frontend could just pass mfn numbers to the backend,
> > > and the backend would already have them mapped.
> > > From a security perspective it doesn't change anything when running
> > > the backend in dom0, because dom0 is already capable of mapping random
> > > pages of any guests. QEMU instances do that all the time.
> > > But it would take away one of the benefits of deploying driver domains:
> > > we wouldn't be able to run the backends at a lower privilege level.
> > > However it might still be worth considering as an option? The backend is
> > > still trusted and protected from the frontend, but the frontend wouldn't
> > > be protected from the backend.
> > > 
> > 
> > I think Dom0 mapping all machine memory is a good starting point.
> 
> I _strongly_ disagree.  The opportunity for disaggregation and reduction
> of privilege in backends is probably Xen's biggest techical advantage
> and we should not be taking any backward steps there.

While I agree with you, as a matter of fact the vast majority of Xen
installations today do not use driver domains. That didn't stop them
from enjoying Xen so far. Moreover the frontend/backend interface
remains narrow and difficult to exploit, as it's not a fully emulated
interface (AHCI / virtio). The backend is still protected from the
frontend. Having the backend running non-privileged is a great bonus
and certainly required on a product that allows the user to install
third party driver domains. However if the driver domains are "trusted"
then I think they can also be trusted with a full memory map. After
all, that has been the case for all XenServer, OVM and SLES releases so far
AFAIK.

A hypothetical future Xen release could offer either increased security
(driver domains) or increased IO performance (backends with a full
physical memory map) and give the user a choice between the two. I am
pretty sure that a non-negligible number of people would make the
conscious choice to go for the performance option.
Why should we be the ones to force security down their throats?
After all it's all about what the users want from the project.

Obviously in an ideal world we would be able to offer both at the same
time, and maybe George's proposal is exactly what is going to achieve
that. But I was describing the case that requires us to make a choice.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 10:51     ` Stefano Stabellini
@ 2013-05-21 12:52       ` Konrad Rzeszutek Wilk
  2013-05-21 13:32         ` Stefano Stabellini
  2013-05-21 13:42       ` Tim Deegan
  1 sibling, 1 reply; 20+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-21 12:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Wei Liu, xen-devel, Tim Deegan

On Tue, May 21, 2013 at 11:51:03AM +0100, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Tim Deegan wrote:
> > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > J) Map the whole physical memory of the machine in dom0
> > > > If mapping/unmapping or copying slows us down, could we just keep the
> > > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > > IOMMU entries)?
> > > > At that point the frontend could just pass mfn numbers to the backend,
> > > > and the backend would already have them mapped.
> > > > From a security perspective it doesn't change anything when running
> > > > the backend in dom0, because dom0 is already capable of mapping random
> > > > pages of any guests. QEMU instances do that all the time.
> > > > But it would take away one of the benefits of deploying driver domains:
> > > > we wouldn't be able to run the backends at a lower privilege level.
> > > > However it might still be worth considering as an option? The backend is
> > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > be protected from the backend.
> > > > 
> > > 
> > > I think Dom0 mapping all machine memory is a good starting point.
> > 
> > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > of privilege in backends is probably Xen's biggest techical advantage
> > and we should not be taking any backward steps there.
> 
> While I agree with you, as a matter of fact the vast majority of Xen
> installations today do not use driver domains. That didn't stop them
> from enjoying Xen so far. Moreover the frontend/backend interface
> remains narrow and difficult to exploit, it's not a fully emulated
> interface (AHCI / virtio). The backend is still protected from the
> frontend. Having the backend running non-privileged is a great bonus
> and certainly required on a product that allows the user to install
> third party driver domains. However if the driver domains are "trusted"
> then I think they can also be trusted with a full memory map. After all
> it has been the case for all XenServer, OVM and SLES releases so far
> AFAIK.
> 
> An hypothetic future Xen release could offer both increased security
> (driver domains) or increased IO performances (backends with a full
> physical memory map) and give the user a choice between the two. I am
> pretty sure that a non-negligible amount of people would make the
> conscious choice to go for the performance option.
> Why should we be the ones to force security down their throats?
> After all it's all about what the users want from the project.
> 
> Obviously in an ideal world we would be able to offer both at the same
> time, and maybe George's proposal is exactly what is going to achieve
> that. But I was describing the case that requires us to make a choice.

CC-ing Mukesh here as driver domains have some relevance to PVH work.
Please also CC Malcolm here (I don't have his email).

I would say that perhaps a better option is to do both - as in retain
the security architecture Xen has _and_ also provide increased IO performance.

Concurrently everybody is also looking at both backend and frontend
having a persistent pool of grants. This means we set up a "window"
from either backend -> frontend or vice-versa that persists. Said
"window" is bolted in place for the lifetime of the guest. For
networking the kernel stack already copies the data from user-space
into the kernel, and copying within the kernel to specific pages mostly
hits the CPU cache. We need to exploit that and also make sure that the
path is not interrupted.
The grant_mapping on the TX side also looks like a nice path - we just
have to make sure that the networking API doesn't try to free the page
once the TX has been done (and this is where Ian's skb destructor work
would be beneficial).

For block it is a bit different as aio's are mapped from kernel to
user-space. But the neat thing there is that there is no need to
inspect the data when giving it to the DMA device (the exception is
DIF/DIX, which needs to calculate checksums). That is unless one needs
to do the xen_biovec_phys_mergeable check (to see if the next page is
contiguous and if so add new bios and copy the data in).

But with PVH and PVHVM driver domains, and also piggybacking on the work
that Malcolm is doing (Xen IOMMU), we can skip that check. (As the PFNs
for the guest would look contiguous.)

In essence we can do a lot:
 1). not copying or mapping grants if we detect that they are going to
     a DMA device.
 2). The 1) above + also use the Xen IOMMU to take care of setting the
     proper EPT entries for the pages that we need. This could be done
     as part of a grant_copy or grant_light_mapping in the hypervisor.
     This covers the case where we MUST copy some data into the other
     domain (say the Ethernet header); see the sketch below. Whether a
     copy is done or a light mapping is used (because the moment the
     device has done the DMA operation on the granted page we might as
     well remove the mapping - hence the "light" or maybe "expiring"
     grant).
 3). The 2) above + Intel QuickData (a DMA engine that uses the same
     L3 cache that PCI devices use) to keep the copied pages in the L3.
     This has the benefit that when the PCI device is instructed to
     fetch the data, it would do it from the L3 cache and be incredibly quick.
     This would be using the grant_copy, but instead of the hypervisor
     doing it, it instructs the Intel QuickData chipset to do it. Would
     require some form of asynchronous grant_copy mechanism.
 4). Variants of the above.
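
To make the header-copy case in 2) a bit more concrete, here is a
sketch using the existing GNTTABOP_copy interface; the backend plumbing
around it and the handling of the payload (light mapping / IOMMU entry)
are hypothetical and not shown:

    /* Sketch: copy just the Ethernet header out of the frontend's
     * granted page into a local buffer, leaving the payload to be
     * handed to the device by other means.  GNTTABOP_copy and
     * struct gnttab_copy are the existing grant-table interfaces. */
    static int copy_eth_header(domid_t frontend, grant_ref_t gref,
                               uint16_t offset, void *local_hdr)
    {
        struct gnttab_copy op = {
            .source.u.ref  = gref,                /* frontend's grant    */
            .source.domid  = frontend,
            .source.offset = offset,
            .dest.u.gmfn   = virt_to_mfn(local_hdr),
            .dest.domid    = DOMID_SELF,
            .dest.offset   = offset_in_page(local_hdr),
            .len           = ETH_HLEN,            /* header only         */
            .flags         = GNTCOPY_source_gref, /* source is a gref    */
        };

        HYPERVISOR_grant_table_op(GNTTABOP_copy, &op, 1);
        return op.status == GNTST_okay ? 0 : -EIO;
    }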

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 12:52       ` Konrad Rzeszutek Wilk
@ 2013-05-21 13:32         ` Stefano Stabellini
  0 siblings, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2013-05-21 13:32 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Wei Liu, xen-devel, Tim Deegan, Stefano Stabellini

On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, May 21, 2013 at 11:51:03AM +0100, Stefano Stabellini wrote:
> > On Tue, 21 May 2013, Tim Deegan wrote:
> > > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > > J) Map the whole physical memory of the machine in dom0
> > > > > If mapping/unmapping or copying slows us down, could we just keep the
> > > > > whole physical memory of the machine mapped in dom0 (with corresponding
> > > > > IOMMU entries)?
> > > > > At that point the frontend could just pass mfn numbers to the backend,
> > > > > and the backend would already have them mapped.
> > > > > From a security perspective it doesn't change anything when running
> > > > > the backend in dom0, because dom0 is already capable of mapping random
> > > > > pages of any guests. QEMU instances do that all the time.
> > > > > But it would take away one of the benefits of deploying driver domains:
> > > > > we wouldn't be able to run the backends at a lower privilege level.
> > > > > However it might still be worth considering as an option? The backend is
> > > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > > be protected from the backend.
> > > > > 
> > > > 
> > > > I think Dom0 mapping all machine memory is a good starting point.
> > > 
> > > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > > of privilege in backends is probably Xen's biggest techical advantage
> > > and we should not be taking any backward steps there.
> > 
> > While I agree with you, as a matter of fact the vast majority of Xen
> > installations today do not use driver domains. That didn't stop them
> > from enjoying Xen so far. Moreover the frontend/backend interface
> > remains narrow and difficult to exploit, it's not a fully emulated
> > interface (AHCI / virtio). The backend is still protected from the
> > frontend. Having the backend running non-privileged is a great bonus
> > and certainly required on a product that allows the user to install
> > third party driver domains. However if the driver domains are "trusted"
> > then I think they can also be trusted with a full memory map. After all
> > it has been the case for all XenServer, OVM and SLES releases so far
> > AFAIK.
> > 
> > An hypothetic future Xen release could offer both increased security
> > (driver domains) or increased IO performances (backends with a full
> > physical memory map) and give the user a choice between the two. I am
> > pretty sure that a non-negligible amount of people would make the
> > conscious choice to go for the performance option.
> > Why should we be the ones to force security down their throats?
> > After all it's all about what the users want from the project.
> > 
> > Obviously in an ideal world we would be able to offer both at the same
> > time, and maybe George's proposal is exactly what is going to achieve
> > that. But I was describing the case that requires us to make a choice.
> 
> CC-ing Mukesh here as driver domains have some relevance to PVH work.
> Please also CC Malcolm here (I don't have his email).
> 
> I would say that perhaps a better option is to do both - as in retain
> the security architecture Xen has _and_ also provide increased IO performance.

Of course that is the best option.

However I think that we should know exactly what the level of
performance would be if we had all the memory mapped in the backend domain all
the time. It would be very useful to understand what we need to
optimize.  It might turn out that the difference is not that much, and
we need to optimize something else. Or it might turn out that the
difference is huge even after all the optimizations you listed below.



> Concurently everybody is also looking at both backend and frontend having a
> persistent pool of grants. This means we do setup an "window" from either
> backend -> frontend or vice-versa that persists. Said "window" is bolted
> for the life-time of the guest. For networking the kernel stack already
> copies the pages from the user-space in the kernel and copying
> in the kernel to specific pages is mostly using the CPU cache. We need to
> exploit that and also make sure that the path is not interrupted.
> The grant_mapping on the TX side also looks a nice path - just have to
> make sure that the networking API don't try to free the page once the TX
> has been done (and this is where Ian's skb deconstructor would be beneficial).
> 
> For block it is a bit different as aio's are mapped from kernel to
> user-space. But the neat thing there is that there is no need to inspect
> the data - when giving it to the DMA device (the exception is DIF/DIX which
> need calculate checksums). That is unless one needs to do the
> xen_biovec_phys_mergeable (to check if the next page is contingous and
> if so add new bio's and copy the data in).
> 
> But with PVH and PVHVM driver domains, and also piggybacking on the work
> that Malcolm is doing (Xen IOMMU), we can skip that check. (As the PFNs
> for the guest would look contingous).
> 
> In essence we can do a lot:
>  1). not copying or mapping grants if we detect that they are going to
>      a DMA device.
>  2). The 1) above + also use the Xen IOMMU to take care of setting the
>      proper EPT entries for the pages that we need. This could be done
>      as part of a grant_copy or grant_light_mapping in the hypervisor. This is
>      a case were we MUST copy those pages in the other domain (say the
>      Ethernet header). Whether a copy is done or a light mapping
>      (b/c the moment the device does the DMA operation on the granted
>      page we might as well remove the mapping. Hence the "light" or
>      maybe "expiring" grant.
>  3). The 2) above + Intel QuickData (a DMA engine that uses the same
>      L3 cache that PCI devices use) to keep the copied pages in the L3.
>      This has the benefit that when the PCI device is instructed to
>      fetch the data, it would do it from the L3 cache and be incredibly quick.
>      This would be using the grant_copy, but instead of the hypervisor
>      doing it, it instructs the Intel QuickData chipset to do it. Would
>      require some form of asynchronous grant_copy mechanism.
>  4). Variants of the above.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 10:51     ` Stefano Stabellini
  2013-05-21 12:52       ` Konrad Rzeszutek Wilk
@ 2013-05-21 13:42       ` Tim Deegan
  2013-05-21 16:58         ` Stefano Stabellini
  1 sibling, 1 reply; 20+ messages in thread
From: Tim Deegan @ 2013-05-21 13:42 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Wei Liu

At 11:51 +0100 on 21 May (1369137063), Stefano Stabellini wrote:
> On Tue, 21 May 2013, Tim Deegan wrote:
> > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > However it might still be worth considering as an option? The backend is
> > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > be protected from the backend.
> > > > 
> > > 
> > > I think Dom0 mapping all machine memory is a good starting point.
> > 
> > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > of privilege in backends is probably Xen's biggest techical advantage
> > and we should not be taking any backward steps there.
> 
> While I agree with you, as a matter of fact the vast majority of Xen
> installations today do not use driver domains.

Sure, and that's a bad thing, right?

> However if the driver domains are "trusted"
> then I think they can also be trusted with a full memory map. After all
> it has been the case for all XenServer, OVM and SLES releases so far
> AFAIK.

...and that's a bad thing, right? :)

> Obviously in an ideal world we would be able to offer both at the same
> time, and maybe George's proposal is exactly what is going to achieve
> that. But I was describing the case that requires us to make a choice.

Righto.  I don't think we need to worry about that yet.  You're all
smart engineers, and I've heard a bunch of good ideas flying around that
address the costs of mapping and unmapping in backends.

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 13:42       ` Tim Deegan
@ 2013-05-21 16:58         ` Stefano Stabellini
  2013-05-22  9:52           ` Tim Deegan
  2013-05-22  9:55           ` Ian Campbell
  0 siblings, 2 replies; 20+ messages in thread
From: Stefano Stabellini @ 2013-05-21 16:58 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel, Wei Liu, Stefano Stabellini

On Tue, 21 May 2013, Tim Deegan wrote:
> At 11:51 +0100 on 21 May (1369137063), Stefano Stabellini wrote:
> > On Tue, 21 May 2013, Tim Deegan wrote:
> > > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > > However it might still be worth considering as an option? The backend is
> > > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > > be protected from the backend.
> > > > > 
> > > > 
> > > > I think Dom0 mapping all machine memory is a good starting point.
> > > 
> > > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > > of privilege in backends is probably Xen's biggest techical advantage
> > > and we should not be taking any backward steps there.
> > 
> > While I agree with you, as a matter of fact the vast majority of Xen
> > installations today do not use driver domains.
> 
> Sure, and that's a bad thing, right?
>
> > However if the driver domains are "trusted"
> > then I think they can also be trusted with a full memory map. After all
> > it has been the case for all XenServer, OVM and SLES releases so far
> > AFAIK.
> 
> ...and that's a bad thing, right? :)

It's a good thing: even though it could be better, our users don't seem
to mind. :)


> > Obviously in an ideal world we would be able to offer both at the same
> > time, and maybe George's proposal is exactly what is going to achieve
> > that. But I was describing the case that requires us to make a choice.
> 
> Righto.  I don't think we need to worry about that yet.  You're all
> smart engineers, and I've heard a bunch of good ideas flying around that
> address the costs of mapping and unmapping in backends.

Right. I would consider the performance of "backend with all the memory
mapped" as the limit we should try to achieve even without having all
the memory mapped. But if it turns out that we are very far from it, we
might want to consider allowing it as an option in the meantime.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 16:58         ` Stefano Stabellini
@ 2013-05-22  9:52           ` Tim Deegan
  2013-05-22  9:55           ` Ian Campbell
  1 sibling, 0 replies; 20+ messages in thread
From: Tim Deegan @ 2013-05-22  9:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Wei Liu

At 17:58 +0100 on 21 May (1369159125), Stefano Stabellini wrote:
> On Tue, 21 May 2013, Tim Deegan wrote:
> > At 11:51 +0100 on 21 May (1369137063), Stefano Stabellini wrote:
> > > Obviously in an ideal world we would be able to offer both at the same
> > > time, and maybe George's proposal is exactly what is going to achieve
> > > that. But I was describing the case that requires us to make a choice.
> > 
> > Righto.  I don't think we need to worry about that yet.  You're all
> > smart engineers, and I've heard a bunch of good ideas flying around that
> > address the costs of mapping and unmapping in backends.
> 
> Right. I would consider the performance of "backend with all the memory
> mapped" as the limit we should try to achieve even without having all
> the memory mapped.

Yes, absolutely.

> But if it turns out that we are very far from it, we
> might want to consider allowing it as an option in the meantime.

Understood, and I still strongly disagree.

Tim.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Hackathon minutes] PV network improvements
  2013-05-21 16:58         ` Stefano Stabellini
  2013-05-22  9:52           ` Tim Deegan
@ 2013-05-22  9:55           ` Ian Campbell
  1 sibling, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2013-05-22  9:55 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Wei Liu, xen-devel, Tim Deegan

On Tue, 2013-05-21 at 17:58 +0100, Stefano Stabellini wrote:
> On Tue, 21 May 2013, Tim Deegan wrote:
> > At 11:51 +0100 on 21 May (1369137063), Stefano Stabellini wrote:
> > > On Tue, 21 May 2013, Tim Deegan wrote:
> > > > At 19:31 +0100 on 20 May (1369078279), Wei Liu wrote:
> > > > > On Mon, May 20, 2013 at 03:08:05PM +0100, Stefano Stabellini wrote:
> > > > > > However it might still be worth considering as an option? The backend is
> > > > > > still trusted and protected from the frontend, but the frontend wouldn't
> > > > > > be protected from the backend.
> > > > > > 
> > > > > 
> > > > > I think Dom0 mapping all machine memory is a good starting point.
> > > > 
> > > > I _strongly_ disagree.  The opportunity for disaggregation and reduction
> > > > of privilege in backends is probably Xen's biggest techical advantage
> > > > and we should not be taking any backward steps there.
> > > 
> > > While I agree with you, as a matter of fact the vast majority of Xen
> > > installations today do not use driver domains.
> > 
> > Sure, and that's a bad thing, right?
> >
> > > However if the driver domains are "trusted"
> > > then I think they can also be trusted with a full memory map. After all
> > > it has been the case for all XenServer, OVM and SLES releases so far
> > > AFAIK.
> > 
> > ...and that's a bad thing, right? :)
> 
> It's a good thing: even though it could be better our users don't seem
> to mind. :)

At least in the case of XenServer they are, as you know, actively moving
towards disaggregating. Other users such as XenClient, Qubes OS, NSA etc
already do make use of disaggregation to a greater or lesser extent.

For the distros I think the lack of disaggregation is mostly our fault
as upstream for not making it easier to achieve, rather than a lack of
desire on the part of users. 

> > > Obviously in an ideal world we would be able to offer both at the same
> > > time, and maybe George's proposal is exactly what is going to achieve
> > > that. But I was describing the case that requires us to make a choice.
> > 
> > Righto.  I don't think we need to worry about that yet.  You're all
> > smart engineers, and I've heard a bunch of good ideas flying around that
> > address the costs of mapping and unmapping in backends.
> 
> Right. I would consider the performance of "backend with all the memory
> mapped" as the limit we should try to achieve even without having all
> the memory mapped. But if it turns out that we are very far from it, we
> might want to consider allowing it as an option in the meantime.

I think it is incredibly premature to be thinking about even considering
making this an option or anything other than a useful datapoint for
developers.

Ian.

^ permalink raw reply	[flat|nested] 20+ messages in thread
