* Proposal to allow setting up shared memory areas between VMs from xl config file
@ 2017-05-12 17:01 Zhongze Liu
  2017-05-12 17:51 ` Wei Liu
  2017-05-15  8:08 ` Jan Beulich
  0 siblings, 2 replies; 31+ messages in thread
From: Zhongze Liu @ 2017-05-12 17:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu

Hi, Xen developers,

I'm Zhongze Liu, a GSoC student this year. Glad to meet you in the
Xen Project. As an initial step towards implementing my GSoC proposal,
which is still a draft, I'm posting it here and hope to hear your
suggestions.

====================================================
1. Motivation and Description
====================================================
Virtual machines use grant table hypercalls to set up shared pages for
inter-VM communication. These hypercalls are used by all PV
protocols today. However, very simple guests, such as baremetal
applications, might not have the infrastructure to handle the grant table.
This project is about setting up several shared memory areas for inter-VM
communication directly from the VM config file, so that the guest kernel
doesn't need grant table support to be able to communicate with other
guests.

====================================================
2. Implementation Plan:
====================================================

======================================
2.1 Introduce a new VM config option in xl:
======================================
The shared areas should be shareable among several VMs; that is, every
shared physical memory area is assigned to a set of VMs.
Therefore, a “token” or “identifier” should be used here to uniquely
identify a backing memory area.


I would suggest using an unsigned integer to serve as the identifier.
For example:

In xl config file of vm1:

    static_shared_mem = [“addr_range1 = ID1”, “addr_range2 = ID2”]

In xl config file of vm2:

    static_shared_mem = [“addr_range3 = ID1”]

In xl config file of vm3:

    static_shared_mem = [“addr_range4 = ID2”]


In the example above, a memory area A1 will be shared between
vm1 and vm2 -- vm1 can access this area using addr_range1
and vm2 using addr_range3. Likewise, a memory area A2 will be
shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
and vm3 using addr_range4.

The shared memory area denoted by an identifier IDx will be
allocated when it first appears, and the memory pages will be taken from
the first VM whose static_shared_mem list contains IDx. Taking the above
config files as an example, if we instantiate vm1, vm2 and vm3, one after
another, the memory areas denoted by ID1 and ID2 will both be allocated
in and taken from vm1.

======================================
2.2 Store the mem-sharing information in xenstore
======================================
This information should include the length and owner of each area, as
well as where the backing memory areas are mapped in every VM that is
using them. This information should be known to the xl command and all
domains, so we utilize xenstore to keep it. The current plan is to place
the information under /local/shared_mem/ID. Again taking the above
config files as an example:

If we instantiate vm1, vm2 and vm3, one after another, the output of
“xenstore ls -f” will evolve as follows.


After VM1 is instantiated, it will contain something like this:

    /local/shared_mem/ID1/owner = dom_id_of_vm1

    /local/shared_mem/ID1/size = sizeof_addr_range1

    /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1


    /local/shared_mem/ID2/owner = dom_id_of_vm1

    /local/shared_mem/ID2/size = sizeof_addr_range2

    /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2


After VM2 is instantiated, the following new line will appear:

    /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3


After VM3 is instantiated, the following new line will appear:

    /local/shared_mem/ID2/mappings/dom_id_of_vm3 = addr_range4
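
For illustration only (assuming the same example IDs and address ranges),
the complete subtree after all three VMs are up would then look like:

    /local/shared_mem/ID1/owner = dom_id_of_vm1
    /local/shared_mem/ID1/size = sizeof_addr_range1
    /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
    /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3

    /local/shared_mem/ID2/owner = dom_id_of_vm1
    /local/shared_mem/ID2/size = sizeof_addr_range2
    /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
    /local/shared_mem/ID2/mappings/dom_id_of_vm3 = addr_range4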

When we encounter an id IDx during "xl create":

  + If it’s not under /local/shared_mem, create the corresponding entries
     (owner, size, and mappings) in xenstore, and allocate the memory from
     the newly created domain.

  + If it’s found under /local/shared_mem, map the pages to the newly
     created domain, and add the current domain to
     /local/shared_mem/IDx/mappings.

======================================
2.3 Mapping the memory areas
======================================
Handle the newly added config option in tools/{xl, libxl}
and utilize tools/libxc to do the actual memory mapping.


======================================
2.4 Error handling
======================================
Add code to handle various errors: invalid addresses,
mismatched memory area lengths, etc.

====================================================
3. Expected Outcomes/Goals:
====================================================
A new VM config option in xl will be introduced, allowing users to set up
several shared memory areas for inter-VM communication.
This should work on both x86 and ARM.

[See also: https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Share_a_page_in_memory_from_the_VM_config_file]


Cheers,

Zhongze Liu


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-12 17:01 Proposal to allow setting up shared memory areas between VMs from xl config file Zhongze Liu
@ 2017-05-12 17:51 ` Wei Liu
  2017-05-13  2:28   ` Zhongze Liu
  2017-05-15  8:08 ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Wei Liu @ 2017-05-12 17:51 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Wei Liu, Ian Jackson

Hi Zhongze

This is a nice write-up. Some comments below. Feel free to disagree with
what I say below; this is more a discussion than picking on your design
or plan.

On Sat, May 13, 2017 at 01:01:39AM +0800, Zhongze Liu wrote:
> Hi, Xen developers,
> 
> I'm Zhongze Liu, a GSoC student of this year. Glad to meet you in the
> Xen Project.  As an initial step to implementing my GSoC proposal, which
> is still a draft,  I'm posting it here. And hope to hear from you your
> suggestions.
> 
> ====================================================
> 1. Motivation and Description
> ====================================================
> Virtual machines use grant table hypercalls to setup a share page for
> inter-VMs communications. These hypercalls are used by all PV
> protocols today. However, very simple guests, such as baremetal
> applications, might not have the infrastructure to handle the grant table.
> This project is about setting up several shared memory areas for inter-VMs
> communications directly from the VM config file.
> So that the guest kernel doesn't have to have grant table support to be
> able to communicate with other guests.
> 
> ====================================================
> 2. Implementation Plan:
> ====================================================
> 
> ======================================
> 2.1 Introduce a new VM config option in xl:
> ======================================
> The shared areas should be shareable among several VMs,
> every shared physical memory area is assigned to a set of VMs.
> Therefore, a “token” or “identifier” should be used here to uniquely
> identify a backing memory area.
> 
> 
> I would suggest using an unsigned integer to serve as the identifier.
> For example:
> 
> In xl config file of vm1:
> 
>     static_shared_mem = [“addr_range1= ID1”, “addr_range2 = ID2”]
> 
> In xl config file of vm2:
> 
>     static_shared_mem = [“addr_range3 = ID1”]
> 
> In xl config file of vm3:
> 
>     static_shared_mem = [“addr_range4 = ID2”]

I can envisage you needing some more attributes: what about attributes
like RW / RO / WO (or even X)?

Also, I assume the granularity of the mapping is a page, but as far as I
can tell there are two page granularities on ARM, so you do need to consider
both and what should happen if you mix and match them. What about
mapping several pages where different VMs use overlapping ranges?

Can you give some concrete examples? What does addr_rangeX look like in
practice?

> 
> 
> In the example above. A memory area A1 will be shared between
> vm1 and vm2 -- vm1 can access this area using addr_range1
> and vm2 using addr_range3. Likewise, a memory area A2 will be
> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
> and vm3 using addr_range4.
> 
> The shared memory area denoted by an identifier IDx will be
> allocated when it first appear, and the memory pages will be taken from
> the first VM whose static_shared_mem list contains IDx. Take the above
> config files for example, if we instantiate vm1, vm2 and vm3, one after
> another, the memory areas denoted by ID1 and ID2 will both be allocated
> in and taken from vm1.

Hmm... I can see some potential hazards. Currently multiple xl processes
are serialised by a lock, and your assumption is that creation is done in
order, but suppose that some time later they can run in parallel. When you
have several "xl create" and they race with each other, what will
happen?

This can be solved by serialising in libxl or hypervisor, I think.
It is up to you to choose where to do it.

Also, please consider what happens when you destroy the owner domain
before the rest. Proper reference counting should be done in the
hypervisor.

> 
> ======================================
> 2.2 Store the mem-sharing information in xenstore
> ======================================
> This information should include the length and owner of the area. And
> it should also include information about where the backing memory areas
> are mapped in every VM that are using it. This information should be
> known to the xl command and all domains, so we utilize xenstore to keep
> this information. A current plan is to place the information under
> /local/shared_mem/ID. Still take the above config files as an example:
> 
> If we instantiate vm1, vm2 and vm3, one after another,
> “xenstore ls -f” should output something like this:
> 
> 
> After VM1 was instantiated, the output of “xenstore ls -f”
> will be something like this:
> 
>     /local/shared_mem/ID1/owner = dom_id_of_vm1
> 
>     /local/shared_mem/ID1/size = sizeof_addr_range1
> 
>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
> 
> 
>     /local/shared_mem/ID2/owner = dom_id_of_vm1
> 
>     /local/shared_mem/ID2/size = sizeof_addr_range1
> 
>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
> 
> 
> After VM2 was instantiated, the following new lines will appear:
> 
>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
> 
> 
> After VM2 was instantiated, the following new lines will appear:
> 
>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
> 
> When we encounter an id IDx during "xl create":
> 
>   + If it’s not under /local/shared_mem, create the corresponding entries
>      (owner, size, and mappings) in xenstore, and allocate the memory from
>      the newly created domain.
> 
>   + If it’s found under /local/shared_mem, map the pages to the newly
>       created domain, and add the current domain to
>       /local/shared_mem/IDx/mappings.
> 

Again, please think about destruction as well.

At this point I think modelling after POSIX shared memory makes more
sense. That is, there isn't one "owner" for the memory. You get hold of
the shared memory via a key (ID in your case?).

I'm not entirely sure if xenstore is the right location to store such
information, but I don't have other suggestions either. Maybe when I
have better understanding of the problem I can make better suggestions.

> ======================================
> 2.3 mapping the memory areas
> ======================================
> Handle the newly added config option in tools/{xl, libxl}
> and utilize toos/libxc to do the actual memory mapping
> 
> 
> ======================================
> 2.4 error handling
> ======================================
> Add code to handle various errors: Invalid address,
> mismatched length of memory area etc.
> 
> ====================================================
> 3. Expected Outcomes/Goals:
> ====================================================
> A new VM config option in xl will be introduced, allowing users to setup
> several shared memory areas for inter-VMs communications.
> This should work on both x86 and ARM.
> 
> [See also: https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Share_a_page_in_memory_from_the_VM_config_file]
> 

Overall I think this document is a good starting point. There are
details to hash out but we have a lot of time for that.

Wei.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-12 17:51 ` Wei Liu
@ 2017-05-13  2:28   ` Zhongze Liu
  2017-05-15 16:46     ` Wei Liu
  0 siblings, 1 reply; 31+ messages in thread
From: Zhongze Liu @ 2017-05-13  2:28 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Ian Jackson

2017-05-13 1:51 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> Hi Zhongze
>
> This is a nice write-up. Some comments below. Feel free to disagree with
> what I say below, this is more a discussion than picking on your design
> or plan.
>

Hi, Wei Liu

Thanks for your time reading through my proposal.

>
> On Sat, May 13, 2017 at 01:01:39AM +0800, Zhongze Liu wrote:
>> Hi, Xen developers,
>>
>> I'm Zhongze Liu, a GSoC student of this year. Glad to meet you in the
>> Xen Project.  As an initial step to implementing my GSoC proposal, which
>> is still a draft,  I'm posting it here. And hope to hear from you your
>> suggestions.
>>
>> ====================================================
>> 1. Motivation and Description
>> ====================================================
>> Virtual machines use grant table hypercalls to setup a share page for
>> inter-VMs communications. These hypercalls are used by all PV
>> protocols today. However, very simple guests, such as baremetal
>> applications, might not have the infrastructure to handle the grant table.
>> This project is about setting up several shared memory areas for inter-VMs
>> communications directly from the VM config file.
>> So that the guest kernel doesn't have to have grant table support to be
>> able to communicate with other guests.
>>
>> ====================================================
>> 2. Implementation Plan:
>> ====================================================
>>
>> ======================================
>> 2.1 Introduce a new VM config option in xl:
>> ======================================
>> The shared areas should be shareable among several VMs,
>> every shared physical memory area is assigned to a set of VMs.
>> Therefore, a “token” or “identifier” should be used here to uniquely
>> identify a backing memory area.
>>
>>
>> I would suggest using an unsigned integer to serve as the identifier.
>> For example:
>>
>> In xl config file of vm1:
>>
>>     static_shared_mem = [“addr_range1= ID1”, “addr_range2 = ID2”]
>>
>> In xl config file of vm2:
>>
>>     static_shared_mem = [“addr_range3 = ID1”]
>>
>> In xl config file of vm3:
>>
>>     static_shared_mem = [“addr_range4 = ID2”]
>
> I can envisage you need some more attributes: what about the attributes
> like RW / RO / WO (or even X)?
>
> Also, I assume the granularity of the mapping is a page, but as far as I
> can tell there are two page granularity on ARM, you do need to consider
> both and what should happen if you mix and match them. What about
> mapping several pages and different VM use overlapping ranges?
>
> Can you give some concrete examples? What does addr_rangeX look like in
> practice?
>
>

Yes, those attributes are necessary and should be explicitly specified in the
config file. I'll add them in the next version of this proposal. And taking the
granularity into consideration, what do you say if we change the entries into
something like:
'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
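
For instance (purely illustrative, with made-up addresses, and assuming the
identifier is kept as an extra key in each entry), vm1 and vm2 sharing the
area identified by ID1 read-write could then be written as:

In xl config file of vm1:

    static_shared_mem = ['id=ID1, start=0x40000000, end=0x40010000, granularity=4K, prot=RW']

In xl config file of vm2:

    static_shared_mem = ['id=ID1, start=0x80000000, end=0x80010000, granularity=4K, prot=RW']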

>
>>
>>
>> In the example above. A memory area A1 will be shared between
>> vm1 and vm2 -- vm1 can access this area using addr_range1
>> and vm2 using addr_range3. Likewise, a memory area A2 will be
>> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
>> and vm3 using addr_range4.
>>
>> The shared memory area denoted by an identifier IDx will be
>> allocated when it first appear, and the memory pages will be taken from
>> the first VM whose static_shared_mem list contains IDx. Take the above
>> config files for example, if we instantiate vm1, vm2 and vm3, one after
>> another, the memory areas denoted by ID1 and ID2 will both be allocated
>> in and taken from vm1.
>
> Hmm... I can see some potential hazards. Currently, multiple xl processes
> are serialized by a lock, and your assumption is the creation is done in
> order, but suppose sometime later they can run in parallel. When you
> have several "xl create" and they race with each other, what will
> happen?
>
> This can be solved by serializing in libxl or hypervisor, I think.
> It is up to you to choose where to do it.
>
> Also, please consider what happens when you destroy the owner domain
> before the rest. Proper reference counting should be done in the
> hypervisor.
>

Yes, the access to xenstore and other shared data should be serialized
using some kind of lock.

>
>>
>> ======================================
>> 2.2 Store the mem-sharing information in xenstore
>> ======================================
>> This information should include the length and owner of the area. And
>> it should also include information about where the backing memory areas
>> are mapped in every VM that are using it. This information should be
>> known to the xl command and all domains, so we utilize xenstore to keep
>> this information. A current plan is to place the information under
>> /local/shared_mem/ID. Still take the above config files as an example:
>>
>> If we instantiate vm1, vm2 and vm3, one after another,
>> “xenstore ls -f” should output something like this:
>>
>>
>> After VM1 was instantiated, the output of “xenstore ls -f”
>> will be something like this:
>>
>>     /local/shared_mem/ID1/owner = dom_id_of_vm1
>>
>>     /local/shared_mem/ID1/size = sizeof_addr_range1
>>
>>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
>>
>>
>>     /local/shared_mem/ID2/owner = dom_id_of_vm1
>>
>>     /local/shared_mem/ID2/size = sizeof_addr_range1
>>
>>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
>>
>>
>> After VM2 was instantiated, the following new lines will appear:
>>
>>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
>>
>>
>> After VM2 was instantiated, the following new lines will appear:
>>
>>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
>>
>> When we encounter an id IDx during "xl create":
>>
>>   + If it’s not under /local/shared_mem, create the corresponding entries
>>      (owner, size, and mappings) in xenstore, and allocate the memory from
>>      the newly created domain.
>>
>>   + If it’s found under /local/shared_mem, map the pages to the newly
>>       created domain, and add the current domain to
>>       /local/shared_mem/IDx/mappings.
>>
>
> Again, please think about destruction as well.
>
> At this point I think modelling after POSIX shared memory makes more
> sense. That is, there isn't one "owner" for the memory. You get hold of
> the shared memory via a key (ID in your case?).
>

Actually, I've thought about the same question and have discussed this with
Julien and Stefano. And this is what they told me:

Stefano wrote:
"I think that in your scenario Xen (the hypervisor) wouldn't allow the
first domain to be completely destroyed because it knows that its
memory is still in use by something else in the system. The domain
remains in a zombie state until the memory is not used anymore. We need
to double-check this, but I don't think it will be a problem."

and Julien wrote:
"That's correct. A domain will not be destroyed until all the memory
associated to it will be freed.
A page will be considered free when all the reference on it will be
removed. This means that if the domain who allocated the page die, it
will not be fully destroyed until the page is not used by another
domain.
This is assuming that every domain using the page is taking a
reference (similar to foreign mapping). Actually, I think we might be
able to re-use the mapspace XENMAPSPACE_gmfn_foreign.
Actually, I think we can re-use the same mechanism as foreign mapping (see
Note that Xen on ARM (and x86?) does not take reference when mapping a
page to a stage-2 page table (e.g the page table holding the
translation between a guest physical address and host physical
address)."

I've also thought about modeling after the POSIX way of sharing memory.
If we do so, the owner of the shared pages should be Dom0, and we
will have to do the reference counting ourselves, and free pages when they're
no longer needed. I'm not sure which method is better. What do you say?

>
> I'm not entirely sure if xenstore is right location to store such
> information, but I don't have other suggestions either. Maybe when I
> have better understanding of the problem I can make better suggestions.
>
>> ======================================
>> 2.3 mapping the memory areas
>> ======================================
>> Handle the newly added config option in tools/{xl, libxl}
>> and utilize toos/libxc to do the actual memory mapping
>>
>>
>> ======================================
>> 2.4 error handling
>> ======================================
>> Add code to handle various errors: Invalid address,
>> mismatched length of memory area etc.
>>
>> ====================================================
>> 3. Expected Outcomes/Goals:
>> ====================================================
>> A new VM config option in xl will be introduced, allowing users to setup
>> several shared memory areas for inter-VMs communications.
>> This should work on both x86 and ARM.
>>
>> [See also: https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Share_a_page_in_memory_from_the_VM_config_file]
>>
>
> Overall I think this document is a good starting point. There are
> details to hash out but we have a lot of time for that.
>
> Wei.

Thanks for your comments. Hope to hear back from you soon.

Cheers,

Zhongze Liu


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-12 17:01 Proposal to allow setting up shared memory areas between VMs from xl config file Zhongze Liu
  2017-05-12 17:51 ` Wei Liu
@ 2017-05-15  8:08 ` Jan Beulich
  2017-05-15  8:20   ` Julien Grall
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2017-05-15  8:08 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: Ian Jackson, Julien Grall, Stefano Stabellini, Wei Liu, xen-devel

>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
> ====================================================
> 1. Motivation and Description
> ====================================================
> Virtual machines use grant table hypercalls to setup a share page for
> inter-VMs communications. These hypercalls are used by all PV
> protocols today. However, very simple guests, such as baremetal
> applications, might not have the infrastructure to handle the grant table.
> This project is about setting up several shared memory areas for inter-VMs
> communications directly from the VM config file.
> So that the guest kernel doesn't have to have grant table support to be
> able to communicate with other guests.

I think it would help to compare your proposal with the alternative of
adding grant table infrastructure to such environments (which I
wouldn't expect to be all that difficult). After all, introduction of a
(seemingly) redundant mechanism comes at the price of extra /
duplicate code in the tool stack and maybe even in the hypervisor.
Hence there needs to be a meaningfully higher gain than price here.

Jan



* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15  8:08 ` Jan Beulich
@ 2017-05-15  8:20   ` Julien Grall
  2017-05-15  8:52     ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Julien Grall @ 2017-05-15  8:20 UTC (permalink / raw)
  To: Jan Beulich, Zhongze Liu
  Cc: Ian Jackson, nd, Wei Liu, Stefano Stabellini, xen-devel

Hi Jan,

On 15/05/2017 09:08, Jan Beulich wrote:
>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
>> ====================================================
>> 1. Motivation and Description
>> ====================================================
>> Virtual machines use grant table hypercalls to setup a share page for
>> inter-VMs communications. These hypercalls are used by all PV
>> protocols today. However, very simple guests, such as baremetal
>> applications, might not have the infrastructure to handle the grant table.
>> This project is about setting up several shared memory areas for inter-VMs
>> communications directly from the VM config file.
>> So that the guest kernel doesn't have to have grant table support to be
>> able to communicate with other guests.
>
> I think it would help to compare your proposal with the alternative of
> adding grant table infrastructure to such environments (which I
> wouldn't expect to be all that difficult). After all introduction of a
> (seemingly) redundant mechanism comes at the price of extra /
> duplicate code in the tool stack and maybe even in the hypervisor.
> Hence there needs to be a meaningfully higher gain than price here.

This is a key feature for embedded, because they want to be able to share
buffers very easily at domain creation time between two guests.

Adding the grant table driver in the guest OS has a high cost when the
goal is to run an unmodified OS in a VM. This is achievable on ARM if you
use passthrough.

Cheers,

-- 
Julien Grall


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15  8:20   ` Julien Grall
@ 2017-05-15  8:52     ` Jan Beulich
  2017-05-15 10:21       ` Julien Grall
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2017-05-15  8:52 UTC (permalink / raw)
  To: Julien Grall, Zhongze Liu
  Cc: Ian Jackson, nd, Wei Liu, Stefano Stabellini, xen-devel

>>> On 15.05.17 at 10:20, <julien.grall@arm.com> wrote:
> On 15/05/2017 09:08, Jan Beulich wrote:
>>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
>>> ====================================================
>>> 1. Motivation and Description
>>> ====================================================
>>> Virtual machines use grant table hypercalls to setup a share page for
>>> inter-VMs communications. These hypercalls are used by all PV
>>> protocols today. However, very simple guests, such as baremetal
>>> applications, might not have the infrastructure to handle the grant table.
>>> This project is about setting up several shared memory areas for inter-VMs
>>> communications directly from the VM config file.
>>> So that the guest kernel doesn't have to have grant table support to be
>>> able to communicate with other guests.
>>
>> I think it would help to compare your proposal with the alternative of
>> adding grant table infrastructure to such environments (which I
>> wouldn't expect to be all that difficult). After all introduction of a
>> (seemingly) redundant mechanism comes at the price of extra /
>> duplicate code in the tool stack and maybe even in the hypervisor.
>> Hence there needs to be a meaningfully higher gain than price here.
> 
> This is a key feature for embedded because they want to be able to share 
> buffer very easily at domain creation time between two guests.
> 
> Adding the grant table driver in the guest OS as a high a cost when the 
> goal is to run unmodified OS in a VM. This is achievable on ARM if you 
> use passthrough.

"high cost" is pretty abstract and vague. And I admit I have difficulty
seeing how an entirely unmodified OS could leverage this newly
proposed sharing model.

Jan



* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15  8:52     ` Jan Beulich
@ 2017-05-15 10:21       ` Julien Grall
  2017-05-15 12:28         ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Julien Grall @ 2017-05-15 10:21 UTC (permalink / raw)
  To: Jan Beulich, Zhongze Liu
  Cc: Ian Jackson, nd, Wei Liu, Stefano Stabellini, xen-devel

Hi Jan,

On 05/15/2017 09:52 AM, Jan Beulich wrote:
>>>> On 15.05.17 at 10:20, <julien.grall@arm.com> wrote:
>> On 15/05/2017 09:08, Jan Beulich wrote:
>>>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
>>>> ====================================================
>>>> 1. Motivation and Description
>>>> ====================================================
>>>> Virtual machines use grant table hypercalls to setup a share page for
>>>> inter-VMs communications. These hypercalls are used by all PV
>>>> protocols today. However, very simple guests, such as baremetal
>>>> applications, might not have the infrastructure to handle the grant table.
>>>> This project is about setting up several shared memory areas for inter-VMs
>>>> communications directly from the VM config file.
>>>> So that the guest kernel doesn't have to have grant table support to be
>>>> able to communicate with other guests.
>>>
>>> I think it would help to compare your proposal with the alternative of
>>> adding grant table infrastructure to such environments (which I
>>> wouldn't expect to be all that difficult). After all introduction of a
>>> (seemingly) redundant mechanism comes at the price of extra /
>>> duplicate code in the tool stack and maybe even in the hypervisor.
>>> Hence there needs to be a meaningfully higher gain than price here.
>>
>> This is a key feature for embedded because they want to be able to share
>> buffer very easily at domain creation time between two guests.
>>
>> Adding the grant table driver in the guest OS as a high a cost when the
>> goal is to run unmodified OS in a VM. This is achievable on ARM if you
>> use passthrough.
>
> "high cost" is pretty abstract and vague. And I admit I have difficulty
> seeing how an entirely unmodified OS could leverage this newly
> proposed sharing model.

Let's step back for a moment; I will come back to Zhongze's proposal
afterwards.

Using grant tables in the guest will obviously require the grant-table
driver. It is not that bad. However, how do you pass the grant ref
number to the other guest? The only way I can see is xenstore, so yet
another driver to port.

In Zhongze's proposal, the shared page will be mapped at a specific
address in guest memory. I agree this will require some work in the
toolstack; on the hypervisor side we could re-use the foreign mapping
API. But on the guest side there is nothing Xen-specific to do.

What's the benefit? Baremetal guests are usually tiny; you could use the
device tree (and hence a generic way) to present the shared page for
communication. This means no Xen PV drivers, and therefore it is easier
to move an OS into a Xen VM.

Cheers,

-- 
Julien Grall


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 10:21       ` Julien Grall
@ 2017-05-15 12:28         ` Jan Beulich
  2017-05-15 14:13           ` Julien Grall
  2017-05-15 17:40           ` Stefano Stabellini
  0 siblings, 2 replies; 31+ messages in thread
From: Jan Beulich @ 2017-05-15 12:28 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson, xen-devel, nd

>>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
> On 05/15/2017 09:52 AM, Jan Beulich wrote:
>>>>> On 15.05.17 at 10:20, <julien.grall@arm.com> wrote:
>>> On 15/05/2017 09:08, Jan Beulich wrote:
>>>>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
>>>>> ====================================================
>>>>> 1. Motivation and Description
>>>>> ====================================================
>>>>> Virtual machines use grant table hypercalls to setup a share page for
>>>>> inter-VMs communications. These hypercalls are used by all PV
>>>>> protocols today. However, very simple guests, such as baremetal
>>>>> applications, might not have the infrastructure to handle the grant table.
>>>>> This project is about setting up several shared memory areas for inter-VMs
>>>>> communications directly from the VM config file.
>>>>> So that the guest kernel doesn't have to have grant table support to be
>>>>> able to communicate with other guests.
>>>>
>>>> I think it would help to compare your proposal with the alternative of
>>>> adding grant table infrastructure to such environments (which I
>>>> wouldn't expect to be all that difficult). After all introduction of a
>>>> (seemingly) redundant mechanism comes at the price of extra /
>>>> duplicate code in the tool stack and maybe even in the hypervisor.
>>>> Hence there needs to be a meaningfully higher gain than price here.
>>>
>>> This is a key feature for embedded because they want to be able to share
>>> buffer very easily at domain creation time between two guests.
>>>
>>> Adding the grant table driver in the guest OS as a high a cost when the
>>> goal is to run unmodified OS in a VM. This is achievable on ARM if you
>>> use passthrough.
>>
>> "high cost" is pretty abstract and vague. And I admit I have difficulty
>> seeing how an entirely unmodified OS could leverage this newly
>> proposed sharing model.
> 
> Let's step back for a moment, I will come back on Zhongze proposal 
> afterwards.
> 
> Using grant table in the guest will obviously require the grant-table 
> driver. It is not that bad. However, how do you pass the grant ref 
> number to the other guest? The only way I can see is xenstore, so yet 
> another driver to port.

Just look at the amount of code that was needed to get PV drivers
to work in x86 HVM guests. It's not all that much. Plus making such
code available in a new environment doesn't normally mean everything
needs to be written from scratch.

> On Zhongze proposal, the share page will be mapped at the a specific 
> address in the guest memory. I agree this will require some work in the 
> toolstack, on the hypervisor side we could re-use the foreign mapping 
> API. But on the guest side there are nothing to do Xen specific.

So what is the equivalent of the shared page on bare hardware?

> What's the benefit? Baremetal guest are usually tiny, you could use the 
> device-tree (and hence generic way) to present the share page for 
> communicating. This means no Xen PV drivers, and therefore easier to 
> move an OS in Xen VM.

Is this intended to be an ARM-specific extension, or a generic one?
There's no DT on x86 to pass such information, and I can't easily
see alternatives there. Also the consumer of the shared page info
is still a PV component of the guest. You simply can't have an
entirely unmodified guest which at the same time is Xen (or
whatever other component sits at the other end of the shared
page) aware.

Jan



* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 12:28         ` Jan Beulich
@ 2017-05-15 14:13           ` Julien Grall
  2017-05-15 14:25             ` Jan Beulich
  2017-05-15 17:40           ` Stefano Stabellini
  1 sibling, 1 reply; 31+ messages in thread
From: Julien Grall @ 2017-05-15 14:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson, xen-devel, nd

Hi Jan,

On 15/05/17 13:28, Jan Beulich wrote:
>>>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
>> On 05/15/2017 09:52 AM, Jan Beulich wrote:
>>>>>> On 15.05.17 at 10:20, <julien.grall@arm.com> wrote:
>>>> On 15/05/2017 09:08, Jan Beulich wrote:
>>>>>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
>>>>>> ====================================================
>>>>>> 1. Motivation and Description
>>>>>> ====================================================
>>>>>> Virtual machines use grant table hypercalls to setup a share page for
>>>>>> inter-VMs communications. These hypercalls are used by all PV
>>>>>> protocols today. However, very simple guests, such as baremetal
>>>>>> applications, might not have the infrastructure to handle the grant table.
>>>>>> This project is about setting up several shared memory areas for inter-VMs
>>>>>> communications directly from the VM config file.
>>>>>> So that the guest kernel doesn't have to have grant table support to be
>>>>>> able to communicate with other guests.
>>>>>
>>>>> I think it would help to compare your proposal with the alternative of
>>>>> adding grant table infrastructure to such environments (which I
>>>>> wouldn't expect to be all that difficult). After all introduction of a
>>>>> (seemingly) redundant mechanism comes at the price of extra /
>>>>> duplicate code in the tool stack and maybe even in the hypervisor.
>>>>> Hence there needs to be a meaningfully higher gain than price here.
>>>>
>>>> This is a key feature for embedded because they want to be able to share
>>>> buffer very easily at domain creation time between two guests.
>>>>
>>>> Adding the grant table driver in the guest OS as a high a cost when the
>>>> goal is to run unmodified OS in a VM. This is achievable on ARM if you
>>>> use passthrough.
>>>
>>> "high cost" is pretty abstract and vague. And I admit I have difficulty
>>> seeing how an entirely unmodified OS could leverage this newly
>>> proposed sharing model.
>>
>> Let's step back for a moment, I will come back on Zhongze proposal
>> afterwards.
>>
>> Using grant table in the guest will obviously require the grant-table
>> driver. It is not that bad. However, how do you pass the grant ref
>> number to the other guest? The only way I can see is xenstore, so yet
>> another driver to port.
>
> Just look at the amount of code that was needed to get PV drivers
> to work in x86 HVM guests. It's not all that much. Plus making such
> available in a new environment doesn't normally mean everything
> needs to be written from scratch.

Even if PV drivers don't need to be written from scratch, porting them to
a new OS has a certain cost. By trying to make most of the VM
interface agnostic to Xen, we potentially allow vendors to switch to Xen
more easily.

>
>> On Zhongze proposal, the share page will be mapped at the a specific
>> address in the guest memory. I agree this will require some work in the
>> toolstack, on the hypervisor side we could re-use the foreign mapping
>> API. But on the guest side there are nothing to do Xen specific.
>
> So what is the equivalent of the shared page on bare hardware?
>
>> What's the benefit? Baremetal guest are usually tiny, you could use the
>> device-tree (and hence generic way) to present the share page for
>> communicating. This means no Xen PV drivers, and therefore easier to
>> move an OS in Xen VM.
>
> Is this intended to be an ARM-specific extension, or a generic one?
> There's no DT on x86 to pass such information, and I can't easily
> see alternatives there. Also the consumer of the shared page info
> is still a PV component of the guest. You simply can't have an
> entirely unmodified guest which at the same time is Xen (or
> whatever other component sits at the other end of the shared
> page) aware.

The toolstack will set up the shared page in both the producer and
consumer guest. This will be set up during domain creation. My
understanding is that it will not be possible to share a page after the
two domains have been created.

This feature is not meant to replace grant tables, but to ease
sharing a page between two guests without introducing any Xen knowledge
in either guest.

Cheers,

-- 
Julien Grall


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 14:13           ` Julien Grall
@ 2017-05-15 14:25             ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2017-05-15 14:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson, xen-devel, nd

>>> On 15.05.17 at 16:13, <julien.grall@arm.com> wrote:
> On 15/05/17 13:28, Jan Beulich wrote:
>>>>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
>>> On Zhongze proposal, the share page will be mapped at the a specific
>>> address in the guest memory. I agree this will require some work in the
>>> toolstack, on the hypervisor side we could re-use the foreign mapping
>>> API. But on the guest side there are nothing to do Xen specific.
>>
>> So what is the equivalent of the shared page on bare hardware?

No answer here?

>>> What's the benefit? Baremetal guest are usually tiny, you could use the
>>> device-tree (and hence generic way) to present the share page for
>>> communicating. This means no Xen PV drivers, and therefore easier to
>>> move an OS in Xen VM.
>>
>> Is this intended to be an ARM-specific extension, or a generic one?
>> There's no DT on x86 to pass such information, and I can't easily
>> see alternatives there. Also the consumer of the shared page info
>> is still a PV component of the guest. You simply can't have an
>> entirely unmodified guest which at the same time is Xen (or
>> whatever other component sits at the other end of the shared
>> page) aware.
> 
> The toolstack will setup the shared page in both the producer and 
> consumer guest. This will be setup during the domain creation. My 
> understanding is it will not be possible to share page after the two 
> domains have been created.

Whether this is going to become too limiting remains to be seen, but
in any event this doesn't answer my question regarding the x86 side.

Jan



* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-13  2:28   ` Zhongze Liu
@ 2017-05-15 16:46     ` Wei Liu
  2017-05-18 18:09       ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: Wei Liu @ 2017-05-15 16:46 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Wei Liu, Ian Jackson

On Sat, May 13, 2017 at 10:28:27AM +0800, Zhongze Liu wrote:
> 2017-05-13 1:51 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> > Hi Zhongze
> >
> > This is a nice write-up. Some comments below. Feel free to disagree with
> > what I say below, this is more a discussion than picking on your design
> > or plan.
> >
> 
> HI, Wei Liu
> 
> Thanks for your time reading through my proposal.
> 
> >
> > On Sat, May 13, 2017 at 01:01:39AM +0800, Zhongze Liu wrote:
> >> Hi, Xen developers,
> >>
> >> I'm Zhongze Liu, a GSoC student of this year. Glad to meet you in the
> >> Xen Project.  As an initial step to implementing my GSoC proposal, which
> >> is still a draft,  I'm posting it here. And hope to hear from you your
> >> suggestions.
> >>
> >> ====================================================
> >> 1. Motivation and Description
> >> ====================================================
> >> Virtual machines use grant table hypercalls to setup a share page for
> >> inter-VMs communications. These hypercalls are used by all PV
> >> protocols today. However, very simple guests, such as baremetal
> >> applications, might not have the infrastructure to handle the grant table.
> >> This project is about setting up several shared memory areas for inter-VMs
> >> communications directly from the VM config file.
> >> So that the guest kernel doesn't have to have grant table support to be
> >> able to communicate with other guests.
> >>
> >> ====================================================
> >> 2. Implementation Plan:
> >> ====================================================
> >>
> >> ======================================
> >> 2.1 Introduce a new VM config option in xl:
> >> ======================================
> >> The shared areas should be shareable among several VMs,
> >> every shared physical memory area is assigned to a set of VMs.
> >> Therefore, a “token” or “identifier” should be used here to uniquely
> >> identify a backing memory area.
> >>
> >>
> >> I would suggest using an unsigned integer to serve as the identifier.
> >> For example:
> >>
> >> In xl config file of vm1:
> >>
> >>     static_shared_mem = [“addr_range1= ID1”, “addr_range2 = ID2”]
> >>
> >> In xl config file of vm2:
> >>
> >>     static_shared_mem = [“addr_range3 = ID1”]
> >>
> >> In xl config file of vm3:
> >>
> >>     static_shared_mem = [“addr_range4 = ID2”]
> >
> > I can envisage you need some more attributes: what about the attributes
> > like RW / RO / WO (or even X)?
> >
> > Also, I assume the granularity of the mapping is a page, but as far as I
> > can tell there are two page granularity on ARM, you do need to consider
> > both and what should happen if you mix and match them. What about
> > mapping several pages and different VM use overlapping ranges?
> >
> > Can you give some concrete examples? What does addr_rangeX look like in
> > practice?
> >
> >
> 
> Yes, those attributes are necessary and should be explicitly specified in the
> config file. I'll add them in the next version of this proposal. And taking the
> granularity into consideration, what do you say if we change the entries into
> something like:
> 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.

I realised I may have gone too far after reading your reply.

What is the end purpose of this project? If you only want to insert an
mfn into the guest address space and don't care how the guest is going to
map it, you can omit the prot= part. If you want stricter control, you
will need it -- and that would also have implications on the
hypervisor code you need.

I suggest you write the manual for the new mechanism you propose first.
That way you describe the feature in a sysadmin-friendly way.  Describe
the syntax, the effect of the new mechanism, and how people are supposed
to use it and under what circumstances.

> 
> >
> >>
> >>
> >> In the example above. A memory area A1 will be shared between
> >> vm1 and vm2 -- vm1 can access this area using addr_range1
> >> and vm2 using addr_range3. Likewise, a memory area A2 will be
> >> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
> >> and vm3 using addr_range4.
> >>
> >> The shared memory area denoted by an identifier IDx will be
> >> allocated when it first appear, and the memory pages will be taken from
> >> the first VM whose static_shared_mem list contains IDx. Take the above
> >> config files for example, if we instantiate vm1, vm2 and vm3, one after
> >> another, the memory areas denoted by ID1 and ID2 will both be allocated
> >> in and taken from vm1.
> >
> > Hmm... I can see some potential hazards. Currently, multiple xl processes
> > are serialized by a lock, and your assumption is the creation is done in
> > order, but suppose sometime later they can run in parallel. When you
> > have several "xl create" and they race with each other, what will
> > happen?
> >
> > This can be solved by serializing in libxl or hypervisor, I think.
> > It is up to you to choose where to do it.
> >
> > Also, please consider what happens when you destroy the owner domain
> > before the rest. Proper reference counting should be done in the
> > hypervisor.
> >
> 
> Yes, the access to xenstore and other shared data should be serialized
> using some kind of lock.
> 
> >
> >>
> >> ======================================
> >> 2.2 Store the mem-sharing information in xenstore
> >> ======================================
> >> This information should include the length and owner of the area. And
> >> it should also include information about where the backing memory areas
> >> are mapped in every VM that are using it. This information should be
> >> known to the xl command and all domains, so we utilize xenstore to keep
> >> this information. A current plan is to place the information under
> >> /local/shared_mem/ID. Still take the above config files as an example:
> >>
> >> If we instantiate vm1, vm2 and vm3, one after another,
> >> “xenstore ls -f” should output something like this:
> >>
> >>
> >> After VM1 was instantiated, the output of “xenstore ls -f”
> >> will be something like this:
> >>
> >>     /local/shared_mem/ID1/owner = dom_id_of_vm1
> >>
> >>     /local/shared_mem/ID1/size = sizeof_addr_range1
> >>
> >>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
> >>
> >>
> >>     /local/shared_mem/ID2/owner = dom_id_of_vm1
> >>
> >>     /local/shared_mem/ID2/size = sizeof_addr_range1
> >>
> >>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
> >>
> >>
> >> After VM2 was instantiated, the following new lines will appear:
> >>
> >>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
> >>
> >>
> >> After VM2 was instantiated, the following new lines will appear:
> >>
> >>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
> >>
> >> When we encounter an id IDx during "xl create":
> >>
> >>   + If it’s not under /local/shared_mem, create the corresponding entries
> >>      (owner, size, and mappings) in xenstore, and allocate the memory from
> >>      the newly created domain.
> >>
> >>   + If it’s found under /local/shared_mem, map the pages to the newly
> >>       created domain, and add the current domain to
> >>       /local/shared_mem/IDx/mappings.
> >>
> >
> > Again, please think about destruction as well.
> >
> > At this point I think modelling after POSIX shared memory makes more
> > sense. That is, there isn't one "owner" for the memory. You get hold of
> > the shared memory via a key (ID in your case?).
> >
> 
> Actually, I've thought about the same question and have discussed this with
> Julien and Stefano. And this what they told me:
> 
> Stefano wrote:
> "I think that in your scenario Xen (the hypervisor) wouldn't allow the
> first domain to be completely destroyed because it knows that its
> memory is still in use by something else in the system. The domain
> remains in a zombie state until the memory is not used anymore. We need
> to double-check this, but I don't think it will be a problem."
> 

This has security implications -- a rogue guest can prevent the
destruction of the owner.

> and Julien wrote:
> "That's correct. A domain will not be destroyed until all the memory
> associated to it will be freed.
> A page will be considered free when all the reference on it will be
> removed. This means that if the domain who allocated the page die, it
> will not be fully destroyed until the page is not used by another
> domain.
> This is assuming that every domain using the page is taking a
> reference (similar to foreign mapping). Actually, I think we might be
> able to re-use the mapspace XENMAPSPACE_gmfn_foreign.
> Actually, I think we can re-use the same mechanism as foreign mapping (see
> Note that Xen on ARM (and x86?) does not take reference when mapping a
> page to a stage-2 page table (e.g the page table holding the
> translation between a guest physical address and host physical
> address)."
> 
> I've also thought about modeling after the POSIX way of sharing memory.
> If we do so, the owner of the shared pages should be Dom0, and we
> will have to do the reference counting ourselves, and free pages when they're
> no longer needed. I'm not sure which method is better. What do you say?
> 

Assigning the page to Dom0 doesn't sound right to me either.

But the first step should really be defining the scope of the project.
Technical details will follow naturally.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 12:28         ` Jan Beulich
  2017-05-15 14:13           ` Julien Grall
@ 2017-05-15 17:40           ` Stefano Stabellini
  2017-05-16 10:11             ` Jan Beulich
  2017-05-16 11:04             ` Ian Jackson
  1 sibling, 2 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-15 17:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson,
	Julien Grall, xen-devel, nd

On Mon, 15 May 2017, Jan Beulich wrote:
> >>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
> > On 05/15/2017 09:52 AM, Jan Beulich wrote:
> >>>>> On 15.05.17 at 10:20, <julien.grall@arm.com> wrote:
> >>> On 15/05/2017 09:08, Jan Beulich wrote:
> >>>>>>> On 12.05.17 at 19:01, <blackskygg@gmail.com> wrote:
> >>>>> ====================================================
> >>>>> 1. Motivation and Description
> >>>>> ====================================================
> >>>>> Virtual machines use grant table hypercalls to setup a share page for
> >>>>> inter-VMs communications. These hypercalls are used by all PV
> >>>>> protocols today. However, very simple guests, such as baremetal
> >>>>> applications, might not have the infrastructure to handle the grant table.
> >>>>> This project is about setting up several shared memory areas for inter-VMs
> >>>>> communications directly from the VM config file.
> >>>>> So that the guest kernel doesn't have to have grant table support to be
> >>>>> able to communicate with other guests.
> >>>>
> >>>> I think it would help to compare your proposal with the alternative of
> >>>> adding grant table infrastructure to such environments (which I
> >>>> wouldn't expect to be all that difficult). After all introduction of a
> >>>> (seemingly) redundant mechanism comes at the price of extra /
> >>>> duplicate code in the tool stack and maybe even in the hypervisor.
> >>>> Hence there needs to be a meaningfully higher gain than price here.
> >>>
> >>> This is a key feature for embedded because they want to be able to share
> >>> buffer very easily at domain creation time between two guests.
> >>>
> >>> Adding the grant table driver in the guest OS as a high a cost when the
> >>> goal is to run unmodified OS in a VM. This is achievable on ARM if you
> >>> use passthrough.
> >>
> >> "high cost" is pretty abstract and vague. And I admit I have difficulty
> >> seeing how an entirely unmodified OS could leverage this newly
> >> proposed sharing model.
> > 
> > Let's step back for a moment, I will come back on Zhongze proposal 
> > afterwards.
> > 
> > Using grant table in the guest will obviously require the grant-table 
> > driver. It is not that bad. However, how do you pass the grant ref 
> > number to the other guest? The only way I can see is xenstore, so yet 
> > another driver to port.
> 
> Just look at the amount of code that was needed to get PV drivers
> to work in x86 HVM guests. It's not all that much. Plus making such
> available in a new environment doesn't normally mean everything
> needs to be written from scratch.

The requirement is to allow shared communication between unmodified
bare-metal applications. These applications are extremely simple and
lack the basic infrastructure that an operating system has, nor would
they want to introduce it. I have been hearing this request from
embedded people for months now.


> > On Zhongze proposal, the share page will be mapped at the a specific 
> > address in the guest memory. I agree this will require some work in the 
> > toolstack, on the hypervisor side we could re-use the foreign mapping 
> > API. But on the guest side there are nothing to do Xen specific.
> 
> So what is the equivalent of the shared page on bare hardware?

Bare-metal apps already have the concept of a shared page to communicate
with hardware devices, co-processors and other hardware/firmware
intercommunication frameworks.


> > What's the benefit? Baremetal guest are usually tiny, you could use the 
> > device-tree (and hence generic way) to present the share page for 
> > communicating. This means no Xen PV drivers, and therefore easier to 
> > move an OS in Xen VM.
> 
> Is this intended to be an ARM-specific extension, or a generic one?
> There's no DT on x86 to pass such information, and I can't easily
> see alternatives there. Also the consumer of the shared page info
> is still a PV component of the guest. You simply can't have an
> entirely unmodified guest which at the same time is Xen (or
> whatever other component sits at the other end of the shared
> page) aware.

I was going to propose that this work be arch-neutral. However, it is
true that with the existing x86 software and hardware ecosystem, it
wouldn't be much use there. Given that the work is technically common
though, I don't see any downsides to enabling it on x86 on the off
chance that somebody will find it useful. However, if you prefer to
keep it ARM only, that's fine by me too.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 17:40           ` Stefano Stabellini
@ 2017-05-16 10:11             ` Jan Beulich
  2017-05-16 18:16               ` Stefano Stabellini
  2017-05-16 11:04             ` Ian Jackson
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2017-05-16 10:11 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Zhongze Liu, Ian Jackson, Julien Grall, xen-devel, nd

>>> On 15.05.17 at 19:40, <sstabellini@kernel.org> wrote:
> On Mon, 15 May 2017, Jan Beulich wrote:
>> >>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
>> > On Zhongze proposal, the share page will be mapped at the a specific 
>> > address in the guest memory. I agree this will require some work in the 
>> > toolstack, on the hypervisor side we could re-use the foreign mapping 
>> > API. But on the guest side there are nothing to do Xen specific.
>> 
>> So what is the equivalent of the shared page on bare hardware?
> 
> Bare-metal apps already have the concept of a shared page to communicate
> with hardware devices, co-processors and other hardware/firmare
> intercommunication frameworks.

So with that, is one side of the communication here then intended to
emulate such a hardware device, co-processor or other hardware /
firmware intercommunication framework? If so, aren't we talking
about device emulation then? If not, how can such a bare metal app
know the protocol (after all, if the protocol is Xen-specific, the app
wouldn't be Xen-unaware anymore)?

>> > What's the benefit? Baremetal guest are usually tiny, you could use the 
>> > device-tree (and hence generic way) to present the share page for 
>> > communicating. This means no Xen PV drivers, and therefore easier to 
>> > move an OS in Xen VM.
>> 
>> Is this intended to be an ARM-specific extension, or a generic one?
>> There's no DT on x86 to pass such information, and I can't easily
>> see alternatives there. Also the consumer of the shared page info
>> is still a PV component of the guest. You simply can't have an
>> entirely unmodified guest which at the same time is Xen (or
>> whatever other component sits at the other end of the shared
>> page) aware.
> 
> I was going to propose for this work to be arch-neutral. However, it is
> true that with the existing x86 software and hardware ecosystem, it
> wouldn't be much use there. Given that the work is technically common
> though, I don't see any downsides on enabling it on x86 on the off
> chance that somebody will find it useful. However, if you prefer to
> keep it ARM only, that's fine by me too.

I don't have a preference either way, but if you do it in an arch-neutral
way, then the manifestation of the frame numbers also needs to be
arch-neutral, in which case DT is not a suitable vehicle.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 17:40           ` Stefano Stabellini
  2017-05-16 10:11             ` Jan Beulich
@ 2017-05-16 11:04             ` Ian Jackson
  2017-05-16 18:08               ` Stefano Stabellini
  1 sibling, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2017-05-16 11:04 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Zhongze Liu, Julien Grall, Jan Beulich, xen-devel, nd

Stefano Stabellini writes ("Re: [Xen-devel] Proposal to allow setting up shared memory areas between VMs from xl config file"):
> Bare-metal apps already have the concept of a shared page to communicate
> with hardware devices, co-processors and other hardware/firmare
> intercommunication frameworks.

I think this discussion is rather too abstract.  Can you give an
example of a pair of apps that would usefully communicate using a
shared memory page in this way ?  Is one of these apps an unmodified
bare metal app ?

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-16 11:04             ` Ian Jackson
@ 2017-05-16 18:08               ` Stefano Stabellini
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-16 18:08 UTC (permalink / raw)
  To: Ian Jackson
  Cc: edgar.iglesias, Stefano Stabellini, Wei Liu, Zhongze Liu,
	Julien Grall, Jan Beulich, xen-devel, nd, Jarvis.Roach

On Tue, 16 May 2017, Ian Jackson wrote:
> I think this discussion is rather too abstract.  Can you give an
> example of a pair of apps that would usefully communicate using a
> shared memory page in this way ?  Is one of these apps an unmodified
> bare metal app ?

That's a good idea. I CC'ed a couple of people that have more
information than me on the customer use-cases. They'll be able to tell
you more about them.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-16 10:11             ` Jan Beulich
@ 2017-05-16 18:16               ` Stefano Stabellini
  2017-05-19 16:52                 ` Zhongze Liu
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-16 18:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson,
	Julien Grall, xen-devel, nd

On Tue, 16 May 2017, Jan Beulich wrote:
> >>> On 15.05.17 at 19:40, <sstabellini@kernel.org> wrote:
> > On Mon, 15 May 2017, Jan Beulich wrote:
> >> >>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
> >> > On Zhongze proposal, the share page will be mapped at the a specific 
> >> > address in the guest memory. I agree this will require some work in the 
> >> > toolstack, on the hypervisor side we could re-use the foreign mapping 
> >> > API. But on the guest side there are nothing to do Xen specific.
> >> 
> >> So what is the equivalent of the shared page on bare hardware?
> > 
> > Bare-metal apps already have the concept of a shared page to communicate
> > with hardware devices, co-processors and other hardware/firmare
> > intercommunication frameworks.
> 
> So with that, is one side of the communication here then intended to
> emulate such a hardware device, co-processor or other hardware /
> firmware intercommunication framework? If so, aren't we talking
> about device emulation then? If not, how can such a bare metal app
> know the protocol (after all, if the protocol is Xen-specific, the app
> wouldn't be Xen-unaware anymore)?

They would have to come up with a protocol. However, they already have
code to deal with shared rings in their baremetal apps. It's not hard
for them to do so, it is not hypervisor specific, and it is similar to
the way they already work. On the other hand, they lack the code to
deal with hypercalls, event channels and grant tables. In fact, they
don't have Xen support.
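
Just to illustrate the point (this is not part of Zhongze's proposal, and
every constant below is made up for the example), a bare-metal consumer of
such a statically shared page could look roughly like the sketch below.
Note that there is nothing Xen-specific in it -- no hypercalls, grant
references or event channels, only accesses to an agreed guest physical
address:

    #include <stdint.h>

    /* Guest physical address at which the shared page was mapped via the
     * VM config file -- an assumption made up for this sketch. */
    #define SHMEM_BASE  0x70000000UL
    #define RING        ((struct shared_ring *)SHMEM_BASE)
    #define RING_SLOTS  62

    struct shared_ring {
        volatile uint32_t prod;             /* written by the producer  */
        volatile uint32_t cons;             /* written by the consumer  */
        volatile uint64_t msg[RING_SLOTS];  /* payload slots            */
    };

    /* Producer side: returns 0 on success, -1 if the ring is full. */
    static int ring_put(uint64_t m)
    {
        uint32_t p = RING->prod;

        if (p - RING->cons == RING_SLOTS)
            return -1;                      /* full */
        RING->msg[p % RING_SLOTS] = m;
        __sync_synchronize();               /* publish payload before index */
        RING->prod = p + 1;
        return 0;
    }

    /* Consumer side: busy-waits until a message is available. */
    static uint64_t ring_get(void)
    {
        uint32_t c = RING->cons;
        uint64_t m;

        while (RING->prod == c)
            ;                               /* poll -- no event channels */
        __sync_synchronize();
        m = RING->msg[c % RING_SLOTS];
        RING->cons = c + 1;
        return m;
    }

The protocol itself (message layout, who produces and who consumes) is
entirely up to the two apps, which is exactly the situation they are
already used to when talking to devices or firmware.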


> >> > What's the benefit? Baremetal guest are usually tiny, you could use the 
> >> > device-tree (and hence generic way) to present the share page for 
> >> > communicating. This means no Xen PV drivers, and therefore easier to 
> >> > move an OS in Xen VM.
> >> 
> >> Is this intended to be an ARM-specific extension, or a generic one?
> >> There's no DT on x86 to pass such information, and I can't easily
> >> see alternatives there. Also the consumer of the shared page info
> >> is still a PV component of the guest. You simply can't have an
> >> entirely unmodified guest which at the same time is Xen (or
> >> whatever other component sits at the other end of the shared
> >> page) aware.
> > 
> > I was going to propose for this work to be arch-neutral. However, it is
> > true that with the existing x86 software and hardware ecosystem, it
> > wouldn't be much use there. Given that the work is technically common
> > though, I don't see any downsides on enabling it on x86 on the off
> > chance that somebody will find it useful. However, if you prefer to
> > keep it ARM only, that's fine by me too.
> 
> I don't have a preference either way, but if you do it in an arch-neutral
> way, then the manifestation of the frame numbers also needs to be
> arch-neutral, in which case DT is not a suitable vehicle.

Makes sense.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-15 16:46     ` Wei Liu
@ 2017-05-18 18:09       ` Stefano Stabellini
  2017-05-19  6:59         ` Jan Beulich
  2017-05-19  9:33         ` Wei Liu
  0 siblings, 2 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-18 18:09 UTC (permalink / raw)
  To: Wei Liu
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Ian Jackson, Zhongze Liu


On Mon, 15 May 2017, Wei Liu wrote:
> On Sat, May 13, 2017 at 10:28:27AM +0800, Zhongze Liu wrote:
> > 2017-05-13 1:51 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> > > Hi Zhongze
> > >
> > > This is a nice write-up. Some comments below. Feel free to disagree with
> > > what I say below, this is more a discussion than picking on your design
> > > or plan.
> > >
> > 
> > HI, Wei Liu
> > 
> > Thanks for your time reading through my proposal.
> > 
> > >
> > > On Sat, May 13, 2017 at 01:01:39AM +0800, Zhongze Liu wrote:
> > >> Hi, Xen developers,
> > >>
> > >> I'm Zhongze Liu, a GSoC student of this year. Glad to meet you in the
> > >> Xen Project.  As an initial step to implementing my GSoC proposal, which
> > >> is still a draft,  I'm posting it here. And hope to hear from you your
> > >> suggestions.
> > >>
> > >> ====================================================
> > >> 1. Motivation and Description
> > >> ====================================================
> > >> Virtual machines use grant table hypercalls to setup a share page for
> > >> inter-VMs communications. These hypercalls are used by all PV
> > >> protocols today. However, very simple guests, such as baremetal
> > >> applications, might not have the infrastructure to handle the grant table.
> > >> This project is about setting up several shared memory areas for inter-VMs
> > >> communications directly from the VM config file.
> > >> So that the guest kernel doesn't have to have grant table support to be
> > >> able to communicate with other guests.
> > >>
> > >> ====================================================
> > >> 2. Implementation Plan:
> > >> ====================================================
> > >>
> > >> ======================================
> > >> 2.1 Introduce a new VM config option in xl:
> > >> ======================================
> > >> The shared areas should be shareable among several VMs,
> > >> every shared physical memory area is assigned to a set of VMs.
> > >> Therefore, a “token” or “identifier” should be used here to uniquely
> > >> identify a backing memory area.
> > >>
> > >>
> > >> I would suggest using an unsigned integer to serve as the identifier.
> > >> For example:
> > >>
> > >> In xl config file of vm1:
> > >>
> > >>     static_shared_mem = [“addr_range1= ID1”, “addr_range2 = ID2”]
> > >>
> > >> In xl config file of vm2:
> > >>
> > >>     static_shared_mem = [“addr_range3 = ID1”]
> > >>
> > >> In xl config file of vm3:
> > >>
> > >>     static_shared_mem = [“addr_range4 = ID2”]
> > >
> > > I can envisage you need some more attributes: what about the attributes
> > > like RW / RO / WO (or even X)?
> > >
> > > Also, I assume the granularity of the mapping is a page, but as far as I
> > > can tell there are two page granularity on ARM, you do need to consider
> > > both and what should happen if you mix and match them. What about
> > > mapping several pages and different VM use overlapping ranges?
> > >
> > > Can you give some concrete examples? What does addr_rangeX look like in
> > > practice?
> > >
> > >
> > 
> > Yes, those attributes are necessary and should be explicitly specified in the
> > config file. I'll add them in the next version of this proposal. And taking the
> > granularity into consideration, what do you say if we change the entries into
> > something like:
> > 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
> 
> I realised I may have gone too far after reading your reply.
> 
> What is the end purpose of this project? If you only want to insert a
> mfn into guest address space and don't care how the guest is going to
> map it, you can omit the prot= part. If you want stricter control, you
> will need them -- and that would also have implications on the
> hypervisor code you need.
> 
> I suggest you write the manual for the new mechanism you propose first.
> That way you describe the feature in a sysadmin-friendly way.  Describe
> the syntax, the effect of the new mechanism and how people are supposed
> to use it under what circumstances.

The memory sharing mechanism should enable guests to communicate with
each other using a shared ring. That implies that the memory needs to be
read-write, but I can imagine there are use cases for it to be read-only
too. I think it is a good idea to specify it.

However, I do not think we should ask Zhongze to write a protocol
specification for how these guests should communicate. That is out of
scope.


> > >> In the example above. A memory area A1 will be shared between
> > >> vm1 and vm2 -- vm1 can access this area using addr_range1
> > >> and vm2 using addr_range3. Likewise, a memory area A2 will be
> > >> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
> > >> and vm3 using addr_range4.
> > >>
> > >> The shared memory area denoted by an identifier IDx will be
> > >> allocated when it first appear, and the memory pages will be taken from
> > >> the first VM whose static_shared_mem list contains IDx. Take the above
> > >> config files for example, if we instantiate vm1, vm2 and vm3, one after
> > >> another, the memory areas denoted by ID1 and ID2 will both be allocated
> > >> in and taken from vm1.
> > >
> > > Hmm... I can see some potential hazards. Currently, multiple xl processes
> > > are serialized by a lock, and your assumption is the creation is done in
> > > order, but suppose sometime later they can run in parallel. When you
> > > have several "xl create" and they race with each other, what will
> > > happen?
> > >
> > > This can be solved by serializing in libxl or hypervisor, I think.
> > > It is up to you to choose where to do it.
> > >
> > > Also, please consider what happens when you destroy the owner domain
> > > before the rest. Proper reference counting should be done in the
> > > hypervisor.
> > >
> > 
> > Yes, the access to xenstore and other shared data should be serialized
> > using some kind of lock.
> > 
> > >
> > >>
> > >> ======================================
> > >> 2.2 Store the mem-sharing information in xenstore
> > >> ======================================
> > >> This information should include the length and owner of the area. And
> > >> it should also include information about where the backing memory areas
> > >> are mapped in every VM that are using it. This information should be
> > >> known to the xl command and all domains, so we utilize xenstore to keep
> > >> this information. A current plan is to place the information under
> > >> /local/shared_mem/ID. Still take the above config files as an example:
> > >>
> > >> If we instantiate vm1, vm2 and vm3, one after another,
> > >> “xenstore ls -f” should output something like this:
> > >>
> > >>
> > >> After VM1 was instantiated, the output of “xenstore ls -f”
> > >> will be something like this:
> > >>
> > >>     /local/shared_mem/ID1/owner = dom_id_of_vm1
> > >>
> > >>     /local/shared_mem/ID1/size = sizeof_addr_range1
> > >>
> > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
> > >>
> > >>
> > >>     /local/shared_mem/ID2/owner = dom_id_of_vm1
> > >>
> > >>     /local/shared_mem/ID2/size = sizeof_addr_range1
> > >>
> > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
> > >>
> > >>
> > >> After VM2 was instantiated, the following new lines will appear:
> > >>
> > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
> > >>
> > >>
> > >> After VM2 was instantiated, the following new lines will appear:
> > >>
> > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
> > >>
> > >> When we encounter an id IDx during "xl create":
> > >>
> > >>   + If it’s not under /local/shared_mem, create the corresponding entries
> > >>      (owner, size, and mappings) in xenstore, and allocate the memory from
> > >>      the newly created domain.
> > >>
> > >>   + If it’s found under /local/shared_mem, map the pages to the newly
> > >>       created domain, and add the current domain to
> > >>       /local/shared_mem/IDx/mappings.
> > >>
> > >
> > > Again, please think about destruction as well.
> > >
> > > At this point I think modelling after POSIX shared memory makes more
> > > sense. That is, there isn't one "owner" for the memory. You get hold of
> > > the shared memory via a key (ID in your case?).
> > >
> > 
> > Actually, I've thought about the same question and have discussed this with
> > Julien and Stefano. And this what they told me:
> > 
> > Stefano wrote:
> > "I think that in your scenario Xen (the hypervisor) wouldn't allow the
> > first domain to be completely destroyed because it knows that its
> > memory is still in use by something else in the system. The domain
> > remains in a zombie state until the memory is not used anymore. We need
> > to double-check this, but I don't think it will be a problem."
> > 
> 
> This has security implications -- a rogue guest can prevent the
> destruction of the owner.

We are going to use the same underlying hypervisor infrastructure, so
the end result should be no different from sharing memory via the grant
table from a security perspective. If not, then we need to fix Xen.


> > and Julien wrote:
> > "That's correct. A domain will not be destroyed until all the memory
> > associated to it will be freed.
> > A page will be considered free when all the reference on it will be
> > removed. This means that if the domain who allocated the page die, it
> > will not be fully destroyed until the page is not used by another
> > domain.
> > This is assuming that every domain using the page is taking a
> > reference (similar to foreign mapping). Actually, I think we might be
> > able to re-use the mapspace XENMAPSPACE_gmfn_foreign.
> > Actually, I think we can re-use the same mechanism as foreign mapping (see
> > Note that Xen on ARM (and x86?) does not take reference when mapping a
> > page to a stage-2 page table (e.g the page table holding the
> > translation between a guest physical address and host physical
> > address)."
> > 
> > I've also thought about modeling after the POSIX way of sharing memory.
> > If we do so, the owner of the shared pages should be Dom0, and we
> > will have to do the reference counting ourselves, and free pages when they're
> > no longer needed. I'm not sure which method is better. What do you say?
> > 
> 
> Assigning the page to Dom0 doesn't sound right to me either.
> 
> But the first step should really be defining the scope of the project.
> Technical details will follow naturally.

I thought that Zhongze wrote it well in "Motivation and Description".
What would you like to know in addition to that? 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-18 18:09       ` Stefano Stabellini
@ 2017-05-19  6:59         ` Jan Beulich
  2017-05-19  9:33         ` Wei Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2017-05-19  6:59 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Jackson, Julien Grall, Wei Liu, Zhongze Liu, xen-devel

>>> On 18.05.17 at 20:09, <sstabellini@kernel.org> wrote:
> On Mon, 15 May 2017, Wei Liu wrote:
>> On Sat, May 13, 2017 at 10:28:27AM +0800, Zhongze Liu wrote:
>> > Stefano wrote:
>> > "I think that in your scenario Xen (the hypervisor) wouldn't allow the
>> > first domain to be completely destroyed because it knows that its
>> > memory is still in use by something else in the system. The domain
>> > remains in a zombie state until the memory is not used anymore. We need
>> > to double-check this, but I don't think it will be a problem."
>> > 
>> 
>> This has security implications -- a rogue guest can prevent the
>> destruction of the owner.
> 
> We are going to use the same underlying hypervisor infrastructure, the
> end result should be no different than sharing memory via grant table
> from a security perspective. If not, then we need to fix Xen.

Yes and no. Improper use of grant table interfaces can lead to
this problem too. There the requirement is that all memory is
always owned (and granted foreign access to) by the frontend
drivers. I.e. there's a certain level of trust that backends behave
themselves. Similarly, page ownership and direction of trust need
to be considered (and perhaps written down) here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-18 18:09       ` Stefano Stabellini
  2017-05-19  6:59         ` Jan Beulich
@ 2017-05-19  9:33         ` Wei Liu
  2017-05-19 17:14           ` Zhongze Liu
  1 sibling, 1 reply; 31+ messages in thread
From: Wei Liu @ 2017-05-19  9:33 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Jackson, xen-devel, Julien Grall, Wei Liu, Zhongze Liu

On Thu, May 18, 2017 at 11:09:40AM -0700, Stefano Stabellini wrote:
> > > 
> > > Yes, those attributes are necessary and should be explicitly specified in the
> > > config file. I'll add them in the next version of this proposal. And taking the
> > > granularity into consideration, what do you say if we change the entries into
> > > something like:
> > > 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
> > 
> > I realised I may have gone too far after reading your reply.
> > 
> > What is the end purpose of this project? If you only want to insert a
> > mfn into guest address space and don't care how the guest is going to
> > map it, you can omit the prot= part. If you want stricter control, you
> > will need them -- and that would also have implications on the
> > hypervisor code you need.
> > 
> > I suggest you write the manual for the new mechanism you propose first.
> > That way you describe the feature in a sysadmin-friendly way.  Describe
> > the syntax, the effect of the new mechanism and how people are supposed
> > to use it under what circumstances.
> 
> The memory sharing mechanism should enable guests to communicate with
> each other using a shared ring. That implies that the memory needs to be
> read-write, but I can imagine there are use cases for it to be read-only
> too. I think it is a good idea to specify it.
> 
> However, I do not think we should ask Zhongze to write a protocol
> specification for how these guests should communicate. That is out of
> scope.

That's right. This is out of scope. I didn't mean to ask Zhongze to
write a protocol specification.

> 
> 
> > > >> In the example above. A memory area A1 will be shared between
> > > >> vm1 and vm2 -- vm1 can access this area using addr_range1
> > > >> and vm2 using addr_range3. Likewise, a memory area A2 will be
> > > >> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
> > > >> and vm3 using addr_range4.
> > > >>
> > > >> The shared memory area denoted by an identifier IDx will be
> > > >> allocated when it first appear, and the memory pages will be taken from
> > > >> the first VM whose static_shared_mem list contains IDx. Take the above
> > > >> config files for example, if we instantiate vm1, vm2 and vm3, one after
> > > >> another, the memory areas denoted by ID1 and ID2 will both be allocated
> > > >> in and taken from vm1.
> > > >
> > > > Hmm... I can see some potential hazards. Currently, multiple xl processes
> > > > are serialized by a lock, and your assumption is the creation is done in
> > > > order, but suppose sometime later they can run in parallel. When you
> > > > have several "xl create" and they race with each other, what will
> > > > happen?
> > > >
> > > > This can be solved by serializing in libxl or hypervisor, I think.
> > > > It is up to you to choose where to do it.
> > > >
> > > > Also, please consider what happens when you destroy the owner domain
> > > > before the rest. Proper reference counting should be done in the
> > > > hypervisor.
> > > >
> > > 
> > > Yes, the access to xenstore and other shared data should be serialized
> > > using some kind of lock.
> > > 
> > > >
> > > >>
> > > >> ======================================
> > > >> 2.2 Store the mem-sharing information in xenstore
> > > >> ======================================
> > > >> This information should include the length and owner of the area. And
> > > >> it should also include information about where the backing memory areas
> > > >> are mapped in every VM that are using it. This information should be
> > > >> known to the xl command and all domains, so we utilize xenstore to keep
> > > >> this information. A current plan is to place the information under
> > > >> /local/shared_mem/ID. Still take the above config files as an example:
> > > >>
> > > >> If we instantiate vm1, vm2 and vm3, one after another,
> > > >> “xenstore ls -f” should output something like this:
> > > >>
> > > >>
> > > >> After VM1 was instantiated, the output of “xenstore ls -f”
> > > >> will be something like this:
> > > >>
> > > >>     /local/shared_mem/ID1/owner = dom_id_of_vm1
> > > >>
> > > >>     /local/shared_mem/ID1/size = sizeof_addr_range1
> > > >>
> > > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
> > > >>
> > > >>
> > > >>     /local/shared_mem/ID2/owner = dom_id_of_vm1
> > > >>
> > > >>     /local/shared_mem/ID2/size = sizeof_addr_range1
> > > >>
> > > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
> > > >>
> > > >>
> > > >> After VM2 was instantiated, the following new lines will appear:
> > > >>
> > > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
> > > >>
> > > >>
> > > >> After VM2 was instantiated, the following new lines will appear:
> > > >>
> > > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
> > > >>
> > > >> When we encounter an id IDx during "xl create":
> > > >>
> > > >>   + If it’s not under /local/shared_mem, create the corresponding entries
> > > >>      (owner, size, and mappings) in xenstore, and allocate the memory from
> > > >>      the newly created domain.
> > > >>
> > > >>   + If it’s found under /local/shared_mem, map the pages to the newly
> > > >>       created domain, and add the current domain to
> > > >>       /local/shared_mem/IDx/mappings.
> > > >>
> > > >
> > > > Again, please think about destruction as well.
> > > >
> > > > At this point I think modelling after POSIX shared memory makes more
> > > > sense. That is, there isn't one "owner" for the memory. You get hold of
> > > > the shared memory via a key (ID in your case?).
> > > >
> > > 
> > > Actually, I've thought about the same question and have discussed this with
> > > Julien and Stefano. And this what they told me:
> > > 
> > > Stefano wrote:
> > > "I think that in your scenario Xen (the hypervisor) wouldn't allow the
> > > first domain to be completely destroyed because it knows that its
> > > memory is still in use by something else in the system. The domain
> > > remains in a zombie state until the memory is not used anymore. We need
> > > to double-check this, but I don't think it will be a problem."
> > > 
> > 
> > This has security implications -- a rogue guest can prevent the
> > destruction of the owner.
> 
> We are going to use the same underlying hypervisor infrastructure, the
> end result should be no different than sharing memory via grant table
> from a security perspective. If not, then we need to fix Xen.
> 

There is a certain level of trust in the frontend / backend model. The
frontend needs to trust the backend to a certain degree. A user knows
what to expect or do if one side misbehaves.

But the way this proposal is phrased is that this is to construct a
communication channel, i.e. it reads to me as if, from a user's
perspective, you don't give one guest more trust than the other. This
needs clarifying.

> 
> > > and Julien wrote:
> > > "That's correct. A domain will not be destroyed until all the memory
> > > associated to it will be freed.
> > > A page will be considered free when all the reference on it will be
> > > removed. This means that if the domain who allocated the page die, it
> > > will not be fully destroyed until the page is not used by another
> > > domain.
> > > This is assuming that every domain using the page is taking a
> > > reference (similar to foreign mapping). Actually, I think we might be
> > > able to re-use the mapspace XENMAPSPACE_gmfn_foreign.
> > > Actually, I think we can re-use the same mechanism as foreign mapping (see
> > > Note that Xen on ARM (and x86?) does not take reference when mapping a
> > > page to a stage-2 page table (e.g the page table holding the
> > > translation between a guest physical address and host physical
> > > address)."
> > > 
> > > I've also thought about modeling after the POSIX way of sharing memory.
> > > If we do so, the owner of the shared pages should be Dom0, and we
> > > will have to do the reference counting ourselves, and free pages when they're
> > > no longer needed. I'm not sure which method is better. What do you say?
> > > 
> > 
> > Assigning the page to Dom0 doesn't sound right to me either.
> > 
> > But the first step should really be defining the scope of the project.
> > Technical details will follow naturally.
> 
> I thought that Zhongze wrote it well in "Motivation and Description".
> What would you like to know in addition to that? 

A bit more detail is needed. See above.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-16 18:16               ` Stefano Stabellini
@ 2017-05-19 16:52                 ` Zhongze Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhongze Liu @ 2017-05-19 16:52 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Ian Jackson, Julien Grall, Jan Beulich, xen-devel, nd

2017-05-17 2:16 GMT+08:00 Stefano Stabellini <sstabellini@kernel.org>:
> On Tue, 16 May 2017, Jan Beulich wrote:
>> >>> On 15.05.17 at 19:40, <sstabellini@kernel.org> wrote:
>> > On Mon, 15 May 2017, Jan Beulich wrote:
>> >> >>> On 15.05.17 at 12:21, <julien.grall@arm.com> wrote:
>> >> > On Zhongze proposal, the share page will be mapped at the a specific
>> >> > address in the guest memory. I agree this will require some work in the
>> >> > toolstack, on the hypervisor side we could re-use the foreign mapping
>> >> > API. But on the guest side there are nothing to do Xen specific.
>> >>
>> >> So what is the equivalent of the shared page on bare hardware?
>> >
>> > Bare-metal apps already have the concept of a shared page to communicate
>> > with hardware devices, co-processors and other hardware/firmare
>> > intercommunication frameworks.
>>
>> So with that, is one side of the communication here then intended to
>> emulate such a hardware device, co-processor or other hardware /
>> firmware intercommunication framework? If so, aren't we talking
>> about device emulation then? If not, how can such a bare metal app
>> know the protocol (after all, if the protocol is Xen-specific, the app
>> wouldn't be Xen-unaware anymore)?
>
> They would have to come up with a protocol. However, they already have
> code to deal with shared rings in their baremetal apps. It's not hard
> for them to do so, it is not hypervisor specific, and it is similar to
> the way they are used to work already. On the other end, they lack the
> code to deal with hypercalls, event channels and grant tables. In fact,
> they don't have Xen support.
>
>
>> >> > What's the benefit? Baremetal guest are usually tiny, you could use the
>> >> > device-tree (and hence generic way) to present the share page for
>> >> > communicating. This means no Xen PV drivers, and therefore easier to
>> >> > move an OS in Xen VM.
>> >>
>> >> Is this intended to be an ARM-specific extension, or a generic one?
>> >> There's no DT on x86 to pass such information, and I can't easily
>> >> see alternatives there. Also the consumer of the shared page info
>> >> is still a PV component of the guest. You simply can't have an
>> >> entirely unmodified guest which at the same time is Xen (or
>> >> whatever other component sits at the other end of the shared
>> >> page) aware.
>> >
>> > I was going to propose for this work to be arch-neutral. However, it is
>> > true that with the existing x86 software and hardware ecosystem, it
>> > wouldn't be much use there. Given that the work is technically common
>> > though, I don't see any downsides on enabling it on x86 on the off
>> > chance that somebody will find it useful. However, if you prefer to
>> > keep it ARM only, that's fine by me too.
>>
>> I don't have a preference either way, but if you do it in an arch-neutral
>> way, then the manifestation of the frame numbers also needs to be
>> arch-neutral, in which case DT is not a suitable vehicle.
>
> Makes sense.
>

I agree with this. I'll take it into consideration in the next version of
this proposal.

Cheers,

Zhongze Liu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-19  9:33         ` Wei Liu
@ 2017-05-19 17:14           ` Zhongze Liu
  2017-05-19 17:46             ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: Zhongze Liu @ 2017-05-19 17:14 UTC (permalink / raw)
  To: Wei Liu; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Ian Jackson

Hi, Wei,

2017-05-19 17:33 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> On Thu, May 18, 2017 at 11:09:40AM -0700, Stefano Stabellini wrote:
>> > >
>> > > Yes, those attributes are necessary and should be explicitly specified in the
>> > > config file. I'll add them in the next version of this proposal. And taking the
>> > > granularity into consideration, what do you say if we change the entries into
>> > > something like:
>> > > 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
>> >
>> > I realised I may have gone too far after reading your reply.
>> >
>> > What is the end purpose of this project? If you only want to insert a
>> > mfn into guest address space and don't care how the guest is going to
>> > map it, you can omit the prot= part. If you want stricter control, you
>> > will need them -- and that would also have implications on the
>> > hypervisor code you need.
>> >
>> > I suggest you write the manual for the new mechanism you propose first.
>> > That way you describe the feature in a sysadmin-friendly way.  Describe
>> > the syntax, the effect of the new mechanism and how people are supposed
>> > to use it under what circumstances.
>>
>> The memory sharing mechanism should enable guests to communicate with
>> each other using a shared ring. That implies that the memory needs to be
>> read-write, but I can imagine there are use cases for it to be read-only
>> too. I think it is a good idea to specify it.
>>
>> However, I do not think we should ask Zhongze to write a protocol
>> specification for how these guests should communicate. That is out of
>> scope.
>
> That's right. This is out of scope. I didn't mean to ask Zhongze to
> write a protocol specification.
>


Yes, describing the syntax and the effect of the new mechanism, as well as
example use cases, will be of great importance. I'm now waiting for Stefano
to get some real-life customer use cases.

>
>>
>>
>> > > >> In the example above. A memory area A1 will be shared between
>> > > >> vm1 and vm2 -- vm1 can access this area using addr_range1
>> > > >> and vm2 using addr_range3. Likewise, a memory area A2 will be
>> > > >> shared between vm1 and vm3 -- vm1 can access A2 using addr_range2
>> > > >> and vm3 using addr_range4.
>> > > >>
>> > > >> The shared memory area denoted by an identifier IDx will be
>> > > >> allocated when it first appear, and the memory pages will be taken from
>> > > >> the first VM whose static_shared_mem list contains IDx. Take the above
>> > > >> config files for example, if we instantiate vm1, vm2 and vm3, one after
>> > > >> another, the memory areas denoted by ID1 and ID2 will both be allocated
>> > > >> in and taken from vm1.
>> > > >
>> > > > Hmm... I can see some potential hazards. Currently, multiple xl processes
>> > > > are serialized by a lock, and your assumption is the creation is done in
>> > > > order, but suppose sometime later they can run in parallel. When you
>> > > > have several "xl create" and they race with each other, what will
>> > > > happen?
>> > > >
>> > > > This can be solved by serializing in libxl or hypervisor, I think.
>> > > > It is up to you to choose where to do it.
>> > > >
>> > > > Also, please consider what happens when you destroy the owner domain
>> > > > before the rest. Proper reference counting should be done in the
>> > > > hypervisor.
>> > > >
>> > >
>> > > Yes, the access to xenstore and other shared data should be serialized
>> > > using some kind of lock.
>> > >
>> > > >
>> > > >>
>> > > >> ======================================
>> > > >> 2.2 Store the mem-sharing information in xenstore
>> > > >> ======================================
>> > > >> This information should include the length and owner of the area. And
>> > > >> it should also include information about where the backing memory areas
>> > > >> are mapped in every VM that are using it. This information should be
>> > > >> known to the xl command and all domains, so we utilize xenstore to keep
>> > > >> this information. A current plan is to place the information under
>> > > >> /local/shared_mem/ID. Still take the above config files as an example:
>> > > >>
>> > > >> If we instantiate vm1, vm2 and vm3, one after another,
>> > > >> “xenstore ls -f” should output something like this:
>> > > >>
>> > > >>
>> > > >> After VM1 was instantiated, the output of “xenstore ls -f”
>> > > >> will be something like this:
>> > > >>
>> > > >>     /local/shared_mem/ID1/owner = dom_id_of_vm1
>> > > >>
>> > > >>     /local/shared_mem/ID1/size = sizeof_addr_range1
>> > > >>
>> > > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm1 = addr_range1
>> > > >>
>> > > >>
>> > > >>     /local/shared_mem/ID2/owner = dom_id_of_vm1
>> > > >>
>> > > >>     /local/shared_mem/ID2/size = sizeof_addr_range1
>> > > >>
>> > > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm1 = addr_range2
>> > > >>
>> > > >>
>> > > >> After VM2 was instantiated, the following new lines will appear:
>> > > >>
>> > > >>     /local/shared_mem/ID1/mappings/dom_id_of_vm2 = addr_range3
>> > > >>
>> > > >>
>> > > >> After VM2 was instantiated, the following new lines will appear:
>> > > >>
>> > > >>     /local/shared_mem/ID2/mappings/dom_id_of_vm2 = addr_range4
>> > > >>
>> > > >> When we encounter an id IDx during "xl create":
>> > > >>
>> > > >>   + If it’s not under /local/shared_mem, create the corresponding entries
>> > > >>      (owner, size, and mappings) in xenstore, and allocate the memory from
>> > > >>      the newly created domain.
>> > > >>
>> > > >>   + If it’s found under /local/shared_mem, map the pages to the newly
>> > > >>       created domain, and add the current domain to
>> > > >>       /local/shared_mem/IDx/mappings.
>> > > >>
>> > > >
>> > > > Again, please think about destruction as well.
>> > > >
>> > > > At this point I think modelling after POSIX shared memory makes more
>> > > > sense. That is, there isn't one "owner" for the memory. You get hold of
>> > > > the shared memory via a key (ID in your case?).
>> > > >
>> > >
>> > > Actually, I've thought about the same question and have discussed this with
>> > > Julien and Stefano. And this what they told me:
>> > >
>> > > Stefano wrote:
>> > > "I think that in your scenario Xen (the hypervisor) wouldn't allow the
>> > > first domain to be completely destroyed because it knows that its
>> > > memory is still in use by something else in the system. The domain
>> > > remains in a zombie state until the memory is not used anymore. We need
>> > > to double-check this, but I don't think it will be a problem."
>> > >
>> >
>> > This has security implications -- a rogue guest can prevent the
>> > destruction of the owner.
>>
>> We are going to use the same underlying hypervisor infrastructure, the
>> end result should be no different than sharing memory via grant table
>> from a security perspective. If not, then we need to fix Xen.
>>
>
> There is a certain level of trust in the frontend / backend model. The
> frontend needs to trust backend to a certain degree. A user knows what
> to expect or do if one side misbehaves.
>
> But the way this proposal is phrased is that this is to construct a
> communication channel, i.e. it reads to me from a user's perspective you
> don't give one guest more trust than the other. This needs clarifying.
>

IMHO, since the shared memory is statically specified in the xl config
files, VMs don't have the ability to dynamically ask for a shared page
from other VMs through this new mechanism. The xl file is written by the
sys admin, so I think it's the administrator's duty to make it right.

>
>>
>> > > and Julien wrote:
>> > > "That's correct. A domain will not be destroyed until all the memory
>> > > associated to it will be freed.
>> > > A page will be considered free when all the reference on it will be
>> > > removed. This means that if the domain who allocated the page die, it
>> > > will not be fully destroyed until the page is not used by another
>> > > domain.
>> > > This is assuming that every domain using the page is taking a
>> > > reference (similar to foreign mapping). Actually, I think we might be
>> > > able to re-use the mapspace XENMAPSPACE_gmfn_foreign.
>> > > Actually, I think we can re-use the same mechanism as foreign mapping (see
>> > > Note that Xen on ARM (and x86?) does not take reference when mapping a
>> > > page to a stage-2 page table (e.g the page table holding the
>> > > translation between a guest physical address and host physical
>> > > address)."
>> > >
>> > > I've also thought about modeling after the POSIX way of sharing memory.
>> > > If we do so, the owner of the shared pages should be Dom0, and we
>> > > will have to do the reference counting ourselves, and free pages when they're
>> > > no longer needed. I'm not sure which method is better. What do you say?
>> > >
>> >
>> > Assigning the page to Dom0 doesn't sound right to me either.
>

Then do you have any ideas here?

>
>> >
>> > But the first step should really be defining the scope of the project.
>> > Technical details will follow naturally.
>>
>> I thought that Zhongze wrote it well in "Motivation and Description".
>> What would you like to know in addition to that?
>
> A bit more details are needed. See above.

I'll try to take all the good points in the comments into consideration.
Thank you.

Cheers,

Zhongze Liu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-19 17:14           ` Zhongze Liu
@ 2017-05-19 17:46             ` Stefano Stabellini
  2017-05-19 18:19               ` Jarvis Roach
  2017-05-22 13:49               ` Ian Jackson
  0 siblings, 2 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-19 17:46 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: Stefano Stabellini, Wei Liu, Ian Jackson, edgari, Julien Grall,
	xen-devel, Jarvis.Roach

On Sat, 20 May 2017, Zhongze Liu wrote:
> Hi, Wei,
> 
> >>>>2017-05-19 17:33 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> > On Thu, May 18, 2017 at 11:09:40AM -0700, Stefano Stabellini wrote:
> >> > >
> >> > > Yes, those attributes are necessary and should be explicitly specified in the
> >> > > config file. I'll add them in the next version of this proposal. And taking the
> >> > > granularity into consideration, what do you say if we change the entries into
> >> > > something like:
> >> > > 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
> >> >
> >> > I realised I may have gone too far after reading your reply.
> >> >
> >> > What is the end purpose of this project? If you only want to insert a
> >> > mfn into guest address space and don't care how the guest is going to
> >> > map it, you can omit the prot= part. If you want stricter control, you
> >> > will need them -- and that would also have implications on the
> >> > hypervisor code you need.
> >> >
> >> > I suggest you write the manual for the new mechanism you propose first.
> >> > That way you describe the feature in a sysadmin-friendly way.  Describe
> >> > the syntax, the effect of the new mechanism and how people are supposed
> >> > to use it under what circumstances.
> >>
> >> The memory sharing mechanism should enable guests to communicate with
> >> each other using a shared ring. That implies that the memory needs to be
> >> read-write, but I can imagine there are use cases for it to be read-only
> >> too. I think it is a good idea to specify it.
> >>
> >> However, I do not think we should ask Zhongze to write a protocol
> >> specification for how these guests should communicate. That is out of
> >> scope.
> >
> > That's right. This is out of scope. I didn't mean to ask Zhongze to
> > write a protocol specification.
> >
> 
> Yes, describing the syntax and the effect of the new mechanism, as well as
> example usecases, will be of great importance. I'm now waiting for Stefano to
> get some real llfe customer usecases.

[...]

> > There is a certain level of trust in the frontend / backend model. The
> > frontend needs to trust backend to a certain degree. A user knows what
> > to expect or do if one side misbehaves.
> >
> > But the way this proposal is phrased is that this is to construct a
> > communication channel, i.e. it reads to me from a user's perspective you
> > don't give one guest more trust than the other. This needs clarifying.
> >
> 
> IMHO, Since the shared memory is statically specified in the xl config files,
> VMs don't have to ability to dynamically ask for a shared page from other
> VMS through this new mechanism,
>  The xl file is written by the sys admin, so I think it's the administrators'
> duty to make it right.

While waiting for Jarvis and Edgar to provide more befitting
information, I'll try to fill in myself. There is a demand to run
"bare-metal" applications on embedded boards (such as the new Xilinx
Zynq MPSoC board). People in those communities actually call them
"bare-metal". They are written in C and run directly in kernel mode.
There is no operating system or unikernel.  They can run bare-metal or
inside a Xen VM. Usually they drive one or more devices (or FPGAs) with
extremely low latency and overhead. An example of such an application is
tbm (which is the one I used to do irq latency measurements):

https://github.com/edgarigl/tbm/app/xen/guest_irq_latency/apu.c

Let's suppose you have one bare-metal app to drive a set of FPGA
space and another one to drive another set of FPGA space (or two
different devices); you might need the two apps to exchange information.

The owner knows exactly how many apps she is going to have on the board
and who needs to communicate with whom from the start. It is a fully
static configuration.

In this scenario, she is going to write in the VM config files of the
two apps that one page will be shared between the two, so that they can
send each other messages. She will hard-code the address of the shared
page in her "bare-metal" app.
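
Purely as an illustration (the exact attribute names and syntax are still
being discussed in this thread, so treat everything below as an assumption
rather than an agreed interface), the two config files might end up
containing something along the lines of:

    In the .cfg of the first app (the one whose memory backs the page):

        static_shared_mem = [ "0x70000000-0x70000fff = ID1, prot=RW" ]

    In the .cfg of the second app:

        static_shared_mem = [ "0x80000000-0x80000fff = ID1, prot=RW" ]

Each app would then hard-code its own guest physical address (0x70000000
and 0x80000000 respectively in this sketch) together with whatever message
layout it has agreed on with the other side.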

There is no frontend and backend (as in the usual Xen meaning). In some
configurations one app might be more critical than the other, but in
some other scenarios they might have the same criticality.

If, as Jan pointed out, we need to call out explicitly which is the
frontend and which is the backend for page ownership reasons, then I
suggest we expose that configuration to the user, so that she can
choose.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-19 17:46             ` Stefano Stabellini
@ 2017-05-19 18:19               ` Jarvis Roach
  2017-05-19 19:08                 ` Stefano Stabellini
  2017-05-22 13:49               ` Ian Jackson
  1 sibling, 1 reply; 31+ messages in thread
From: Jarvis Roach @ 2017-05-19 18:19 UTC (permalink / raw)
  To: Stefano Stabellini, Zhongze Liu
  Cc: edgari, xen-devel, Julien Grall, Wei Liu, Ian Jackson

> -----Original Message-----
> From: Stefano Stabellini [mailto:sstabellini@kernel.org]
> Sent: Friday, May 19, 2017 1:47 PM
> To: Zhongze Liu <blackskygg@gmail.com>
> Cc: Wei Liu <wei.liu2@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel@lists.xenproject.org; Julien Grall
> <julien.grall@arm.com>; Ian Jackson <ian.jackson@eu.citrix.com>; Jarvis
> Roach <Jarvis.Roach@dornerworks.com>; edgari@xilinx.com
> Subject: Re: Proposal to allow setting up shared memory areas between VMs
> from xl config file
> 
> On Sat, 20 May 2017, Zhongze Liu wrote:
> > Hi, Wei,
> >
> > >>>>2017-05-19 17:33 GMT+08:00 Wei Liu <wei.liu2@citrix.com>:
> > > On Thu, May 18, 2017 at 11:09:40AM -0700, Stefano Stabellini wrote:
> > >> > >
> > >> > > Yes, those attributes are necessary and should be explicitly
> > >> > > specified in the config file. I'll add them in the next version
> > >> > > of this proposal. And taking the granularity into
> > >> > > consideration, what do you say if we change the entries into
> something like:
> > >> > > 'start=0xcafebabe, end=0xdeedbeef, granularity=4K, prot=RWX'.
> > >> >
> > >> > I realised I may have gone too far after reading your reply.
> > >> >
> > >> > What is the end purpose of this project? If you only want to
> > >> > insert a mfn into guest address space and don't care how the
> > >> > guest is going to map it, you can omit the prot= part. If you
> > >> > want stricter control, you will need them -- and that would also
> > >> > have implications on the hypervisor code you need.
> > >> >
> > >> > I suggest you write the manual for the new mechanism you propose
> first.
> > >> > That way you describe the feature in a sysadmin-friendly way.
> > >> > Describe the syntax, the effect of the new mechanism and how
> > >> > people are supposed to use it under what circumstances.
> > >>
> > >> The memory sharing mechanism should enable guests to communicate
> > >> with each other using a shared ring. That implies that the memory
> > >> needs to be read-write, but I can imagine there are use cases for
> > >> it to be read-only too. I think it is a good idea to specify it.

We've been seeing increased need for controls over other memory attributes (cacheability, shareability) in addition to access.


> > >> However, I do not think we should ask Zhongze to write a protocol
> > >> specification for how these guests should communicate. That is out
> > >> of scope.
> > >
> > > That's right. This is out of scope. I didn't mean to ask Zhongze to
> > > write a protocol specification.
> > >
> >
> > Yes, describing the syntax and the effect of the new mechanism, as
> > well as example usecases, will be of great importance. I'm now waiting
> > for Stefano to get some real llfe customer usecases.
> 
> [...]
> 
> > > There is a certain level of trust in the frontend / backend model.
> > > The frontend needs to trust backend to a certain degree. A user
> > > knows what to expect or do if one side misbehaves.
> > >
> > > But the way this proposal is phrased is that this is to construct a
> > > communication channel, i.e. it reads to me from a user's perspective
> > > you don't give one guest more trust than the other. This needs clarifying.
> > >
> >
> > IMHO, Since the shared memory is statically specified in the xl config
> > files, VMs don't have to ability to dynamically ask for a shared page
> > from other VMS through this new mechanism,  The xl file is written by
> > the sys admin, so I think it's the administrators'
> > duty to make it right.
> 

I think so too. From my embedded aviation experience, instead of a "sys admin" you have a "system integrator", an entity responsible for configuring the system correctly. For Xen-ARM, I'd see that covering the system device-tree and guest .cfg (including any partial DTs).

> While waiting for Jarvis and Edgar to provide more befitting information, I'll
> try to fill in myself. There is a demand to run "bare-metal" applications on
> embedded boards (such as the new Xilinx Zynq MPSoC board). People in
> those communities actually call them "bare-metal". They are written in C and
> run directly in kernel mode.
> There is no operating system or unikernel.  They can run bare-metal or inside
> a Xen VM. Usually they drive one or more devices (or FPGAs) with extremely
> low latency and overhead. An example of such an application is tbm (which is
> the one I used to do irq latency measurements):
> 
> https://github.com/edgarigl/tbm/app/xen/guest_irq_latency/apu.c
> 
> Let's suppose you have one bare-metal app to drive a set of FPGA space
> and another one to drive another set of FPGA space (or two different
> devices), you might need the two apps to exchange information.
> 
> The owner knows exactly how many apps she is going to have on the board
> and who needs to communicate with whom from the start. It is a fully static
> configuration.
> 
> In this scenario, she is going to write to the VM config files of the two apps
> that one page will be shared among the two, so that they can send each
> other messages. She will hard-code the address of the shared page in her
> "bare-metal" app.
> 
> There is no frontend and backend (as in the usual Xen meaning). In some
> configurations one app might be more critical than the other, but in some
> other scenarios they might have the same criticality.
> 
> If, as Jan pointed out, we need to call out explicitly which is the frontend and
> which is the backend for page ownership reasons, then I suggested we
> expose that configuration to the user, so that she can choose.

Thank you Stefano, that is a good representation of the need. 

I am not familiar with what all page ownership entails; if someone knows of a good link, please send it to me privately so I can educate myself. :)

Because of my ignorance the following may fly in the face of some design paradigm I'm not aware of, but I would expect to see these shared memory regions treated as a system resource, analogous to physical I/O peripherals, defined so that Xen and Dom0 don't try to use them, and allocated to guests as part of a static configuration under control of a system integrator. Something would be needed in the system DT to "carve" these regions out of the rest of memory, and then something in the guest's .cfg that gives the guest access to the location. We can already do that, albeit with some limitations, using the "iomem" attribute in the .cfg and "uio" nodes in the system device tree. I mentioned this elsewhere, but if we just had greater control over the memory attributes via the "iomem" attribute in the .cfg, I think we'd have most of the functionality we'd need.
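
As a rough sketch of what I mean (the address, size and node/option values below are made up, and I haven't checked every binding detail), the existing route would look something like:

    /* system device tree: carve a region out so Xen and Dom0 leave it alone */
    reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        vm_shm: vm-shared@60000000 {
            reg = <0x0 0x60000000 0x0 0x200000>;   /* 2 MB carve-out */
            no-map;
        };
    };

    # guest .cfg: map the carve-out into the guest with the existing "iomem"
    # option (machine pfn 0x60000, 0x200 pages, at guest pfn 0x80000)
    iomem = [ "60000,200@80000" ]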




* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-19 18:19               ` Jarvis Roach
@ 2017-05-19 19:08                 ` Stefano Stabellini
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-19 19:08 UTC (permalink / raw)
  To: Jarvis Roach
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, Ian Jackson, edgari,
	Julien Grall, xen-devel

On Fri, 19 May 2017, Jarvis Roach wrote:
> > While waiting for Jarvis and Edgar to provide more befitting information, I'll
> > try to fill in myself. There is a demand to run "bare-metal" applications on
> > embedded boards (such as the new Xilinx Zynq MPSoC board). People in
> > those communities actually call them "bare-metal". They are written in C and
> > run directly in kernel mode.
> > There is no operating system or unikernel.  They can run bare-metal or inside
> > a Xen VM. Usually they drive one or more devices (or FPGAs) with extremely
> > low latency and overhead. An example of such an application is tbm (which is
> > the one I used to do irq latency measurements):
> > 
> > https://github.com/edgarigl/tbm/app/xen/guest_irq_latency/apu.c
> > 
> > Let's suppose they you have one bare-metal app to drive a set of FPGA space
> > and another one to drive another set of FPGA space (or two different
> > devices), you might need the two apps to exchange information.
> > 
> > The owner knows exactly how many apps she is going to have on the board
> > and who needs to communicate with whom from the start. It is a fully static
> > configuration.
> > 
> > In this scenario, she is going to write to the VM config files of the two apps
> > that one page will be shared among the two, so that they can send each
> > other messages. She will hard-code the address of the shared page in her
> > "bare-metal" app.
> > 
> > There is no frontend and backend (as in the usual Xen meaning). In some
> > configurations one app might be more critical than the other, but in some
> > other scenarios they might have the same criticality.
> > 
> > If, as Jan pointed out, we need to call out explicitly which is the frontend and
> > which is the backend for page ownership reasons, then I suggested we
> > expose that configuration to the user, so that she can choose.
> 
> Thank you Stefano, that is a good representation of the need. 
> 
> I am not familiar with what all page ownership entails, if someone knows of a good link please send to me privately so I can educate myself. :)
> 
> Because of my ignorance the following may fly in the face of some design paradigm I'm not aware of, but I would expect to see these shared memory regions treated as a system resource, analogous to physical I/O periphs, defined so that Xen and Dom0 don't try to use them, and allocated to guests as part of a static configuration under control of a system integrator. Something would be needed in the system DT  to "carve" these regions out of the rest of memory, and then something in the guest's .cfg that gives guest access to the location. We can already do that, albeit with some limitations, using the "iomem" attribute in the .cfg and "uio" nodes in the system device tree. I mentioned this elsewhere, but if we just had greater control over the memory attributes via the "iomem" attribute in the .cfg I think we'd have most of the functionality we'd need.

As memory is allocated on demand, there is no need to carve out the
regions from the device tree. Using dom0_mem (which is required on ARM),
there is always unused memory on the system which can be used for this
purpose (unless you specifically want to share some special memory
regions, not RAM).
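
For example, on a board with 2GB of RAM, booting Xen with something like

    dom0_mem=1024M

on its command line leaves roughly half of the RAM (minus Xen's own
footprint) free to back this kind of region.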


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-19 17:46             ` Stefano Stabellini
  2017-05-19 18:19               ` Jarvis Roach
@ 2017-05-22 13:49               ` Ian Jackson
  2017-05-22 21:14                 ` Stefano Stabellini
  1 sibling, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2017-05-22 13:49 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Zhongze Liu, Julien Grall, edgari, xen-devel, Jarvis.Roach

Stefano Stabellini writes ("Re: Proposal to allow setting up shared memory areas between VMs from xl config file"):
> In this scenario, she is going to write to the VM config files of the
> two apps that one page will be shared among the two, so that they can
> send each other messages. She will hard-code the address of the shared
> page in her "bare-metal" app.

Thanks.  This makes some sense.

How do these apps expect to interrupt each other, or do they poll in
the shared memory ?  What I'm getting at with this question is that
perhaps some event channels will need setting up.

> There is no frontend and backend (as in the usual Xen meaning). In some
> configurations one app might be more critical than the other, but in
> some other scenarios they might have the same criticality.

Yes.

> If, as Jan pointed out, we need to call out explicitly which is the
> frontend and which is the backend for page ownership reasons, then I
> suggested we expose that configuration to the user, so that she can
> choose.

Indeed.

ISTM that this scenario doesn't depend on new hypervisor
functionality.  The toolstack could set up the appropriate page
sharing (presumably, this would be done with grants so that the result
is like something the guests could have done themselves.)

I see no objection to the libxl domain configuration file naming
guest-physical addresses for use in this way.

One problem, though, is to do with startup order.  To do this in the
most natural way, one would want to start both guests at once so that
one would know their domids etc.  (Also that avoids questions like
`what if one of them crashes'...)

I'm not sure exactly how to fit this into the libxl model, which
mostly talks about one guest domain at a time; and each guest config
talks about persistent resources, rather than resources which are
somehow exposed by a particular guest.

I think this question is worth exploring to see what shape the right
solution is.

Ian.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-22 13:49               ` Ian Jackson
@ 2017-05-22 21:14                 ` Stefano Stabellini
  2017-05-23 15:13                   ` Edgar E. Iglesias
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-22 21:14 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Stefano Stabellini, Wei Liu, Zhongze Liu, edgari, Julien Grall,
	xen-devel, Jarvis.Roach

On Mon, 22 May 2017, Ian Jackson wrote:
> Stefano Stabellini writes ("Re: Proposal to allow setting up shared memory areas between VMs from xl config file"):
> > In this scenario, she is going to write to the VM config files of the
> > two apps that one page will be shared among the two, so that they can
> > send each other messages. She will hard-code the address of the shared
> > page in her "bare-metal" app.
> 
> Thanks.  This makes some sense.
> 
> How do these apps expect to interrupt each other, or do they poll in
> the shared memory ?  What I'm getting at with this question is that
> perhaps some event channels will need setting up.

As a matter of fact, I have been asking myself the same question. Nobody
asked me explicitly for notifications support, so I didn't include it in
the original project definition (it can always be added later) but I
think it would be useful.

Edgar, Jarvis, do you have an opinion on this? Do (software) interrupts
need to be setup together with the shared memory region to send
notifications back and forth between the two guests, or are they
unnecessary because the apps do polling anyway?

Event channels are not as complex as grants, but they are not trivial
either. Guests need to support the full event channel ABI even just to
receive notifications from one event channel only (because they need to
clear the pending bits), which is not simple and increases latency to
de-multiplex events. See drivers/xen/events/events_2l.c and
drivers/xen/events/events_base.c in Linux. I think we would have to
introduce a simpler model, where each "notification channel" is not
implemented by an event channel, but by a PPI or SGI instead. We expect
only one or two to be used. PPIs and SGIs are interrupt classes on ARM,
and it is possible to allocate one or more for notification usage.
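
Just to give an idea of what even the minimal receive path looks like with
the 2-level ABI, this is roughly what a guest has to do on every upcall. It
is a simplified sketch, not a drop-in implementation: the structs mirror
only a subset of xen/include/public/xen.h, and handle_port() plus the
mapping of shared_info are left to the application.

    #include <stdint.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * 8)

    struct vcpu_info_2l {                  /* subset of struct vcpu_info */
        uint8_t       evtchn_upcall_pending;
        uint8_t       evtchn_upcall_mask;
        unsigned long evtchn_pending_sel;  /* one bit per evtchn_pending[] word */
    };

    struct shared_info_2l {                /* subset of struct shared_info */
        unsigned long evtchn_pending[BITS_PER_LONG];
        unsigned long evtchn_mask[BITS_PER_LONG];
    };

    static void handle_port(unsigned int port)
    {
        (void)port;  /* app-specific, e.g. set a flag checked by the main loop */
    }

    void evtchn_2l_poll(struct vcpu_info_2l *v, struct shared_info_2l *s)
    {
        v->evtchn_upcall_pending = 0;      /* ack the upcall first */

        /* Atomically fetch and clear the per-vCPU selector word. */
        unsigned long sel = __atomic_exchange_n(&v->evtchn_pending_sel, 0,
                                                __ATOMIC_ACQ_REL);
        while (sel) {
            unsigned int w = __builtin_ctzl(sel);
            sel &= sel - 1;                /* clear the selector bit we took */

            unsigned long bits = s->evtchn_pending[w] & ~s->evtchn_mask[w];
            while (bits) {
                unsigned int b = __builtin_ctzl(bits);
                bits &= bits - 1;

                /* Clear the pending bit atomically: Xen may be setting
                 * other bits in the same word concurrently. */
                __atomic_fetch_and(&s->evtchn_pending[w], ~(1UL << b),
                                   __ATOMIC_ACQ_REL);
                handle_port(w * BITS_PER_LONG + b);
            }
        }
    }

Not rocket science, but it is more than a bare-metal app that only wants a
doorbell should have to carry, which is why a plain PPI or SGI looks
attractive.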

I think it is probably best to leave notifications to the future.


> > There is no frontend and backend (as in the usual Xen meaning). In some
> > configurations one app might be more critical than the other, but in
> > some other scenarios they might have the same criticality.
> 
> Yes.
> 
> > If, as Jan pointed out, we need to call out explicitly which is the
> > frontend and which is the backend for page ownership reasons, then I
> > suggested we expose that configuration to the user, so that she can
> > choose.
> 
> Indeed.
> 
> ISTM that this scenario doesn't depend on new hypervisor
> functionality.  The toolstack could set up the appropriate page
> sharing (presumably, this would be done with grants so that the result
> is like something the guests could have done themselves.)

Right, I don't think we need new hypervisor functionalities. I don't
have an opinion on whether it should be done with grants or with other
hypercalls, although I have the feeling that it might be more difficult
to achieve with grants. As long as it works... :-)


> I see no objection to the libxl domain configuration file naming
> guest-physical addresses for use in this way.
>
> One problem, though, is to do with startup order.  To do this in the
> most natural way, one would want to start both guests at once so that
> one would know their domids etc.  (Also that avoids questions like
> `what if one of them crashes'...)
> 
> I'm not sure exactly how to fit this into the libxl model, which
> mostly talks about one guest domain at a time; and each guest config
> talks about persistent resources, rather than resources which are
> somehow exposed by a particular guest.
> 
> I think this question is worth exploring to see what shape the right
> solution is.

You are right that it would make sense to start both domains together
but, to avoid confusion, I would stick with one config file per VM. I
would still require the user to issue "xl create" twice to start the two
guests.

If we demand the user to specify the domain that provides the memory,
then we establish a startup order naturally: the user needs to create
the memory sharing domain first, and the memory mapping domain second.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-22 21:14                 ` Stefano Stabellini
@ 2017-05-23 15:13                   ` Edgar E. Iglesias
  2017-05-23 17:29                     ` Zhongze Liu
  0 siblings, 1 reply; 31+ messages in thread
From: Edgar E. Iglesias @ 2017-05-23 15:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Zhongze Liu, Ian Jackson, edgari, Julien Grall,
	xen-devel, Jarvis.Roach

On Mon, May 22, 2017 at 02:14:41PM -0700, Stefano Stabellini wrote:
> On Mon, 22 May 2017, Ian Jackson wrote:
> > Stefano Stabellini writes ("Re: Proposal to allow setting up shared memory areas between VMs from xl config file"):
> > > In this scenario, she is going to write to the VM config files of the
> > > two apps that one page will be shared among the two, so that they can
> > > send each other messages. She will hard-code the address of the shared
> > > page in her "bare-metal" app.
> > 
> > Thanks.  This makes some sense.
> > 
> > How do these apps expect to interrupt each other, or do they poll in
> > the shared memory ?  What I'm getting at with this question is that
> > perhaps some event channels will need setting up.
> 
> As a matter of fact, I have been asking myself the same question. Nobody
> asked me explicitly for notifications support, so I didn't include it in
> the original project definition (it can always be added later) but I
> think it would be useful.
> 
> Edgar, Jarvis, do you have an opinion on this? Do (software) interrupts
> need to be setup together with the shared memory region to send
> notifications back and forth between the two guests, or are they
> unnecessary because the apps do polling anyway?

Hi Stefano,

Sorry, I haven't been following this thread in detail.

The requests I've heard of so far involve:

1. Static setup of memory/pages at a given guest physical address.
   Preferably by allowing the setup to control the real physical
   address as well (e.g., to select on-chip memories as the backing).
   Bonus for an option to let Xen dynamically allocate the physical
   memory.

   Preferably avoiding the need for hypercalls and such, as the guest
   may be an unmodified program that runs natively (without Xen).

2. Interrupts would be done by means of IPIs, as if running natively
   on HW. Either by some dedicated IP device or by using GIC PPIs/SGIs
   to raise interrupts on other cores. PPIs are a bit awkward as
   they conflict with the Xen model of multi-core intra-guest IPIs,
   as opposed to inter-guest IPIs. SGIs across guests could work.


> 
> Event channels are not as complex as grants, but they are not trivial
> either. Guests need to support the full event channel ABI even just to
> receive notifications from one event channel only (because they need to
> clear the pending bits), which is not simple and increases latency to
> de-multiplex events. See drivers/xen/events/events_2l.c and
> drivers/xen/events/events_base.c in Linux. I think we would have to
> introduce a simpler model, where each "notification channel" is not
> implemented by an event channel, but by a PPI or SGI instead. We expect
> only one or two to be used. PPIs and SGIs are interrupt classes on ARM,
> it is possible to allocate one or more for notification usage.

Yes, I wrote too fast, you're getting to the same point here...



> 
> I think it is probably best to leave notifications to the future.

Perhaps yes.

In the ZynqMP case, as a first step, we can use the dedicated IPI blocks.
It would simply involve mapping IRQs and memory regions to the various
guests, and they would be able to raise interrupts to each other by memory
writes to the IPI devices. Xen doesn't need to be involved any further.
This should already work today.
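
Roughly speaking, the guest configs would only need the existing "iomem"
and "irqs" options, e.g. something like the following (the page frame
number and interrupt number are placeholders, not the real ZynqMP values):

    # guest .cfg: pass through one IPI channel's registers and its interrupt
    iomem = [ "ff310,1" ]    # one 4K page of IPI registers (placeholder pfn)
    irqs  = [ 65 ]           # the SPI wired to that IPI channel (placeholder)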

Cheers,
Edgar


> 
> 
> > > There is no frontend and backend (as in the usual Xen meaning). In some
> > > configurations one app might be more critical than the other, but in
> > > some other scenarios they might have the same criticality.
> > 
> > Yes.
> > 
> > > If, as Jan pointed out, we need to call out explicitly which is the
> > > frontend and which is the backend for page ownership reasons, then I
> > > suggested we expose that configuration to the user, so that she can
> > > choose.
> > 
> > Indeed.
> > 
> > ISTM that this scenario doesn't depend on new hypervisor
> > functionality.  The toolstack could set up the appropriate page
> > sharing (presumably, this would be done with grants so that the result
> > is like something the guests could have done themselves.)
> 
> Right, I don't think we need new hypervisor functionalities. I don't
> have an opinion on whether it should be done with grants or with other
> hypercalls, although I have the feeling that it might be more difficult
> to achieve with grants. As long as it works... :-)
> 
> 
> > I see no objection to the libxl domain configuration file naming
> > guest-physical addresses for use in this way.
> >
> > One problem, though, is to do with startup order.  To do this in the
> > most natural way, one would want to start both guests at once so that
> > one would know their domids etc.  (Also that avoids questions like
> > `what if one of them crashes'...)
> > 
> > I'm not sure exactly how to fit this into the libxl model, which
> > mostly talks about one guest domain at a time; and each guest config
> > talks about persistent resources, rather than resources which are
> > somehow exposed by a particular guest.
> > 
> > I think this question is worth exploring to see what shape the right
> > solution is.
> 
> You are right that it would make sense to start both domains together
> but, to avoid confusion, I would stick with one config file per VM. I
> would still require the user to issue "xl create" twice to start the two
> guests.
> 
> If we demand the user to specify the domain that provides the memory,
> then we establish a startup order naturally: the user needs to create
> the memory sharing domain first, and the memory mapping domain second.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-23 15:13                   ` Edgar E. Iglesias
@ 2017-05-23 17:29                     ` Zhongze Liu
  2017-05-23 17:41                       ` Stefano Stabellini
  2017-05-30 10:53                       ` Wei Liu
  0 siblings, 2 replies; 31+ messages in thread
From: Zhongze Liu @ 2017-05-23 17:29 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Stefano Stabellini, Wei Liu, Ian Jackson, edgari, Julien Grall,
	xen-devel, Jarvis.Roach

Hi there,

Thanks for your comments. They are all very insightful.

Having read through the discussions so far, I can draw the
following conclusions:

0. The syntax and semantics of the new config option should be more clearly
    defined. And this actually depends on the following:

1. If we're going to make this project arch-neutral, then the manifestation
    of the frame numbers should also be arch-neutral;

2. Attributes such as readability, writability (and maybe even
    cacheability and shareability) should be able to be specified in the
    config files;

3. We should allow users to specify the mfn of the shared pages as well (but
    I think maybe we could just specify a memory zone (if we're dealing
    with NUMA) instead of a specific physical address);

4. (maybe in the future) Set up a notification channel between domains
    that are communicating through shared memory regions. The
    channel could be built upon PPI or SGI;

5. Clarify the ownership of shared pages, and thus the creation order.
    I think I've already clarified this in my proposal -- we just treat
    the first domain created with the shared memory range as the owner.
    But specifying this in the config file also makes sense. Then we'll
    have to enforce that the owner is created prior to all the "clients"
    (a possible combined syntax covering these points is sketched below).
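
To make this a bit more concrete, one possible shape for the combined
syntax could be something like the following (purely illustrative -- none
of the key names or values are final):

    # In the owner's config file (the domain created first; pages come from it):
    static_shared_mem = [ "id=ID1, begin=0x40000000, end=0x40001000, prot=rw, cache=wb, role=owner" ]

    # In a client's config file (created later; it maps the owner's pages):
    static_shared_mem = [ "id=ID1, begin=0x48000000, end=0x48001000, prot=ro, cache=wb" ]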

After talking to Stefano, I'll first try to come up with a tiny but
working prototype and do some evaluations on it. I'm not ruling out the
possibility of using the grants API as the final implementation choice.

This list might be incomplete. Please tell me if I missed or misunderstood any
important information.


Cheers,


Zhongze Liu


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-23 17:29                     ` Zhongze Liu
@ 2017-05-23 17:41                       ` Stefano Stabellini
  2017-05-30 10:53                       ` Wei Liu
  1 sibling, 0 replies; 31+ messages in thread
From: Stefano Stabellini @ 2017-05-23 17:41 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: Edgar E. Iglesias, Stefano Stabellini, Wei Liu, Ian Jackson,
	edgari, Julien Grall, xen-devel, Jarvis.Roach

On Wed, 24 May 2017, Zhongze Liu wrote:
> Hi there,
> 
> Thanks for your comments. They are all very insightful.
> 
> Having read through the discussions so far, I can draw the
> following conclusions:
> 
> 0. The syntax and semantics of the new config option should be more clearlly
>     defined. And this actually depends on the following:
> 
> 1. If we're going to make this project arch-neutral, then the manifestation
>     of the frame numbers should also be arch-neutral;
> 
> 2. Attributes such as readability, writability (and maybe even
>     cacheability and shareability) should be able to be specified in the
>     config files
> 
> 3. We should allow users to specify the mfn of the shared pages as well. (but
>     I think maybe we could just specify a memory zone (if we're dealing
>     with NUMA) instead of a specific physical address).
> 
> 4. (maybe in the future) Set up a notification channel between domains
>     who are communicating through shared memory regions. The
>    channel could be built upon PPI or SGI.
> 
> 5. Clearify the ownership of shared pages, and thus the creation order.
>     I thinks I've already clearify this in my proposal -- we just
> treat the first domain
>     created with the shared memory range as the owner.
>     But specifying  this in the config file also makes sense. Then we'll have to
>     enforced that the owner is created prior to all the "clients".

Right, I think we have to be explicit about this to avoid scenarios
where it is not known deterministically which domain is created first.
Imagine if the user gets the idea of doing:

  xl create dom1 &
  xl create dom2 &


> After talking to stefano, I'll try to first come out with a tiny but
> working prototype
> and do some evaluations on it. I'll not fully deny the possibility of using
> the grants api as the final implementation choice.
> 
> This list might be incomplete. Please tell me if I missed or misunderstand any
> important information.

Thanks Zhongze!


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-23 17:29                     ` Zhongze Liu
  2017-05-23 17:41                       ` Stefano Stabellini
@ 2017-05-30 10:53                       ` Wei Liu
  2017-05-31  1:55                         ` Zhongze Liu
  1 sibling, 1 reply; 31+ messages in thread
From: Wei Liu @ 2017-05-30 10:53 UTC (permalink / raw)
  To: Zhongze Liu
  Cc: Edgar E. Iglesias, Stefano Stabellini, Wei Liu, Ian Jackson,
	edgari, Julien Grall, xen-devel, Jarvis.Roach

Sorry for not replying earlier. I was on vacation.

On Wed, May 24, 2017 at 01:29:59AM +0800, Zhongze Liu wrote:
> Hi there,
> 
> Thanks for your comments. They are all very insightful.
> 
> Having read through the discussions so far, I can draw the
> following conclusions:
> 
> 0. The syntax and semantics of the new config option should be more clearlly
>     defined. And this actually depends on the following:
> 
> 1. If we're going to make this project arch-neutral, then the manifestation
>     of the frame numbers should also be arch-neutral;
> 
> 2. Attributes such as readability, writability (and maybe even
>     cacheability and shareability) should be able to be specified in the
>     config files
> 
> 3. We should allow users to specify the mfn of the shared pages as well. (but
>     I think maybe we could just specify a memory zone (if we're dealing
>     with NUMA) instead of a specific physical address).
> 
> 4. (maybe in the future) Set up a notification channel between domains
>     who are communicating through shared memory regions. The
>    channel could be built upon PPI or SGI.
> 
> 5. Clearify the ownership of shared pages, and thus the creation order.
>     I thinks I've already clearify this in my proposal -- we just
> treat the first domain
>     created with the shared memory range as the owner.
>     But specifying  this in the config file also makes sense. Then we'll have to
>     enforced that the owner is created prior to all the "clients".
> 
> After talking to stefano, I'll try to first come out with a tiny but
> working prototype
> and do some evaluations on it. I'll not fully deny the possibility of using
> the grants api as the final implementation choice.
> 

I think this is a reasonable plan.


* Re: Proposal to allow setting up shared memory areas between VMs from xl config file
  2017-05-30 10:53                       ` Wei Liu
@ 2017-05-31  1:55                         ` Zhongze Liu
  0 siblings, 0 replies; 31+ messages in thread
From: Zhongze Liu @ 2017-05-31  1:55 UTC (permalink / raw)
  To: Wei Liu, Stefano Stabellini
  Cc: Edgar E. Iglesias, Ian Jackson, edgari, Julien Grall, xen-devel,
	Jarvis Roach


Hi Wei Liu,

Thanks for your review and approval.

Then I'll revise my proposal according to these conclusions.

Cheers,

Zhongze Liu


Thread overview: 31+ messages
2017-05-12 17:01 Proposal to allow setting up shared memory areas between VMs from xl config file Zhongze Liu
2017-05-12 17:51 ` Wei Liu
2017-05-13  2:28   ` Zhongze Liu
2017-05-15 16:46     ` Wei Liu
2017-05-18 18:09       ` Stefano Stabellini
2017-05-19  6:59         ` Jan Beulich
2017-05-19  9:33         ` Wei Liu
2017-05-19 17:14           ` Zhongze Liu
2017-05-19 17:46             ` Stefano Stabellini
2017-05-19 18:19               ` Jarvis Roach
2017-05-19 19:08                 ` Stefano Stabellini
2017-05-22 13:49               ` Ian Jackson
2017-05-22 21:14                 ` Stefano Stabellini
2017-05-23 15:13                   ` Edgar E. Iglesias
2017-05-23 17:29                     ` Zhongze Liu
2017-05-23 17:41                       ` Stefano Stabellini
2017-05-30 10:53                       ` Wei Liu
2017-05-31  1:55                         ` Zhongze Liu
2017-05-15  8:08 ` Jan Beulich
2017-05-15  8:20   ` Julien Grall
2017-05-15  8:52     ` Jan Beulich
2017-05-15 10:21       ` Julien Grall
2017-05-15 12:28         ` Jan Beulich
2017-05-15 14:13           ` Julien Grall
2017-05-15 14:25             ` Jan Beulich
2017-05-15 17:40           ` Stefano Stabellini
2017-05-16 10:11             ` Jan Beulich
2017-05-16 18:16               ` Stefano Stabellini
2017-05-19 16:52                 ` Zhongze Liu
2017-05-16 11:04             ` Ian Jackson
2017-05-16 18:08               ` Stefano Stabellini
