* Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
@ 2017-01-09 11:36 Razvan Cojocaru
  2017-01-09 11:52 ` Jan Beulich
  2017-01-09 12:54 ` Andrew Cooper
  0 siblings, 2 replies; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-09 11:36 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Tamas K Lengyel

Hello,

We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
to eat up all the RAM it can:

(XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544

This leads to a problem with xen-access, specifically libxc which does
this in xc_vm_event_enable() (this is Xen 4.6):

ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
                                 &mmap_pfn, 1);

if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
{
    /* Map failed, populate ring page */
    rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
                                               &ring_pfn);
    if ( rc1 != 0 )
    {
        PERROR("Failed to populate ring pfn\n");
        goto out;
    }
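
    /*
     * (Rest of the snippet paraphrased from memory for context - the exact
     * code may differ slightly: after a successful populate, the pfn is
     * simply mapped again.)
     */
    mmap_pfn = ring_pfn;
    ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
                                     &mmap_pfn, 1);
    if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
    {
        PERROR("Could not map the ring page\n");
        goto out;
    }
}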

The first time everything works fine, xen-access can map the ring page.
But most of the time the second attempt fails in the
xc_domain_populate_physmap_exact() call, and again this is dumped in the
Xen log (once for each failed attempt):

(XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544

This is the only guest we've seen so far doing this. All other HVM
guests (Linux, Windows) behave.

We've tried setting max_pfn and mem as kernel parameters for the guest,
and even setting HVM-shadow-multiplier from XenCenter to 10, but none of
this has made any difference.

Is this something that anyone else has encountered? Any suggestions
appreciated.


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 11:36 Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail Razvan Cojocaru
@ 2017-01-09 11:52 ` Jan Beulich
  2017-01-09 12:01   ` Razvan Cojocaru
  2017-01-09 12:04   ` Andrew Cooper
  2017-01-09 12:54 ` Andrew Cooper
  1 sibling, 2 replies; 16+ messages in thread
From: Jan Beulich @ 2017-01-09 11:52 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, Tamas K Lengyel, xen-devel

>>> On 09.01.17 at 12:36, <rcojocaru@bitdefender.com> wrote:
> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
> to eat up all the RAM it can:
> 
> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
> 
> This leads to a problem with xen-access, specifically libxc which does
> this in xc_vm_event_enable() (this is Xen 4.6):
> 
> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>                                  &mmap_pfn, 1);
> 
> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
> {
>     /* Map failed, populate ring page */
>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>                                                &ring_pfn);
>     if ( rc1 != 0 )
>     {
>         PERROR("Failed to populate ring pfn\n");
>         goto out;
>     }
> 
> The first time everything works fine, xen-access can map the ring page.
> But most of the time the second time fails in the
> xc_domain_populate_physmap_exact() call, and again this is dumped in the
> Xen log (once for each failed attempt):
> 
> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544

I don't think there's any weirdness here - if the guest ballooned
itself to the exact boundary it is permitted to allocate, there's
no way for another page to be allocated for it, no matter whether
that's being requested by the guest itself or the tool stack. Before
thinking of possible solutions, could you remind me why it is that
the ring page gets put in guest pfn space in the first place? Isn't
the ring used for communication between tool stack and hypervisor?
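
(As a quick way to confirm this from dom0 - a minimal, untested sketch
using xc_domain_getinfo(), with the field names quoted from memory - one
can compare the domain's current allocation against its limit:)

#include <stdio.h>
#include <xenctrl.h>

/* Print how many pages of allocation headroom a domain has left. */
static int print_headroom(uint32_t domid)
{
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    xc_dominfo_t info;
    int ret = -1;

    if ( !xch )
        return -1;

    /* Ask for exactly one domain's info, starting at domid. */
    if ( xc_domain_getinfo(xch, domid, 1, &info) == 1 && info.domid == domid )
    {
        unsigned long max_pages = info.max_memkb >> 2; /* KiB -> 4k pages */

        printf("d%u: %lu pages allocated, limit %lu, headroom %ld\n",
               domid, info.nr_pages, max_pages,
               (long)(max_pages - info.nr_pages));
        ret = 0;
    }

    xc_interface_close(xch);
    return ret;
}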

Jan



* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 11:52 ` Jan Beulich
@ 2017-01-09 12:01   ` Razvan Cojocaru
  2017-01-09 12:04   ` Andrew Cooper
  1 sibling, 0 replies; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-09 12:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Tamas K Lengyel, xen-devel

On 01/09/2017 01:52 PM, Jan Beulich wrote:
>>>> On 09.01.17 at 12:36, <rcojocaru@bitdefender.com> wrote:
>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>> to eat up all the RAM it can:
>>
>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>
>> This leads to a problem with xen-access, specifically libxc which does
>> this in xc_vm_event_enable() (this is Xen 4.6):
>>
>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>                                  &mmap_pfn, 1);
>>
>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>> {
>>     /* Map failed, populate ring page */
>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                                &ring_pfn);
>>     if ( rc1 != 0 )
>>     {
>>         PERROR("Failed to populate ring pfn\n");
>>         goto out;
>>     }
>>
>> The first time everything works fine, xen-access can map the ring page.
>> But most of the time the second time fails in the
>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>> Xen log (once for each failed attempt):
>>
>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
> 
> I don't think there's any weirdness here - if the guest ballooned
> itself to the exact boundary it is permitted to allocate, there's
> no way for another page to be allocated for it, no matter whether
> that's being requested by the guest itself or the tool stack. Before
> thinking of possible solutions, could you remind me why it is that
> the ring page gets put in guest pfn space in the first place? Isn't
> the ring used for communication between tool stack and hypervisor?

I couldn't say what the design reasons for putting the ring in guest
memory were - it's been that way since before my (and Tamas') time with
Xen, and I think we both thought there were good reasons for it. I agree
that having it in guest memory is less than ideal (as it has proven to be).


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 11:52 ` Jan Beulich
  2017-01-09 12:01   ` Razvan Cojocaru
@ 2017-01-09 12:04   ` Andrew Cooper
  2017-01-09 19:18     ` Tamas K Lengyel
  1 sibling, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2017-01-09 12:04 UTC (permalink / raw)
  To: Jan Beulich, Razvan Cojocaru; +Cc: xen-devel, Tamas K Lengyel

On 09/01/17 11:52, Jan Beulich wrote:
>>>> On 09.01.17 at 12:36, <rcojocaru@bitdefender.com> wrote:
>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>> to eat up all the RAM it can:
>>
>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>
>> This leads to a problem with xen-access, specifically libxc which does
>> this in xc_vm_event_enable() (this is Xen 4.6):
>>
>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>                                  &mmap_pfn, 1);
>>
>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>> {
>>     /* Map failed, populate ring page */
>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                                &ring_pfn);
>>     if ( rc1 != 0 )
>>     {
>>         PERROR("Failed to populate ring pfn\n");
>>         goto out;
>>     }
>>
>> The first time everything works fine, xen-access can map the ring page.
>> But most of the time the second time fails in the
>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>> Xen log (once for each failed attempt):
>>
>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
> I don't think there's any weirdness here - if the guest ballooned
> itself to the exact boundary it is permitted to allocate, there's
> no way for another page to be allocated for it, no matter whether
> that's being requested by the guest itself or the tool stack. Before
> thinking of possible solutions, could you remind me why it is that
> the ring page gets put in guest pfn space in the first place? Isn't
> the ring used for communication between tool stack and hypervisor?

Because there is no other API available for doing rings like this.

IMO, it is and always was a mistake to ever have rings like this in GFN
space, but that ship has sailed (and come back with several XSAs over
the years).  (Apparently, it was done this way originally so the RAM the
ring took up was accounted to the domain, but there are easy ways to get
the accounting correct without the attack surfaces).

I have some very vague plans to introduce a new mapping API, for frames
which mustn’t be accessible to the guest, but also to support exporting
stats from Xen via shared memory rather than hypercall.  I should
probably see about writing up a design for this, and seeing if someone
has time to look into it.

~Andrew


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 11:36 Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail Razvan Cojocaru
  2017-01-09 11:52 ` Jan Beulich
@ 2017-01-09 12:54 ` Andrew Cooper
  2017-01-10  9:06   ` Razvan Cojocaru
  2017-01-10  9:45   ` Razvan Cojocaru
  1 sibling, 2 replies; 16+ messages in thread
From: Andrew Cooper @ 2017-01-09 12:54 UTC (permalink / raw)
  To: Razvan Cojocaru, xen-devel; +Cc: Tamas K Lengyel

On 09/01/17 11:36, Razvan Cojocaru wrote:
> Hello,
>
> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
> to eat up all the RAM it can:
>
> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>
> This leads to a problem with xen-access, specifically libxc which does
> this in xc_vm_event_enable() (this is Xen 4.6):
>
> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>                                  &mmap_pfn, 1);
>
> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
> {
>     /* Map failed, populate ring page */
>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>                                                &ring_pfn);
>     if ( rc1 != 0 )
>     {
>         PERROR("Failed to populate ring pfn\n");
>         goto out;
>     }
>
> The first time everything works fine, xen-access can map the ring page.
> But most of the time the second time fails in the
> xc_domain_populate_physmap_exact() call, and again this is dumped in the
> Xen log (once for each failed attempt):
>
> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544

Thinking further about this, what happens if you avoid removing the page
on exit?

The first populate succeeds, and if you leave the page populated, the
second time you come around the loop, it should not be of type XTAB, and
the map should succeed.

~Andrew


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 12:04   ` Andrew Cooper
@ 2017-01-09 19:18     ` Tamas K Lengyel
  0 siblings, 0 replies; 16+ messages in thread
From: Tamas K Lengyel @ 2017-01-09 19:18 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Razvan Cojocaru, Jan Beulich


On Mon, Jan 9, 2017 at 5:04 AM, Andrew Cooper <andrew.cooper3@citrix.com>
wrote:

> On 09/01/17 11:52, Jan Beulich wrote:
> >>>> On 09.01.17 at 12:36, <rcojocaru@bitdefender.com> wrote:
> >> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
> >> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
> >> to eat up all the RAM it can:
> >>
> >> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
> >>
> >> This leads to a problem with xen-access, specifically libxc which does
> >> this in xc_vm_event_enable() (this is Xen 4.6):
> >>
> >> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
> >>                                  &mmap_pfn, 1);
> >>
> >> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
> >> {
> >>     /* Map failed, populate ring page */
> >>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
> >>                                                &ring_pfn);
> >>     if ( rc1 != 0 )
> >>     {
> >>         PERROR("Failed to populate ring pfn\n");
> >>         goto out;
> >>     }
> >>
> >> The first time everything works fine, xen-access can map the ring page.
> >> But most of the time the second time fails in the
> >> xc_domain_populate_physmap_exact() call, and again this is dumped in
> the
> >> Xen log (once for each failed attempt):
> >>
> >> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
> > I don't think there's any weirdness here - if the guest ballooned
> > itself to the exact boundary it is permitted to allocate, there's
> > no way for another page to be allocated for it, no matter whether
> > that's being requested by the guest itself or the tool stack. Before
> > thinking of possible solutions, could you remind me why it is that
> > the ring page gets put in guest pfn space in the first place? Isn't
> > the ring used for communication between tool stack and hypervisor?
>
> Because there is no other API available for doing rings like this.
>
> IMO, it is and always was a mistake to ever have rings like this in GFN
> space, but that ship has sailed (and come back with several XSAs over
> the years).  (Apparently, it was done this way originally so the RAM the
> ring took up was accounted to the domain, but there are easy ways to get
> the accounting correct without the attack surfaces).
>
> I have some very vague plans to introduce a new mapping API, for frames
> which mustn’t be accessible to the guest, but also to support exporting
> stats from Xen via shared memory rather than hypercall.  I should
> probably see about writing up a design for this, and seeing if someone
> has time to look into it.
>

+1, definitely let us know - this would be highly desirable for vm_event. It
would also make implementing multi-page rings a lot easier if we didn't
have to muck with magic pfns in the guest physmap.

Tamas


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 12:54 ` Andrew Cooper
@ 2017-01-10  9:06   ` Razvan Cojocaru
  2017-01-10 14:13     ` Andrew Cooper
  2017-01-10  9:45   ` Razvan Cojocaru
  1 sibling, 1 reply; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10  9:06 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 01/09/2017 02:54 PM, Andrew Cooper wrote:
> On 09/01/17 11:36, Razvan Cojocaru wrote:
>> Hello,
>>
>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>> to eat up all the RAM it can:
>>
>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>
>> This leads to a problem with xen-access, specifically libxc which does
>> this in xc_vm_event_enable() (this is Xen 4.6):
>>
>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>                                  &mmap_pfn, 1);
>>
>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>> {
>>     /* Map failed, populate ring page */
>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                                &ring_pfn);
>>     if ( rc1 != 0 )
>>     {
>>         PERROR("Failed to populate ring pfn\n");
>>         goto out;
>>     }
>>
>> The first time everything works fine, xen-access can map the ring page.
>> But most of the time the second time fails in the
>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>> Xen log (once for each failed attempt):
>>
>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
> 
> Thinking further about this, what happens if you avoid removing the page
> on exit?
> 
> The first populate succeeds, and if you leave the page populated, the
> second time you come around the loop, it should not be of type XTAB, and
> the map should succeed.

Sorry for the late reply, had to put out another fire yesterday.

I've taken your recommendation to roughly mean this:

diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index ba9690a..805564b 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -100,8 +100,11 @@ static int vm_event_enable(
     return 0;

  err:
+    /*
     destroy_ring_for_helper(&ved->ring_page,
                             ved->ring_pg_struct);
+    */
+    ved->ring_page = NULL;
     vm_event_ring_unlock(ved);

     return rc;
@@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d, struct vm_event_domain *ved)
             }
         }

+        /*
         destroy_ring_for_helper(&ved->ring_page,
                                 ved->ring_pg_struct);
+       */

+        ved->ring_page = NULL;
         vm_event_cleanup_domain(d);

         vm_event_ring_unlock(ved);

but this unfortunately still fails to map the page the second time. Do
you mean to simply no longer munmap() the ring page from libxc / the
client application?


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-09 12:54 ` Andrew Cooper
  2017-01-10  9:06   ` Razvan Cojocaru
@ 2017-01-10  9:45   ` Razvan Cojocaru
  1 sibling, 0 replies; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10  9:45 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 01/09/2017 02:54 PM, Andrew Cooper wrote:
> On 09/01/17 11:36, Razvan Cojocaru wrote:
>> Hello,
>>
>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>> to eat up all the RAM it can:
>>
>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>
>> This leads to a problem with xen-access, specifically libxc which does
>> this in xc_vm_event_enable() (this is Xen 4.6):
>>
>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>                                  &mmap_pfn, 1);
>>
>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>> {
>>     /* Map failed, populate ring page */
>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                                &ring_pfn);
>>     if ( rc1 != 0 )
>>     {
>>         PERROR("Failed to populate ring pfn\n");
>>         goto out;
>>     }
>>
>> The first time everything works fine, xen-access can map the ring page.
>> But most of the time the second time fails in the
>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>> Xen log (once for each failed attempt):
>>
>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
> 
> Thinking further about this, what happens if you avoid removing the page
> on exit?
> 
> The first populate succeeds, and if you leave the page populated, the
> second time you come around the loop, it should not be of type XTAB, and
> the map should succeed.

While we consider a long-term source-level solution, could there be a
workaround we haven't thought to try yet that would prevent the guest
from hogging all the available memory?


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10  9:06   ` Razvan Cojocaru
@ 2017-01-10 14:13     ` Andrew Cooper
  2017-01-10 15:02       ` Razvan Cojocaru
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2017-01-10 14:13 UTC (permalink / raw)
  To: Razvan Cojocaru, xen-devel; +Cc: Tamas K Lengyel

On 10/01/17 09:06, Razvan Cojocaru wrote:
> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>> Hello,
>>>
>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>> to eat up all the RAM it can:
>>>
>>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>
>>> This leads to a problem with xen-access, specifically libxc which does
>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>
>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>                                  &mmap_pfn, 1);
>>>
>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>> {
>>>     /* Map failed, populate ring page */
>>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>                                                &ring_pfn);
>>>     if ( rc1 != 0 )
>>>     {
>>>         PERROR("Failed to populate ring pfn\n");
>>>         goto out;
>>>     }
>>>
>>> The first time everything works fine, xen-access can map the ring page.
>>> But most of the time the second time fails in the
>>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>>> Xen log (once for each failed attempt):
>>>
>>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>> Thinking further about this, what happens if you avoid removing the page
>> on exit?
>>
>> The first populate succeeds, and if you leave the page populated, the
>> second time you come around the loop, it should not be of type XTAB, and
>> the map should succeed.
> Sorry for the late reply, had to put out another fire yesterday.
>
> I've taken your recommendation to roughly mean this:
>
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index ba9690a..805564b 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -100,8 +100,11 @@ static int vm_event_enable(
>      return 0;
>
>   err:
> +    /*
>      destroy_ring_for_helper(&ved->ring_page,
>                              ved->ring_pg_struct);
> +    */
> +    ved->ring_page = NULL;
>      vm_event_ring_unlock(ved);
>
>      return rc;
> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d,
> struct vm_event_domain *ved)
>              }
>          }
>
> +        /*
>          destroy_ring_for_helper(&ved->ring_page,
>                                  ved->ring_pg_struct);
> +       */
>
> +        ved->ring_page = NULL;
>          vm_event_cleanup_domain(d);
>
>          vm_event_ring_unlock(ved);
>
> but this unfortunately still fails to map the page the second time. Do
> you mean to simply no longer munmap() the ring page from libxc / the
> client application?

Neither.

First of all, I notice that this is probably buggy:

    ring_pfn = pfn;
    mmap_pfn = pfn;
    rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
    if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
    {
        /* Page not in the physmap, try to populate it */
        rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
                                              &ring_pfn);
        if ( rc1 != 0 )
        {
            PERROR("Failed to populate ring pfn\n");
            goto out;
        }
    }

A failure of xc_get_pfn_type_batch() is not a suggestion that population
might work.
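
(One way to tighten that check might be something along these lines -
just a sketch, untested, with the PFINFO macros quoted from memory:)

    rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
    if ( rc1 )
    {
        PERROR("Failed to get the ring pfn's type\n");
        goto out;
    }

    if ( (mmap_pfn & XEN_DOMCTL_PFINFO_LTAB_MASK) == XEN_DOMCTL_PFINFO_XTAB )
    {
        /* The page really is absent from the physmap - populate it. */
        rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
                                               &ring_pfn);
        if ( rc1 != 0 )
        {
            PERROR("Failed to populate ring pfn\n");
            goto out;
        }
    }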


What I meant was taking out this call:

    /* Remove the ring_pfn from the guest's physmap */
    rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, &ring_pfn);
    if ( rc1 != 0 )
        PERROR("Failed to remove ring page from guest physmap");

To leave the frame in the guest physmap.  The issue is fundamentally
that after this frame has been taken out, something kicks the VM to
realise it has an extra frame of balloonable space, which it clearly
compensates for.

You can work around the added attack surface by marking it RO in EPT;
neither Xen's nor dom0's mappings are translated via EPT, so they can
still make updates, but the guest won't be able to write to it.

I should say that this is all a gross hack, and is in desperate need of
a proper API to make rings entirely outside of the gfn space, but this
hack should work for now.

~Andrew


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 14:13     ` Andrew Cooper
@ 2017-01-10 15:02       ` Razvan Cojocaru
  2017-01-10 15:11         ` Andrew Cooper
  0 siblings, 1 reply; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10 15:02 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

On 01/10/2017 04:13 PM, Andrew Cooper wrote:
> On 10/01/17 09:06, Razvan Cojocaru wrote:
>> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>>> Hello,
>>>>
>>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>>> to eat up all the RAM it can:
>>>>
>>>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>>
>>>> This leads to a problem with xen-access, specifically libxc which does
>>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>>
>>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>>                                  &mmap_pfn, 1);
>>>>
>>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>>> {
>>>>     /* Map failed, populate ring page */
>>>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>>                                                &ring_pfn);
>>>>     if ( rc1 != 0 )
>>>>     {
>>>>         PERROR("Failed to populate ring pfn\n");
>>>>         goto out;
>>>>     }
>>>>
>>>> The first time everything works fine, xen-access can map the ring page.
>>>> But most of the time the second time fails in the
>>>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>>>> Xen log (once for each failed attempt):
>>>>
>>>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>>> Thinking further about this, what happens if you avoid removing the page
>>> on exit?
>>>
>>> The first populate succeeds, and if you leave the page populated, the
>>> second time you come around the loop, it should not be of type XTAB, and
>>> the map should succeed.
>> Sorry for the late reply, had to put out another fire yesterday.
>>
>> I've taken your recommendation to roughly mean this:
>>
>> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
>> index ba9690a..805564b 100644
>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -100,8 +100,11 @@ static int vm_event_enable(
>>      return 0;
>>
>>   err:
>> +    /*
>>      destroy_ring_for_helper(&ved->ring_page,
>>                              ved->ring_pg_struct);
>> +    */
>> +    ved->ring_page = NULL;
>>      vm_event_ring_unlock(ved);
>>
>>      return rc;
>> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d,
>> struct vm_event_domain *ved)
>>              }
>>          }
>>
>> +        /*
>>          destroy_ring_for_helper(&ved->ring_page,
>>                                  ved->ring_pg_struct);
>> +       */
>>
>> +        ved->ring_page = NULL;
>>          vm_event_cleanup_domain(d);
>>
>>          vm_event_ring_unlock(ved);
>>
>> but this unfortunately still fails to map the page the second time. Do
>> you mean to simply no longer munmap() the ring page from libxc / the
>> client application?
> 
> Neither.
> 
> First of all, I notice that this is probably buggy:
> 
>     ring_pfn = pfn;
>     mmap_pfn = pfn;
>     rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
>     if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>     {
>         /* Page not in the physmap, try to populate it */
>         rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>                                               &ring_pfn);
>         if ( rc1 != 0 )
>         {
>             PERROR("Failed to populate ring pfn\n");
>             goto out;
>         }
>     }
> 
> A failure of xc_get_pfn_type_batch() is not a suggestion that population
> might work.
> 
> 
> What I meant was taking out this call:
> 
>     /* Remove the ring_pfn from the guest's physmap */
>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
> &ring_pfn);
>     if ( rc1 != 0 )
>         PERROR("Failed to remove ring page from guest physmap");
> 
> To leave the frame in the guest physmap.  The issue is fundamentally
> that after this frame has been taken out, something kicks the VM to
> realise it has an extra frame of balloonable space, which it clearly
> compensates for.
> 
> You can work around the added attack surface by marking it RO in EPT;
> neither Xen's nor dom0's mappings are translated via EPT, so they can
> still make updates, but the guest won't be able to write to it.
> 
> I should say that this is all a gross hack, and is in desperate need of
> a proper API to make rings entirely outside of the gfn space, but this
> hack should work for now.

Thanks! So far, it seems to work like a charm with this change:

diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
index 2fef96a..5dd00a6 100644
--- a/tools/libxc/xc_vm_event.c
+++ b/tools/libxc/xc_vm_event.c
@@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t domain_id, int param,
     }

     /* Remove the ring_pfn from the guest's physmap */
+    /*
     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0, &ring_pfn);
     if ( rc1 != 0 )
         PERROR("Failed to remove ring page from guest physmap");
+    */
+
+    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
+    {
+        PERROR("Could not set ring page read-only\n");
+        goto out;
+    }

  out:
     saved_errno = errno;

Should I send this as a patch for mainline as well?


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 15:02       ` Razvan Cojocaru
@ 2017-01-10 15:11         ` Andrew Cooper
  2017-01-10 15:35           ` Razvan Cojocaru
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Cooper @ 2017-01-10 15:11 UTC (permalink / raw)
  To: Razvan Cojocaru, xen-devel; +Cc: Tamas K Lengyel

On 10/01/17 15:02, Razvan Cojocaru wrote:
> On 01/10/2017 04:13 PM, Andrew Cooper wrote:
>> On 10/01/17 09:06, Razvan Cojocaru wrote:
>>> On 01/09/2017 02:54 PM, Andrew Cooper wrote:
>>>> On 09/01/17 11:36, Razvan Cojocaru wrote:
>>>>> Hello,
>>>>>
>>>>> We've come across a weird phenomenon: an Ubuntu 16.04.1 LTS HVM guest
>>>>> running kernel 4.4.0 installed via XenCenter in XenServer Dundee seems
>>>>> to eat up all the RAM it can:
>>>>>
>>>>> (XEN) [  394.379760] d1v1 Over-allocation for domain 1: 524545 > 524544
>>>>>
>>>>> This leads to a problem with xen-access, specifically libxc which does
>>>>> this in xc_vm_event_enable() (this is Xen 4.6):
>>>>>
>>>>> ring_page = xc_map_foreign_batch(xch, domain_id, PROT_READ | PROT_WRITE,
>>>>>                                  &mmap_pfn, 1);
>>>>>
>>>>> if ( mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>>>> {
>>>>>     /* Map failed, populate ring page */
>>>>>     rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>>>>                                                &ring_pfn);
>>>>>     if ( rc1 != 0 )
>>>>>     {
>>>>>         PERROR("Failed to populate ring pfn\n");
>>>>>         goto out;
>>>>>     }
>>>>>
>>>>> The first time everything works fine, xen-access can map the ring page.
>>>>> But most of the time the second time fails in the
>>>>> xc_domain_populate_physmap_exact() call, and again this is dumped in the
>>>>> Xen log (once for each failed attempt):
>>>>>
>>>>> (XEN) [  395.952188] d0v3 Over-allocation for domain 1: 524545 > 524544
>>>> Thinking further about this, what happens if you avoid removing the page
>>>> on exit?
>>>>
>>>> The first populate succeeds, and if you leave the page populated, the
>>>> second time you come around the loop, it should not be of type XTAB, and
>>>> the map should succeed.
>>> Sorry for the late reply, had to put out another fire yesterday.
>>>
>>> I've taken your recommendation to roughly mean this:
>>>
>>> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
>>> index ba9690a..805564b 100644
>>> --- a/xen/common/vm_event.c
>>> +++ b/xen/common/vm_event.c
>>> @@ -100,8 +100,11 @@ static int vm_event_enable(
>>>      return 0;
>>>
>>>   err:
>>> +    /*
>>>      destroy_ring_for_helper(&ved->ring_page,
>>>                              ved->ring_pg_struct);
>>> +    */
>>> +    ved->ring_page = NULL;
>>>      vm_event_ring_unlock(ved);
>>>
>>>      return rc;
>>> @@ -229,9 +232,12 @@ static int vm_event_disable(struct domain *d,
>>> struct vm_event_domain *ved)
>>>              }
>>>          }
>>>
>>> +        /*
>>>          destroy_ring_for_helper(&ved->ring_page,
>>>                                  ved->ring_pg_struct);
>>> +       */
>>>
>>> +        ved->ring_page = NULL;
>>>          vm_event_cleanup_domain(d);
>>>
>>>          vm_event_ring_unlock(ved);
>>>
>>> but this unfortunately still fails to map the page the second time. Do
>>> you mean to simply no longer munmap() the ring page from libxc / the
>>> client application?
>> Neither.
>>
>> First of all, I notice that this is probably buggy:
>>
>>     ring_pfn = pfn;
>>     mmap_pfn = pfn;
>>     rc1 = xc_get_pfn_type_batch(xch, domain_id, 1, &mmap_pfn);
>>     if ( rc1 || mmap_pfn & XEN_DOMCTL_PFINFO_XTAB )
>>     {
>>         /* Page not in the physmap, try to populate it */
>>         rc1 = xc_domain_populate_physmap_exact(xch, domain_id, 1, 0, 0,
>>                                               &ring_pfn);
>>         if ( rc1 != 0 )
>>         {
>>             PERROR("Failed to populate ring pfn\n");
>>             goto out;
>>         }
>>     }
>>
>> A failure of xc_get_pfn_type_batch() is not a suggestion that population
>> might work.
>>
>>
>> What I meant was taking out this call:
>>
>>     /* Remove the ring_pfn from the guest's physmap */
>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>> &ring_pfn);
>>     if ( rc1 != 0 )
>>         PERROR("Failed to remove ring page from guest physmap");
>>
>> To leave the frame in the guest physmap.  The issue is fundamentally
>> that after this frame has been taken out, something kicks the VM to
>> realise it has an extra frame of balloonable space, which it clearly
>> compensates for.
>>
>> You can work around the added attack surface by marking it RO in EPT;
>> neither Xen's nor dom0's mappings are translated via EPT, so they can
>> still make updates, but the guest won't be able to write to it.
>>
>> I should say that this is all a gross hack, and is in desperate need of
>> a proper API to make rings entirely outside of the gfn space, but this
>> hack should work for now.
> Thanks! So far, it seems to work like a charm like this:

Great.

>
> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
> index 2fef96a..5dd00a6 100644
> --- a/tools/libxc/xc_vm_event.c
> +++ b/tools/libxc/xc_vm_event.c
> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t
> domain_id, int param,
>      }
>
>      /* Remove the ring_pfn from the guest's physmap */
> +    /*
>      rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
> &ring_pfn);
>      if ( rc1 != 0 )
>          PERROR("Failed to remove ring page from guest physmap");
> +    */
> +
> +    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
> +    {
> +        PERROR("Could not set ring page read-only\n");
> +        goto out;
> +    }
>
>   out:
>      saved_errno = errno;
>
> Should I send this as a patch for mainline as well?

Probably a good idea, although I would include a code comment explaining
what is going on, because this is subtle if you don't know the context.
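
Something along these lines, perhaps (the wording is only a suggestion):

    /*
     * Deliberately leave the ring page in the guest's physmap: removing it
     * frees up a frame of balloonable space, and a guest ballooned up to
     * its limit (as seen with Ubuntu 16.04) can then make it impossible to
     * re-populate the page on a later xc_vm_event_enable() call.  Instead,
     * revoke the guest's write access via EPT; Xen's and dom0's mappings
     * are not translated through EPT, so they are unaffected.
     */
    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
    {
        PERROR("Could not set ring page read-only\n");
        goto out;
    }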

~Andrew


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 15:11         ` Andrew Cooper
@ 2017-01-10 15:35           ` Razvan Cojocaru
  2017-01-10 16:29             ` Tamas K Lengyel
  0 siblings, 1 reply; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10 15:35 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Tamas K Lengyel

>>> What I meant was taking out this call:
>>>
>>>     /* Remove the ring_pfn from the guest's physmap */
>>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>>> &ring_pfn);
>>>     if ( rc1 != 0 )
>>>         PERROR("Failed to remove ring page from guest physmap");
>>>
>>> To leave the frame in the guest physmap.  The issue is fundamentally
>>> that after this frame has been taken out, something kicks the VM to
>>> realise it has an extra frame of balloonable space, which it clearly
>>> compensates for.
>>>
>>> You can work around the added attack surface by marking it RO in EPT;
>>> neither Xen's nor dom0's mappings are translated via EPT, so they can
>>> still make updates, but the guest won't be able to write to it.
>>>
>>> I should say that this is all a gross hack, and is in desperate need of
>>> a proper API to make rings entirely outside of the gfn space, but this
>>> hack should work for now.
>> Thanks! So far, it seems to work like a charm like this:
> 
> Great.
> 
>>
>> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
>> index 2fef96a..5dd00a6 100644
>> --- a/tools/libxc/xc_vm_event.c
>> +++ b/tools/libxc/xc_vm_event.c
>> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch, domid_t
>> domain_id, int param,
>>      }
>>
>>      /* Remove the ring_pfn from the guest's physmap */
>> +    /*
>>      rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>> &ring_pfn);
>>      if ( rc1 != 0 )
>>          PERROR("Failed to remove ring page from guest physmap");
>> +    */
>> +
>> +    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn, 1) )
>> +    {
>> +        PERROR("Could not set ring page read-only\n");
>> +        goto out;
>> +    }
>>
>>   out:
>>      saved_errno = errno;
>>
>> Should I send this as a patch for mainline as well?
> 
> Probably a good idea, although I would include a code comment explaining
> what is going on, because this is subtle if you don't know the context.

Will do, I'll send a patch out as soon as we've done a few more rounds
of testing.


Thanks again,
Razvan




* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 15:35           ` Razvan Cojocaru
@ 2017-01-10 16:29             ` Tamas K Lengyel
  2017-01-10 16:34               ` Razvan Cojocaru
  0 siblings, 1 reply; 16+ messages in thread
From: Tamas K Lengyel @ 2017-01-10 16:29 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, xen-devel


On Tue, Jan 10, 2017 at 8:35 AM, Razvan Cojocaru <rcojocaru@bitdefender.com>
wrote:

> >>> What I meant was taking out this call:
> >>>
> >>>     /* Remove the ring_pfn from the guest's physmap */
> >>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
> >>> &ring_pfn);
> >>>     if ( rc1 != 0 )
> >>>         PERROR("Failed to remove ring page from guest physmap");
> >>>
> >>> To leave the frame in the guest physmap.  The issue is fundamentally
> >>> that after this frame has been taken out, something kicks the VM to
> >>> realise it has an extra frame of balloonable space, which it clearly
> >>> compensates for.
> >>>
> >>> You can work around the added attack surface by marking it RO in EPT;
> >>> neither Xen's nor dom0's mappings are translated via EPT, so they can
> >>> still make updates, but the guest won't be able to write to it.
> >>>
> >>> I should say that this is all a gross hack, and is in desperate need of
> >>> a proper API to make rings entirely outside of the gfn space, but this
> >>> hack should work for now.
> >> Thanks! So far, it seems to work like a charm like this:
> >
> > Great.
> >
> >>
> >> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
> >> index 2fef96a..5dd00a6 100644
> >> --- a/tools/libxc/xc_vm_event.c
> >> +++ b/tools/libxc/xc_vm_event.c
> >> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch,
> domid_t
> >> domain_id, int param,
> >>      }
> >>
> >>      /* Remove the ring_pfn from the guest's physmap */
> >> +    /*
> >>      rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
> >> &ring_pfn);
> >>      if ( rc1 != 0 )
> >>          PERROR("Failed to remove ring page from guest physmap");
> >> +    */
> >> +
> >> +    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r, mmap_pfn,
> 1) )
> >> +    {
> >> +        PERROR("Could not set ring page read-only\n");
> >> +        goto out;
> >> +    }
> >>
> >>   out:
> >>      saved_errno = errno;
> >>
> >> Should I send this as a patch for mainline as well?
> >
> > Probably a good idea, although I would include a code comment explaining
> > what is going on, because this is subtle if you don't know the context.
>
> Will do, I'll send a patch out as soon as we've done a few more rounds
> of testing.
>

(replying to all): I'm not in favor of this patch mainly because it is not
stealthy. A malicious kernel could easily track what events are being sent
on the ring. With DRAKVUF I could work around this using altp2m
pfn-remapping, but for other tools this can be a serious information
leak.

Tamas


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 16:29             ` Tamas K Lengyel
@ 2017-01-10 16:34               ` Razvan Cojocaru
  2017-01-10 16:40                 ` Tamas K Lengyel
  0 siblings, 1 reply; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10 16:34 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: Andrew Cooper, xen-devel

On 01/10/2017 06:29 PM, Tamas K Lengyel wrote:
> 
> 
> On Tue, Jan 10, 2017 at 8:35 AM, Razvan Cojocaru
> <rcojocaru@bitdefender.com <mailto:rcojocaru@bitdefender.com>> wrote:
> 
>     >>> What I meant was taking out this call:
>     >>>
>     >>>     /* Remove the ring_pfn from the guest's physmap */
>     >>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>     >>> &ring_pfn);
>     >>>     if ( rc1 != 0 )
>     >>>         PERROR("Failed to remove ring page from guest physmap");
>     >>>
>     >>> To leave the frame in the guest physmap.  The issue is fundamentally
>     >>> that after this frame has been taken out, something kicks the VM to
>     >>> realise it has an extra frame of balloonable space, which it clearly
>     >>> compensates for.
>     >>>
>     >>> You can work around the added attack surface by marking it RO in
>     EPT;
>     >>> neither Xen's nor dom0's mappings are translated via EPT, so
>     they can
>     >>> still make updates, but the guest won't be able to write to it.
>     >>>
>     >>> I should say that this is all a gross hack, and is in desperate
>     need of
>     >>> a proper API to make rings entirely outside of the gfn space,
>     but this
>     >>> hack should work for now.
>     >> Thanks! So far, it seems to work like a charm like this:
>     >
>     > Great.
>     >
>     >>
>     >> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
>     >> index 2fef96a..5dd00a6 100644
>     >> --- a/tools/libxc/xc_vm_event.c
>     >> +++ b/tools/libxc/xc_vm_event.c
>     >> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch,
>     domid_t
>     >> domain_id, int param,
>     >>      }
>     >>
>     >>      /* Remove the ring_pfn from the guest's physmap */
>     >> +    /*
>     >>      rc1 = xc_domain_decrease_reservation_exact(xch, domain_id, 1, 0,
>     >> &ring_pfn);
>     >>      if ( rc1 != 0 )
>     >>          PERROR("Failed to remove ring page from guest physmap");
>     >> +    */
>     >> +
>     >> +    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r,
>     mmap_pfn, 1) )
>     >> +    {
>     >> +        PERROR("Could not set ring page read-only\n");
>     >> +        goto out;
>     >> +    }
>     >>
>     >>   out:
>     >>      saved_errno = errno;
>     >>
>     >> Should I send this as a patch for mainline as well?
>     >
>     > Probably a good idea, although I would include a code comment
>     explaining
>     > what is going on, because this is subtle if you don't know the
>     context.
> 
>     Will do, I'll send a patch out as soon as we've done a few more rounds
>     of testing.
> 
> 
> (replying to all): I'm not in favor of this patch mainly because it is
> not stealthy. A malicious kernel could easily track what events are
> being sent on the ring. With DRAKVUF I could work around this using
> altp2m pfn-remapping, but for other tools this can be a serious
> information leak.

I understand your point; however, the alternative is potentially losing
the ability to monitor at all, which is arguably a more severe problem.
_Any_ guest could choose to do what this Ubuntu 16.04 guest does, and
then connecting to the guest via vm_event can only be done once.


Thanks,
Razvan


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 16:34               ` Razvan Cojocaru
@ 2017-01-10 16:40                 ` Tamas K Lengyel
  2017-01-10 17:09                   ` Razvan Cojocaru
  0 siblings, 1 reply; 16+ messages in thread
From: Tamas K Lengyel @ 2017-01-10 16:40 UTC (permalink / raw)
  To: Razvan Cojocaru; +Cc: Andrew Cooper, xen-devel


On Tue, Jan 10, 2017 at 9:34 AM, Razvan Cojocaru <rcojocaru@bitdefender.com>
wrote:

> On 01/10/2017 06:29 PM, Tamas K Lengyel wrote:
> >
> >
> > On Tue, Jan 10, 2017 at 8:35 AM, Razvan Cojocaru
> > <rcojocaru@bitdefender.com <mailto:rcojocaru@bitdefender.com>> wrote:
> >
> >     >>> What I meant was taking out this call:
> >     >>>
> >     >>>     /* Remove the ring_pfn from the guest's physmap */
> >     >>>     rc1 = xc_domain_decrease_reservation_exact(xch, domain_id,
> 1, 0,
> >     >>> &ring_pfn);
> >     >>>     if ( rc1 != 0 )
> >     >>>         PERROR("Failed to remove ring page from guest physmap");
> >     >>>
> >     >>> To leave the frame in the guest physmap.  The issue is
> fundamentally
> >     >>> that after this frame has been taken out, something kicks the VM
> to
> >     >>> realise it has an extra frame of balloonable space, which it
> clearly
> >     >>> compensates for.
> >     >>>
> >     >>> You can work around the added attack surface by marking it RO in
> >     EPT;
> >     >>> neither Xen's nor dom0's mappings are translated via EPT, so
> >     they can
> >     >>> still make updates, but the guest won't be able to write to it.
> >     >>>
> >     >>> I should say that this is all a gross hack, and is in desperate
> >     need of
> >     >>> a proper API to make rings entirely outside of the gfn space,
> >     but this
> >     >>> hack should work for now.
> >     >> Thanks! So far, it seems to work like a charm like this:
> >     >
> >     > Great.
> >     >
> >     >>
> >     >> diff --git a/tools/libxc/xc_vm_event.c b/tools/libxc/xc_vm_event.c
> >     >> index 2fef96a..5dd00a6 100644
> >     >> --- a/tools/libxc/xc_vm_event.c
> >     >> +++ b/tools/libxc/xc_vm_event.c
> >     >> @@ -130,9 +130,17 @@ void *xc_vm_event_enable(xc_interface *xch,
> >     domid_t
> >     >> domain_id, int param,
> >     >>      }
> >     >>
> >     >>      /* Remove the ring_pfn from the guest's physmap */
> >     >> +    /*
> >     >>      rc1 = xc_domain_decrease_reservation_exact(xch, domain_id,
> 1, 0,
> >     >> &ring_pfn);
> >     >>      if ( rc1 != 0 )
> >     >>          PERROR("Failed to remove ring page from guest physmap");
> >     >> +    */
> >     >> +
> >     >> +    if ( xc_set_mem_access(xch, domain_id, XENMEM_access_r,
> >     mmap_pfn, 1) )
> >     >> +    {
> >     >> +        PERROR("Could not set ring page read-only\n");
> >     >> +        goto out;
> >     >> +    }
> >     >>
> >     >>   out:
> >     >>      saved_errno = errno;
> >     >>
> >     >> Should I send this as a patch for mainline as well?
> >     >
> >     > Probably a good idea, although I would include a code comment
> >     explaining
> >     > what is going on, because this is subtle if you don't know the
> >     context.
> >
> >     Will do, I'll send a patch out as soon as we've done a few more
> rounds
> >     of testing.
> >
> >
> > (replying to all): I'm not in favor of this patch mainly because it is
> > not stealthy. A malicious kernel could easily track what events are
> > being sent on the ring. With DRAKVUF I could work around this using
> > altp2m pfn-remapping, but for other tools this is can be a serious
> > information leak.
>
> I understand your point, however the alternative is potential lack of
> availability to monitor which is arguably a more severe problem. _Any_
> guest could choose to do what this Ubuntu 16.04 guest does, and then
> connecting to the guest via vm_event can only be done once.
>

IMHO in that case you should implement your own internal version of this
function to fall back to instead of forcing all tools to go down this path.
The requirements to do that in your own tool are accessible so there is no
need to push that into libxc.

Tamas


* Re: Ubuntu 16.04.1 LTS kernel 4.4.0-57 over-allocation and xen-access fail
  2017-01-10 16:40                 ` Tamas K Lengyel
@ 2017-01-10 17:09                   ` Razvan Cojocaru
  0 siblings, 0 replies; 16+ messages in thread
From: Razvan Cojocaru @ 2017-01-10 17:09 UTC (permalink / raw)
  To: Tamas K Lengyel; +Cc: Andrew Cooper, xen-devel

On 01/10/2017 06:40 PM, Tamas K Lengyel wrote:
>     > (replying to all): I'm not in favor of this patch mainly because it is
>     > not stealthy. A malicious kernel could easily track what events are
>     > being sent on the ring. With DRAKVUF I could work around this using
>     > altp2m pfn-remapping, but for other tools this is can be a serious
>     > altp2m pfn-remapping, but for other tools this can be a serious
> 
>     I understand your point, however the alternative is potential lack of
>     availability to monitor which is arguably a more severe problem. _Any_
>     guest could choose to do what this Ubuntu 16.04 guest does, and then
>     connecting to the guest via vm_event can only be done once.
> 
> 
> IMHO in that case you should implement your own internal version of this
> function to fall back to instead of forcing all tools to go down this
> path. The requirements to do that in your own tool are accessible so
> there is no need to push that into libxc.

Obviously I won't send a patch you're against as co-maintainer of
vm_event. However, looking at my version of xc_vm_event_enable() (from
Xen 4.6), it calls xc_vm_event_control(), which is not publicly
accessible - so going my own way would in this case still require libxc
modifications.

We could also add another parameter to xc_vm_event_enable(), for example
"bool remove_page_from_guest_physmap" to control this behaviour, which
would change the API - or leave the current API alone and add
xc_vm_event_control_remap() or something to that effect.
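
(As a rough sketch of the first option - this is not an existing
prototype, and the trailing port argument is quoted from memory:)

void *xc_vm_event_enable(xc_interface *xch, domid_t domain_id, int param,
                         uint32_t *port,
                         bool remove_page_from_guest_physmap);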

But in any case I don't see how any of this can be achieved with zero
libxc modifications.


Thanks,
Razvan

