* [PATCH] x86/pvh: fix memory accounting for Dom0
@ 2017-09-28 10:16 Roger Pau Monne
  2017-09-28 13:18 ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monne @ 2017-09-28 10:16 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Boris Ostrovsky, Jan Beulich, Roger Pau Monne

Make sure that the memory for the paging structures in case of a HVM
Dom0 is subtracted from the total amount of memory available for Dom0
to use. Also take into account whether the IOMMU is sharing the
page tables with HAP, or else also reserve some memory for the IOMMU
page tables.

While there re-organize the code slightly so that the for loop and the
need_paging local variable can be removed.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
I've tested booting a PVHv2 Dom0 without dom0_mem set (and
with/without no-sharept), and it works fine. Current code doesn't work
if dom0_mem is not set to a reasonable value.
---
 xen/arch/x86/dom0_build.c | 51 +++++++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index f616b99ddc..3a9aac8aca 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -241,7 +241,6 @@ unsigned long __init dom0_compute_nr_pages(
 {
     nodeid_t node;
     unsigned long avail = 0, nr_pages, min_pages, max_pages;
-    bool need_paging;
 
     for_each_node_mask ( node, dom0_nodes )
         avail += avail_domheap_pages_region(node, 0, 0) +
@@ -263,39 +262,39 @@ unsigned long __init dom0_compute_nr_pages(
             avail -= max_pdx >> s;
     }
 
-    need_paging = is_hvm_domain(d) &&
-        (!iommu_hap_pt_share || !paging_mode_hap(d));
-    for ( ; ; need_paging = false )
-    {
-        nr_pages = dom0_nrpages;
-        min_pages = dom0_min_nrpages;
-        max_pages = dom0_max_nrpages;
+    nr_pages = dom0_nrpages;
+    min_pages = dom0_min_nrpages;
+    max_pages = dom0_max_nrpages;
 
-        /*
-         * If allocation isn't specified, reserve 1/16th of available memory
-         * for things like DMA buffers. This reservation is clamped to a
-         * maximum of 128MB.
-         */
-        if ( nr_pages == 0 )
-            nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
+    /*
+     * If allocation isn't specified, reserve 1/16th of available memory
+     * for things like DMA buffers. This reservation is clamped to a
+     * maximum of 128MB.
+     */
+    if ( nr_pages == 0 )
+        nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
 
-        /* Negative specification means "all memory - specified amount". */
-        if ( (long)nr_pages  < 0 ) nr_pages  += avail;
-        if ( (long)min_pages < 0 ) min_pages += avail;
-        if ( (long)max_pages < 0 ) max_pages += avail;
+    /* Negative specification means "all memory - specified amount". */
+    if ( (long)nr_pages  < 0 ) nr_pages  += avail;
+    if ( (long)min_pages < 0 ) min_pages += avail;
+    if ( (long)max_pages < 0 ) max_pages += avail;
 
-        /* Clamp according to min/max limits and available memory. */
-        nr_pages = max(nr_pages, min_pages);
-        nr_pages = min(nr_pages, max_pages);
-        nr_pages = min(nr_pages, avail);
+    /* Clamp according to min/max limits and available memory. */
+    nr_pages = max(nr_pages, min_pages);
+    nr_pages = min(nr_pages, max_pages);
 
-        if ( !need_paging )
-            break;
+    if ( is_hvm_domain(d) )
+    {
+        unsigned long paging_mem = dom0_paging_pages(d, nr_pages);
 
         /* Reserve memory for shadow or HAP. */
-        avail -= dom0_paging_pages(d, nr_pages);
+        avail -= paging_mem;
+        /* Reserve the same amount for the IOMMU page tables if not shared. */
+        avail -= !iommu_hap_pt_share ? paging_mem : 0;
     }
 
+    nr_pages = min(nr_pages, avail);
+
     if ( is_pv_domain(d) &&
          (parms->p2m_base == UNSET_ADDR) && (dom0_nrpages <= 0) &&
          ((dom0_min_nrpages <= 0) || (nr_pages > min_pages)) )
-- 
2.13.5 (Apple Git-94)



* Re: [PATCH] x86/pvh: fix memory accounting for Dom0
  2017-09-28 10:16 [PATCH] x86/pvh: fix memory accounting for Dom0 Roger Pau Monne
@ 2017-09-28 13:18 ` Jan Beulich
  2017-09-28 15:39   ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2017-09-28 13:18 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

>>> On 28.09.17 at 12:16, <roger.pau@citrix.com> wrote:
> Make sure that the memory for the paging structures in case of a HVM
> Dom0 is subtracted from the total amount of memory available for Dom0
> to use. Also take into account whether the IOMMU is sharing the
> page tables with HAP, or else also reserve some memory for the IOMMU
> page tables.
> 
> While there re-organize the code slightly so that the for loop and the
> need_paging local variable can be removed.

These two things very definitely should not be merged into a single
patch; I'm not convinced the reorg is correct in the first place. Note
how avail, which is being changed in the first iteration of the loop,
feeds back into the second iteration.

> @@ -263,39 +262,39 @@ unsigned long __init dom0_compute_nr_pages(
>              avail -= max_pdx >> s;
>      }
>  
> -    need_paging = is_hvm_domain(d) &&
> -        (!iommu_hap_pt_share || !paging_mode_hap(d));
> -    for ( ; ; need_paging = false )
> -    {
> -        nr_pages = dom0_nrpages;
> -        min_pages = dom0_min_nrpages;
> -        max_pages = dom0_max_nrpages;
> +    nr_pages = dom0_nrpages;
> +    min_pages = dom0_min_nrpages;
> +    max_pages = dom0_max_nrpages;
>  
> -        /*
> -         * If allocation isn't specified, reserve 1/16th of available memory
> -         * for things like DMA buffers. This reservation is clamped to a
> -         * maximum of 128MB.
> -         */
> -        if ( nr_pages == 0 )
> -            nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
> +    /*
> +     * If allocation isn't specified, reserve 1/16th of available memory
> +     * for things like DMA buffers. This reservation is clamped to a
> +     * maximum of 128MB.
> +     */
> +    if ( nr_pages == 0 )
> +        nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
>  
> -        /* Negative specification means "all memory - specified amount". */
> -        if ( (long)nr_pages  < 0 ) nr_pages  += avail;
> -        if ( (long)min_pages < 0 ) min_pages += avail;
> -        if ( (long)max_pages < 0 ) max_pages += avail;
> +    /* Negative specification means "all memory - specified amount". */
> +    if ( (long)nr_pages  < 0 ) nr_pages  += avail;
> +    if ( (long)min_pages < 0 ) min_pages += avail;
> +    if ( (long)max_pages < 0 ) max_pages += avail;
>  
> -        /* Clamp according to min/max limits and available memory. */
> -        nr_pages = max(nr_pages, min_pages);
> -        nr_pages = min(nr_pages, max_pages);
> -        nr_pages = min(nr_pages, avail);
> +    /* Clamp according to min/max limits and available memory. */
> +    nr_pages = max(nr_pages, min_pages);
> +    nr_pages = min(nr_pages, max_pages);
>  
> -        if ( !need_paging )
> -            break;
> +    if ( is_hvm_domain(d) )
> +    {
> +        unsigned long paging_mem = dom0_paging_pages(d, nr_pages);
>  
>          /* Reserve memory for shadow or HAP. */
> -        avail -= dom0_paging_pages(d, nr_pages);
> +        avail -= paging_mem;
> +        /* Reserve the same amount for the IOMMU page tables if not shared. */
> +        avail -= !iommu_hap_pt_share ? paging_mem : 0;

If you account for IOMMU tables here, why don't you delete the
code ahead of what so far was a loop?

Also, why not

        avail -= iommu_hap_pt_share ? 0 : paging_mem;

?

Jan



* Re: [PATCH] x86/pvh: fix memory accounting for Dom0
  2017-09-28 13:18 ` Jan Beulich
@ 2017-09-28 15:39   ` Roger Pau Monné
  2017-09-28 15:55     ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2017-09-28 15:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

On Thu, Sep 28, 2017 at 01:18:55PM +0000, Jan Beulich wrote:
> >>> On 28.09.17 at 12:16, <roger.pau@citrix.com> wrote:
> > Make sure that the memory for the paging structures in case of a HVM
> > Dom0 is subtracted from the total amount of memory available for Dom0
> > to use. Also take into account whether the IOMMU is sharing the
> > page tables with HAP, or else also reserve some memory for the IOMMU
> > page tables.
> > 
> > While there re-organize the code slightly so that the for loop and the
> > need_paging local variable can be removed.
> 
> These two things very definitely should not be merged into a single
> patch; I'm not convinced the reorg is correct in the first place. Note
> how avail, which is being changed in the first iteration of the loop,
> feeds back into the second iteration.

I'm afraid I don't understand why this is done. Also, the second loop
is only going to happen when need_paging is true, which only happens
for HVM guests using shadow or without shared pt with the IOMMU.

AFAICT the original code is wrong (or I don't understand it).

Please bear with me. So let's assume Xen is trying to create a PVH
Dom0 using HAP and sharing page tables with the IOMMU. In this
case need_paging is false, so the following flow will happen:

for ( ; ; need_paging = false )
{
    [nr_pages calculations]

    if ( !need_paging )
        break; <- break

    /* Not reached */
    /* Reserve memory for shadow or HAP. */
    avail -= dom0_paging_pages(d, nr_pages);
}

In this case the reservation of the memory required for paging is
not going to happen.

> > @@ -263,39 +262,39 @@ unsigned long __init dom0_compute_nr_pages(
> >              avail -= max_pdx >> s;
> >      }
> >  
> > -    need_paging = is_hvm_domain(d) &&
> > -        (!iommu_hap_pt_share || !paging_mode_hap(d));
> > -    for ( ; ; need_paging = false )
> > -    {
> > -        nr_pages = dom0_nrpages;
> > -        min_pages = dom0_min_nrpages;
> > -        max_pages = dom0_max_nrpages;
> > +    nr_pages = dom0_nrpages;
> > +    min_pages = dom0_min_nrpages;
> > +    max_pages = dom0_max_nrpages;
> >  
> > -        /*
> > -         * If allocation isn't specified, reserve 1/16th of available memory
> > -         * for things like DMA buffers. This reservation is clamped to a
> > -         * maximum of 128MB.
> > -         */
> > -        if ( nr_pages == 0 )
> > -            nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
> > +    /*
> > +     * If allocation isn't specified, reserve 1/16th of available memory
> > +     * for things like DMA buffers. This reservation is clamped to a
> > +     * maximum of 128MB.
> > +     */
> > +    if ( nr_pages == 0 )
> > +        nr_pages = -min(avail / 16, 128UL << (20 - PAGE_SHIFT));
> >  
> > -        /* Negative specification means "all memory - specified amount". */
> > -        if ( (long)nr_pages  < 0 ) nr_pages  += avail;
> > -        if ( (long)min_pages < 0 ) min_pages += avail;
> > -        if ( (long)max_pages < 0 ) max_pages += avail;
> > +    /* Negative specification means "all memory - specified amount". */
> > +    if ( (long)nr_pages  < 0 ) nr_pages  += avail;
> > +    if ( (long)min_pages < 0 ) min_pages += avail;
> > +    if ( (long)max_pages < 0 ) max_pages += avail;
> >  
> > -        /* Clamp according to min/max limits and available memory. */
> > -        nr_pages = max(nr_pages, min_pages);
> > -        nr_pages = min(nr_pages, max_pages);
> > -        nr_pages = min(nr_pages, avail);
> > +    /* Clamp according to min/max limits and available memory. */
> > +    nr_pages = max(nr_pages, min_pages);
> > +    nr_pages = min(nr_pages, max_pages);
> >  
> > -        if ( !need_paging )
> > -            break;
> > +    if ( is_hvm_domain(d) )
> > +    {
> > +        unsigned long paging_mem = dom0_paging_pages(d, nr_pages);
> >  
> >          /* Reserve memory for shadow or HAP. */
> > -        avail -= dom0_paging_pages(d, nr_pages);
> > +        avail -= paging_mem;
> > +        /* Reserve the same amount for the IOMMU page tables if not shared. */
> > +        avail -= !iommu_hap_pt_share ? paging_mem : 0;
> 
> If you account for IOMMU tables here, why don't you delete the
> code ahead of what so far was a loop?

Oh right, this chunk is incorrect: there's no need to account for the
IOMMU memory here, especially since PV guests also make use of the
IOMMU. As you say, the memory is already accounted for in the chunk
above.
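
For reference, the "chunk above" is the pre-existing rough estimate in
dom0_compute_nr_pages(), which looks more or less like the sketch below
(the exact code may differ slightly):

    /* Rough estimate of IOMMU page-table overhead: one table page per
     * 512 host pages (L1), one per 512^2 (L2), and so on per level. */
    if ( iommu_enabled )
    {
        unsigned int s;

        for ( s = 9; s < BITS_PER_LONG; s += 9 )
            avail -= max_pdx >> s;
    }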

> Also, why not
> 
>         avail -= iommu_hap_pt_share ? 0 : paging_mem;
> 
> ?

Yes, can change to that, it's shorter :).

Thanks, Roger.


* Re: [PATCH] x86/pvh: fix memory accounting for Dom0
  2017-09-28 15:39   ` Roger Pau Monné
@ 2017-09-28 15:55     ` Roger Pau Monné
  2017-10-04  8:42       ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Roger Pau Monné @ 2017-09-28 15:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

On Thu, Sep 28, 2017 at 03:39:08PM +0000, Roger Pau Monné wrote:
> On Thu, Sep 28, 2017 at 01:18:55PM +0000, Jan Beulich wrote:
> > >>> On 28.09.17 at 12:16, <roger.pau@citrix.com> wrote:
> > > Make sure that the memory for the paging structures in case of a HVM
> > > Dom0 is subtracted from the total amount of memory available for Dom0
> > > to use. Also take into account whether the IOMMU is sharing the
> > > page tables with HAP, or else also reserve some memory for the IOMMU
> > > page tables.
> > > 
> > > While there re-organize the code slightly so that the for loop and the
> > > need_paging local variable can be removed.
> > 
> > These two things very definitely should not be merged into a single
> > patch; I'm not convinced the reorg is correct in the first place. Note
> > how avail, which is being changed in the first iteration of the loop,
> > feeds back into the second iteration.
> 
> I'm afraid I don't understand why this is done. Also, the second loop
> is only going to happen when need_paging is true, which only happens
> for HVM guests using shadow or without shared pt with the IOMMU.

OK, I think I'm starting to understand this. The need_paging reservation
is only done if the page tables are not shared, because the iommu_enabled
path has already reserved some memory for page tables, which can be
shared with EPT. I think this is all very confusing, and the
calculations done for the iommu_enabled case are wrong. Xen should use
dom0_paging_pages instead.

I still don't understand the need for the 'for' loop.

Thanks, Roger.


* Re: [PATCH] x86/pvh: fix memory accounting for Dom0
  2017-09-28 15:55     ` Roger Pau Monné
@ 2017-10-04  8:42       ` Jan Beulich
  2017-10-04  9:26         ` Roger Pau Monné
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2017-10-04  8:42 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

>>> On 28.09.17 at 17:55, <roger.pau@citrix.com> wrote:
> On Thu, Sep 28, 2017 at 03:39:08PM +0000, Roger Pau Monné wrote:
>> On Thu, Sep 28, 2017 at 01:18:55PM +0000, Jan Beulich wrote:
>> > >>> On 28.09.17 at 12:16, <roger.pau@citrix.com> wrote:
>> > > Make sure that the memory for the paging structures in case of a HVM
>> > > Dom0 is subtracted from the total amount of memory available for Dom0
>> > > to use. Also take into account whether the IOMMU is sharing the
>> > > page tables with HAP, or else also reserve some memory for the IOMMU
>> > > page tables.
>> > > 
>> > > While there re-organize the code slightly so that the for loop and the
>> > > need_paging local variable can be removed.
>> > 
>> > These two things very definitely should not be merged into a single
>> > patch; I'm not convinced the reorg is correct in the first place. Note
>> > how avail, which is being changed in the first iteration of the loop,
>> > feeds back into the second iteration.
>> 
>> I'm afraid I don't understand why this is done. Also, the second loop
>> is only going to happen when need_paging is true, which only happens
>> for HVM guests using shadow or without shared pt with the IOMMU.
> 
> OK, I think I'm starting to understand this. The need_paging reservation
> is only done if the page tables are not shared, because the iommu_enabled
> path has already reserved some memory for page tables, which can be
> shared with EPT. I think this is all very confusing, and the
> calculations done for the iommu_enabled case are wrong. Xen should use
> dom0_paging_pages instead.

You need to look at the history - the IOMMU part was there before
the paging portion, I think. And yes, as indicated in an earlier reply,
removing the IOMMU part in favor of using dom0_paging_pages()
(and doubling the value when not sharing page tables) would seem
preferable.
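
Something along these lines, I suppose (just a sketch of the idea, not a
tested change):

    if ( is_hvm_domain(d) )
    {
        /* Reserve memory for the paging structures, and the same amount
         * again for the IOMMU page tables when they are not shared. */
        avail -= dom0_paging_pages(d, nr_pages) *
                 (iommu_hap_pt_share ? 1 : 2);
    }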

> I still don't understand the need for the 'for' loop.

The second iteration of the loop uses a shrunk "avail" resulting
from the first iteration, to make sure the remaining amount of
memory comes reasonably close to what was requested on the
command line. This matters specifically when quite a bit of
memory is needed for the page tables.
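
To illustrate with made-up round numbers (not taken from a real machine):

    /* Iteration 1: avail = 4096M and dom0_mem is not set, so
     *   nr_pages = avail - min(avail / 16, 128M) = 3968M;
     * need_paging is true, so
     *   avail -= dom0_paging_pages(d, 3968M);   // say ~32M
     * Iteration 2: need_paging is now false; nr_pages is recomputed
     * against the shrunk avail (4064M) and clamped to it, so the page
     * table memory comes out of Dom0's share instead of memory Xen
     * still needs for itself. */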

Jan


* Re: [PATCH] x86/pvh: fix memory accounting for Dom0
  2017-10-04  8:42       ` Jan Beulich
@ 2017-10-04  9:26         ` Roger Pau Monné
  0 siblings, 0 replies; 6+ messages in thread
From: Roger Pau Monné @ 2017-10-04  9:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Boris Ostrovsky, xen-devel

On Wed, Oct 04, 2017 at 08:42:33AM +0000, Jan Beulich wrote:
> >>> On 28.09.17 at 17:55, <roger.pau@citrix.com> wrote:
> > OK, I think I'm starting to understand this. The need_paging reservation
> > is only done if the page tables are not shared, because the iommu_enabled
> > path has already reserved some memory for page tables, which can be
> > shared with EPT. I think this is all very confusing, and the
> > calculations done for the iommu_enabled case are wrong. Xen should use
> > dom0_paging_pages instead.
> 
> You need to look at the history - the IOMMU part was there before
> the paging portion, I think. And yes, as indicated in an earlier reply,
> removing the IOMMU part in favor of using dom0_paging_pages()
> (and doubling the value when not sharing page tables) would seem
> preferable.

I've done this in the new version that I posted.

> > I still don't understand the need for the 'for' loop.
> 
> The second iteration of the loop uses a shrunk "avail" resulting
> from the first iteration, to make sure the remaining amount of
> memory comes reasonably close to what was requested on the
> command line. This matters specifically when quite a bit of
> memory is needed for the page tables.

Right, in the new series I was able to get rid of the loop because I
use max_pdx to calculate the amount of memory needed for the IOMMU and
p2m page-tables.
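
Something like the following sketch, for illustration only (not the exact
code from the new series):

    /*
     * Size the p2m and IOMMU page tables from the host memory map
     * (max_pdx) rather than from nr_pages, so the estimate no longer
     * depends on the value being computed and no loop is needed.
     */
    if ( is_hvm_domain(d) )
    {
        unsigned long paging = dom0_paging_pages(d, max_pdx);

        avail -= paging;
        if ( !iommu_hap_pt_share )
            avail -= paging;
    }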

Thanks, Roger.

