xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
@ 2019-08-09 12:14 Julien Grall
  2019-08-09 13:38 ` Jan Beulich
  2019-08-09 17:55 ` Stefano Stabellini
  0 siblings, 2 replies; 9+ messages in thread
From: Julien Grall @ 2019-08-09 12:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Julien Grall, Jan Beulich, Stewart.Hildebrand,
	Jarvis.Roach

Combining of buddies happens only such that the resulting larger buddy
is still order-aligned. To cross a zone boundary while merging, the
implication is that both the buddy [0, 2^n-1] and the buddy
[2^n, 2^(n+1)] are free.

Ideally we want to fix the allocator, but for now we can just prevent
adding the MFN 0 in the allocator.

On x86, the MFN 0 is already kept away from the buddy allocator. So the
bug can only happen on Arm platform where the first memory bank is
starting at 0.

As this is a specific to the allocator, the MFN 0 is removed in the common code
to cater all the architectures (current and future).

Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>

---
Cc: Jarvis.Roach@dornerworks.com
Cc: Stewart.Hildebrand@dornerworks.com

    I am not sure I fully understood the exact problem when MFN 0 is
    given to the allocator. Feel free to suggest a better explanation.

    Can anyone able to reproduce the bug test where the patch
    effectively solve the crash?
---
 xen/common/page_alloc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 453b303b5b..42b8a8ce23 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1759,6 +1759,18 @@ static void init_heap_pages(
     bool idle_scrub = false;
 
     /*
+     * Keep MFN 0 away from the buddy allocator to avoid crossing zone
+     * boundary when merging two buddies.
+     */
+    if ( !mfn_x(page_to_mfn(pg)) )
+    {
+        if ( nr_pages-- <= 1 )
+            return;
+        pg++;
+    }
+
+
+    /*
      * Some pages may not go through the boot allocator (e.g reserved
      * memory at boot but released just after --- kernel, initramfs,
      * etc.).
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 12:14 [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator Julien Grall
@ 2019-08-09 13:38 ` Jan Beulich
  2019-08-09 18:15   ` Stewart Hildebrand
  2019-08-09 17:55 ` Stefano Stabellini
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2019-08-09 13:38 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Stewart.Hildebrand, xen-devel, Jarvis.Roach

On 09.08.2019 14:14, Julien Grall wrote:
> Combining of buddies happens only such that the resulting larger buddy
> is still order-aligned. To cross a zone boundary while merging, the
> implication is that both the buddy [0, 2^n-1] and the buddy
> [2^n, 2^(n+1)] are free.

[2^n, 2^(n+1)-1]

You may want to add that merging across zone boundaries is what we
need to prevent.

> Ideally we want to fix the allocator, but for now we can just prevent
> adding the MFN 0 in the allocator.
> 
> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> bug can only happen on Arm platform where the first memory bank is
> starting at 0.
> 
> As this is a specific to the allocator, the MFN 0 is removed in the common code
> to cater all the architectures (current and future).
> 
> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 12:14 [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator Julien Grall
  2019-08-09 13:38 ` Jan Beulich
@ 2019-08-09 17:55 ` Stefano Stabellini
  1 sibling, 0 replies; 9+ messages in thread
From: Stefano Stabellini @ 2019-08-09 17:55 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Jan Beulich, Stewart.Hildebrand, xen-devel,
	Jarvis.Roach

On Fri, 9 Aug 2019, Julien Grall wrote:
> Combining of buddies happens only such that the resulting larger buddy
> is still order-aligned. To cross a zone boundary while merging, the
> implication is that both the buddy [0, 2^n-1] and the buddy
> [2^n, 2^(n+1)] are free.
> 
> Ideally we want to fix the allocator, but for now we can just prevent
> adding the MFN 0 in the allocator.
> 
> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> bug can only happen on Arm platform where the first memory bank is
> starting at 0.
> 
> As this is a specific to the allocator, the MFN 0 is removed in the common code
> to cater all the architectures (current and future).
> 
> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

I could reproduce the problem and I confirm that this patch fixes it:

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>



> ---
> Cc: Jarvis.Roach@dornerworks.com
> Cc: Stewart.Hildebrand@dornerworks.com
> 
>     I am not sure I fully understood the exact problem when MFN 0 is
>     given to the allocator. Feel free to suggest a better explanation.
> 
>     Can anyone able to reproduce the bug test where the patch
>     effectively solve the crash?
> ---
>  xen/common/page_alloc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 453b303b5b..42b8a8ce23 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1759,6 +1759,18 @@ static void init_heap_pages(
>      bool idle_scrub = false;
>  
>      /*
> +     * Keep MFN 0 away from the buddy allocator to avoid crossing zone
> +     * boundary when merging two buddies.
> +     */
> +    if ( !mfn_x(page_to_mfn(pg)) )
> +    {
> +        if ( nr_pages-- <= 1 )
> +            return;
> +        pg++;
> +    }
> +
> +
> +    /*
>       * Some pages may not go through the boot allocator (e.g reserved
>       * memory at boot but released just after --- kernel, initramfs,
>       * etc.).
> -- 
> 2.11.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 13:38 ` Jan Beulich
@ 2019-08-09 18:15   ` Stewart Hildebrand
  2019-08-09 18:23     ` Stefano Stabellini
  2019-08-12  6:01     ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Stewart Hildebrand @ 2019-08-09 18:15 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, xen-devel, Jarvis Roach

On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
>On 09.08.2019 14:14, Julien Grall wrote:
>> Combining of buddies happens only such that the resulting larger buddy
>> is still order-aligned. To cross a zone boundary while merging, the
>> implication is that both the buddy [0, 2^n-1] and the buddy
>> [2^n, 2^(n+1)] are free.
>
>[2^n, 2^(n+1)-1]
>
>You may want to add that merging across zone boundaries is what we
>need to prevent.
>
>> Ideally we want to fix the allocator, but for now we can just prevent
>> adding the MFN 0 in the allocator.
>>
>> On x86, the MFN 0 is already kept away from the buddy allocator. So the
>> bug can only happen on Arm platform where the first memory bank is
>> starting at 0.
>>
>> As this is a specific to the allocator, the MFN 0 is removed in the common code
>> to cater all the architectures (current and future).
>>
>> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>
>Reviewed-by: Jan Beulich <jbeulich@suse.com>

Here is Jeff's initial patch for the issue.

From: Jeff Kubascik <jeff.kubascik@dornerworks.com>
Date: Mon, 4 Mar 2019 14:14:05 -0500
Subject: [PATCH] Check zone before merging adjacent blocks in heap

The Xen heap is split up into nodes and zones. Each node + zone is
managed as a separate pool of memory.

When returning pages to the heap, free_heap_pages will check adjacent
blocks to see if they can be combined into a larger block. However, the
zone of the adjacent block is not checked. This results in blocks that
migrate from one zone to another.

When a block migrates to the adjacent zone, the avail counters for the
old and new node + zone is not updated accordingly. The avail counter
is used when allocating pages to determine whether to skip over a zone.
With this behavior, it is possible for free pages to collect in a zone
with the avail counter smaller than the actual page count, resulting
in free pages that are not allocable.

This commit adds a check to compare the adjacent block's zone with the
current zone before merging them.

Signed-off-by: Jeff Kubascik <Jeff.Kubascik@dornerworks.com>
Tested-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
---
 xen/common/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 482f0988f7..a92268cc67 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1419,6 +1419,7 @@ static void free_heap_pages(
             if ( !mfn_valid(page_to_mfn(predecessor)) ||
                  !page_state_is(predecessor, free) ||
                  (PFN_ORDER(predecessor) != order) ||
+                 (page_to_zone(pg-mask) != zone) ||
                  (phys_to_nid(page_to_maddr(predecessor)) != node) )
                 break;
 
@@ -1442,6 +1443,7 @@ static void free_heap_pages(
             if ( !mfn_valid(page_to_mfn(successor)) ||
                  !page_state_is(successor, free) ||
                  (PFN_ORDER(successor) != order) ||
+                 (page_to_zone(pg+mask) != zone) ||
                  (phys_to_nid(page_to_maddr(successor)) != node) )
                 break;
 
-- 
2.22.0

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 18:15   ` Stewart Hildebrand
@ 2019-08-09 18:23     ` Stefano Stabellini
  2019-08-09 18:34       ` Stewart Hildebrand
  2019-08-12  6:01     ` Jan Beulich
  1 sibling, 1 reply; 9+ messages in thread
From: Stefano Stabellini @ 2019-08-09 18:23 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Julien Grall, Jan Beulich, xen-devel, Jarvis Roach

On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
> >On 09.08.2019 14:14, Julien Grall wrote:
> >> Combining of buddies happens only such that the resulting larger buddy
> >> is still order-aligned. To cross a zone boundary while merging, the
> >> implication is that both the buddy [0, 2^n-1] and the buddy
> >> [2^n, 2^(n+1)] are free.
> >
> >[2^n, 2^(n+1)-1]
> >
> >You may want to add that merging across zone boundaries is what we
> >need to prevent.
> >
> >> Ideally we want to fix the allocator, but for now we can just prevent
> >> adding the MFN 0 in the allocator.
> >>
> >> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> >> bug can only happen on Arm platform where the first memory bank is
> >> starting at 0.
> >>
> >> As this is a specific to the allocator, the MFN 0 is removed in the common code
> >> to cater all the architectures (current and future).
> >>
> >> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> >> Signed-off-by: Julien Grall <julien.grall@arm.com>
> >
> >Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Here is Jeff's initial patch for the issue.

I committed Julien's patch for now, but if we need to make any changes
or decide for a better alternative, we can always revert it.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 18:23     ` Stefano Stabellini
@ 2019-08-09 18:34       ` Stewart Hildebrand
  2019-08-09 18:35         ` Julien Grall
  2019-08-12  6:16         ` Jan Beulich
  0 siblings, 2 replies; 9+ messages in thread
From: Stewart Hildebrand @ 2019-08-09 18:34 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jeff Kubascik, Tim Deegan, Julien Grall,
	Jan Beulich, xen-devel, Jarvis Roach

On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>> Here is Jeff's initial patch for the issue.
>
>I committed Julien's patch for now,

Great! Thanks!

>but if we need to make any changes
>or decide for a better alternative, we can always revert it.

Can we entertain committing both patches?
To paraphrase George from an earlier discussion: Removing MFN 0 fixes the issue by relying on side effects. Adding an explicit check in the merging logic directly fixes the issue.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 18:34       ` Stewart Hildebrand
@ 2019-08-09 18:35         ` Julien Grall
  2019-08-12  6:16         ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Julien Grall @ 2019-08-09 18:35 UTC (permalink / raw)
  To: Stewart Hildebrand, Stefano Stabellini
  Cc: Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper,
	Ian Jackson, Jeff Kubascik, Tim Deegan, Jan Beulich, xen-devel,
	Jarvis Roach

Hi Stewart,

On 09/08/2019 19:34, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>> On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>>> Here is Jeff's initial patch for the issue.
>>
>> I committed Julien's patch for now,
> 
> Great! Thanks!
> 
>> but if we need to make any changes
>> or decide for a better alternative, we can always revert it.
> 
> Can we entertain committing both patches?

That's you to post formally the patch on the ML ;). The way you posted is likely 
going to be unnoticed as you hijack the discussion (subject title as not be 
changed).

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 18:15   ` Stewart Hildebrand
  2019-08-09 18:23     ` Stefano Stabellini
@ 2019-08-12  6:01     ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2019-08-12  6:01 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, AndrewCooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Julien Grall, xen-devel, Jarvis Roach

On 09.08.2019 20:15, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
>> On 09.08.2019 14:14, Julien Grall wrote:
>>> Combining of buddies happens only such that the resulting larger buddy
>>> is still order-aligned. To cross a zone boundary while merging, the
>>> implication is that both the buddy [0, 2^n-1] and the buddy
>>> [2^n, 2^(n+1)] are free.
>>
>> [2^n, 2^(n+1)-1]
>>
>> You may want to add that merging across zone boundaries is what we
>> need to prevent.
>>
>>> Ideally we want to fix the allocator, but for now we can just prevent
>>> adding the MFN 0 in the allocator.
>>>
>>> On x86, the MFN 0 is already kept away from the buddy allocator. So the
>>> bug can only happen on Arm platform where the first memory bank is
>>> starting at 0.
>>>
>>> As this is a specific to the allocator, the MFN 0 is removed in the common code
>>> to cater all the architectures (current and future).
>>>
>>> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Here is Jeff's initial patch for the issue.

To be honest, it would have been nice if you had clarified _why_
you sent this in reply here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator
  2019-08-09 18:34       ` Stewart Hildebrand
  2019-08-09 18:35         ` Julien Grall
@ 2019-08-12  6:16         ` Jan Beulich
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2019-08-12  6:16 UTC (permalink / raw)
  To: Stewart Hildebrand
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, AndrewCooper, Ian Jackson, Jeff Kubascik,
	Tim Deegan, Julien Grall, xen-devel, Jarvis Roach

On 09.08.2019 20:34, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>> On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>>> Here is Jeff's initial patch for the issue.
>>
>> I committed Julien's patch for now,
> 
> Great! Thanks!
> 
>> but if we need to make any changes
>> or decide for a better alternative, we can always revert it.
> 
> Can we entertain committing both patches?
> To paraphrase George from an earlier discussion: Removing MFN 0 fixes the issue by relying on side effects. Adding an explicit check in the merging logic directly fixes the issue.
> 

I thought previous discussion (when you had first posted you variant
of the fix) had clarified that there are objections to you modifying
an often executed code path when the same effect can be achieved by
modifying an infrequently executed one.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-08-12  6:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-09 12:14 [Xen-devel] [PATCH] xen/page_alloc: Keep away MFN 0 from the buddy allocator Julien Grall
2019-08-09 13:38 ` Jan Beulich
2019-08-09 18:15   ` Stewart Hildebrand
2019-08-09 18:23     ` Stefano Stabellini
2019-08-09 18:34       ` Stewart Hildebrand
2019-08-09 18:35         ` Julien Grall
2019-08-12  6:16         ` Jan Beulich
2019-08-12  6:01     ` Jan Beulich
2019-08-09 17:55 ` Stefano Stabellini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).