From: Stefano Stabellini <sstabellini@kernel.org>
To: Jan Beulich <JBeulich@suse.com>
Cc: Stefano Stabellini <stefanos@xilinx.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Julien Grall <julien.grall@arm.com>,
	206497@studenti.unimore.it,
	xen-devel <xen-devel@lists.xenproject.org>,
	nd@arm.com
Subject: Re: [PATCH] xen/arm: skip first page when RAM starts at 0x0
Date: Thu, 2 May 2019 15:25:22 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.10.1905021517420.4309@sstabellini-ThinkPad-X260> (raw)
In-Reply-To: <5CCAB676020000780022B199@prv1-mh.provo.novell.com>

On Thu, 2 May 2019, Jan Beulich wrote:
> >>> On 02.05.19 at 11:02, <julien.grall@arm.com> wrote:
> > On 5/2/19 8:30 AM, Jan Beulich wrote:
> >>>>> On 02.05.19 at 00:44, <sstabellini@kernel.org> wrote:
> >>> Hi Jan, I have a question on the PDX code.
> >>>
> >>> The PDX initialization is a bit different between x86 and ARM, but it
> >>> follows roughly the same pattern; look at
> >>> xen/arch/x86/srat.c:srat_parse_regions (I take it that is where things
> >>> happen on x86) and xen/arch/arm/setup.c:init_pdx.
> >>>
> >>> The mask is initialized by calling pdx_init_mask on a start address;
> >>> then a loop fills in the rest of the mask by calling pdx_region_mask,
> >>> based on the memory regions present.
> >>>
> >>> I wrote a small unit test of the ARM PDX initialization and while I was
> >>> playing with addresses and values I noticed that actually if I simply
> >>> skip pdx_init_mask and just initialize the mask to 0 (mask = 0) in
> >>> init_pdx, the rest of the function always calculates the right mask.
> >>>
> >>> In fact, there are cases where initializing the mask to a value causes
> >>> the rest of the code to miss some potential compressions, while
> >>> initializing the mask to 0 leads to more optimizations. I can provide
> >>> specific examples if you are curious.
> >>>
> >>> Before I make any changes to that code, I would like to understand a bit
> >>> better why things are done that way today. Do you know why the mask is
> >>> initialized to pdx_init_mask(start-of-ram)?
> > 
> > Well, it is not the start-of-ram on Arm. It is whatever is the start of 
> > bank 0. This is because the firmware table (such as DT) may not require 
> > ordering and we don't order banks in Xen.
> > 
> > So it may be possible the PDX will not compress if the banks are not 
> > ordered in the firmware tables.
> 
> Even more so a reason not to use bank 0's start address.
> 
> >> I'm confused, and hence I'm perhaps misunderstanding your
> >> question. To me it looks like you're re-asking a question already
> >> answered. On x86 we don't want to squash out any of the low
> >> 32 bits, because of the special address ranges that live below
> >> 4Gb. Hence we invoke pdx_init_mask(first-block-at-or-above-4Gb).
> >> Note it's not start-of-ram, as you've said.
> > 
> > I think what Stefano is asking is why pdx_init_mask(...) is invoked with 
> > the first block address rather than 4GB (or even 0, though I don't think 
> > this is right).
> > 
> > By using the first block address, the PDX will not be able to compress 
> > any bits between 0 and the MSB '1' in the first block address. In other 
> > words, for a base address of 0x200000000 (8GB), the initial mask will be 
> > 0x1ffffffff.
> > 
> > Stefano and I were wondering whether it would instead be possible to 
> > create the initial mask with pdx_init_mask(4GB) or pdx_init_mask(1GB) 
> > (i.e. the maximum contiguous range the buddy allocator can allocate).
> 
> That's indeed an option - it's just that I've yet to see an x86
> system where there's a hole starting at 4Gb. What's better in that
> case can probably be judged only once we run into such a case.

All right. Looking at the comment in pfn_pdx_hole_setup, it seems the
intention is to skip the first MAX_ORDER bits, but it actually skips only
the first MAX_ORDER-1 bits, if my calculations are correct.

MAX_ORDER is 18 on ARM, which corresponds to 1GB. With the current
implementation of pfn_pdx_hole_setup, if I pass a mask corresponding to
512MB, I can see "PFN compression on bits 17...19". So the range
512MB-1GB gets compressed.

Shouldn't it be:

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
index 50c21b6..b334eb9 100644
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -81,7 +81,7 @@ void __init pfn_pdx_hole_setup(unsigned long mask)
      * contiguous aligned ranges of 2^MAX_ORDER pages. Among others, our
      * buddy allocator relies on this assumption.
      */
-    for ( j = MAX_ORDER-1; ; )
+    for ( j = MAX_ORDER; ; )
     {
         i = find_next_zero_bit(&mask, BITS_PER_LONG, j);
         j = find_next_bit(&mask, BITS_PER_LONG, i); 


With this change, I don't see pfn_pdx_hole_setup trying to compress bit
17 anymore.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
