xen-devel.lists.xenproject.org archive mirror
From: David Woodhouse <dwmw2@infradead.org>
To: George Dunlap <dunlapg@umich.edu>
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>, "Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Varad Gautam" <vrd@amazon.de>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"Hongyan Xia" <hongyxia@amazon.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	"Paul Durrant" <pdurrant@amazon.co.uk>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH 5/8] xen/vmap: allow vmap() to be called during early boot
Date: Tue, 04 Feb 2020 11:06:46 +0000
Message-ID: <ddaea6f4dfec77aacd42352aca7328310418800e.camel@infradead.org>
In-Reply-To: <CAFLBxZa9oUE8bAOCK0JaDpyOwFSZU-rvwvSf7h=2zzU643oOww@mail.gmail.com>



On Tue, 2020-02-04 at 11:00 +0000, George Dunlap wrote:
> On Mon, Feb 3, 2020 at 4:37 PM David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > On Mon, 2020-02-03 at 14:00 +0000, Julien Grall wrote:
> > > Hi David,
> > > 
> > > On 01/02/2020 00:33, David Woodhouse wrote:
> > > > From: David Woodhouse <dwmw@amazon.co.uk>
> > > 
> > > I am a bit concerned with this change, particularly the consequences this
> > > has for the page-tables. There is an assumption that intermediate
> > > page-tables allocated via the boot allocator will never be freed.
> > > 
> > > On x86, a call to vunmap() will not free page-tables, but a subsequent
> > > call to vmap() may free them depending on the mapping size. So we would
> > > call free_domheap_pages() rather than init_heap_pages().
> > > 
> > > I am not entirely sure what the full consequences are, but I think this
> > > calls for investigation, with a summary written down in the commit message.
> > 
> > This isn't just about page tables, right? It's about *any* allocation
> > given out by the boot allocator, being freed with free_heap_pages() ?
> > 
> > Given the amount of code that has conditionals in both alloc and free
> > paths along the lines of…
> > 
> >   if (system_state > SYS_STATE_boot)
> >       use xenheap
> >   else
> >       use boot allocator
> > 
> > … I'm not sure I'd really trust the assumption that such a thing never
> > happens; that no pages are ever allocated from the boot allocator and
> > then freed into the heap.
> > 
> > In fact it does work fine except for some esoteric corner cases,
> > because init_heap_pages() is mostly just a trivial loop over
> > free_heap_pages().
> > 
> > The corner cases are if you call free_heap_pages() on boot-allocated
> > memory which matches one or more of the following criteria:
> > 
> >  • Includes MFN #0,
> > 
> >  • Includes the first page the heap has seen on a given node, so
> >    init_node_heap() has to be called, or
> > 
> >  • High-order allocations crossing from one node to another.
> 
> I was asked to forward a message relating to MFN 0 and allocations
> crossing zones from a private discussion on the security list:
> 
> 8<---
> 
> > I am having difficulty seeing how invalidating MFN0 would solve the issue here.
> > The zone number for a specific page is calculated from the most significant bit
> > position set in its MFN. As a result, each successive zone contains an order of
> > magnitude more pages. You would need to invalidate the first or last MFN in each
> > zone.
> 
> Because (unless Jan and I are reading the code wrong):
> 
> * Chunks can only be merged such that they end up on order-boundaries.
> * Chunks can only be merged if they are the same order.
> * Zone boundaries are on order boundaries.
> 
> So say you're freeing mfn 0x100, and mfn 0xff is free.  In that loop, (1
> << order) & mfn will always be 0, so it will always only look "forward"
> for things to merge, not backwards.
> 
> Suppose on the other hand, that you're freeing mfn 0x101, and 0x98
> through 0x100 are free.  The loop will look "backwards" and merge with
> 0x100; but then it will look "forwards" again.
> 
> Now suppose you've merged 0x100-0x1ff, and the order moves up to size
> 0x100.  Now the mask becomes 0x1ff; so it can't merge with 0x200-0x2ff
> (which would cross zones); instead it looks backwards to 0x0-0xff.
> 
> We don't think it's possible for things to be merged across zones unless
> it can (say) start at 0xff, and merge all the way back to 0x0; which
> can't be done if 0x0 is never on the free list.
> 
> That's the idea anyway.  That would explain why we've never seen it on
> x86 -- due to the way the architecture is, mfn 0 is never on the free list.
> 
> --->8

Thanks.

I still don't really get it. What if the zone boundary is at MFN 0x300?

What prevents the buddy allocator from merging a range at 0x200-0x2FF
with another from 0x300-0x3FF, creating a single range 0x200-0x3FF
which crosses nodes?
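
To make the question concrete, here is a minimal sketch in C of how a
classic buddy allocator picks a merge partner. It is not Xen's
free_heap_pages(); buddy_of() and merged_base() are hypothetical helpers,
and the only assumption is the usual rule that a free block of a given
order merges with the neighbour whose start MFN differs in exactly bit
"order".

```c
#include <assert.h>

/*
 * Hypothetical helper: start MFN of the buddy of the block starting at mfn.
 * Assumes mfn is aligned to (1UL << order), as in a classic buddy allocator.
 */
static unsigned long buddy_of(unsigned long mfn, unsigned int order)
{
    assert((mfn & ((1UL << order) - 1)) == 0);
    return mfn ^ (1UL << order);
}

/*
 * Hypothetical helper: start MFN of the block of size (1UL << (order + 1))
 * that results from merging the block at mfn with its buddy.
 */
static unsigned long merged_base(unsigned long mfn, unsigned int order)
{
    assert((mfn & ((1UL << order) - 1)) == 0);
    return mfn & ~(1UL << order);
}
```

With order 8 (blocks of 0x100 pages), buddy_of(0x200, 8) is 0x300 and
merged_base(0x300, 8) is 0x200, so under this rule the two ranges are
buddies and the merged block spans 0x200-0x3FF. Likewise
buddy_of(0x100, 8) is 0x0, which is the "looks backwards to 0x0-0xff"
case in the forwarded message.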

The MFN0 trick only works if all zone boundaries must be at an address
which is 2ⁿ, doesn't it? Is that always true?
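
A small model may help frame that question. The forwarded message says a
page's zone is derived from the most significant bit set in its MFN. The
C sketch below models only that statement; it is not Xen's code.
zone_of() and block_in_one_zone() are hypothetical names, and placing
MFN 0 in zone 1 is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: zone = 1-based index of the highest set bit of the MFN.
 * Treating MFN 0 as zone 1 is an assumption of this sketch.
 */
static unsigned int zone_of(unsigned long mfn)
{
    unsigned int bit = 0;

    while (mfn != 0) {
        bit++;
        mfn >>= 1;
    }
    return bit ? bit : 1;
}

/*
 * True if every page of the aligned block [base, base + 2^order) is in one
 * zone. zone_of() never decreases as the MFN grows, so comparing the first
 * and last page of the block is sufficient.
 */
static bool block_in_one_zone(unsigned long base, unsigned int order)
{
    unsigned long last = base + (1UL << order) - 1;

    assert((base & ((1UL << order) - 1)) == 0);
    return zone_of(base) == zone_of(last);
}
```

In this model zone boundaries fall only on powers of two:
block_in_one_zone(0x200, 9) is true because 0x200 and 0x3FF share the
same top bit, so 0x300 is not a zone boundary. block_in_one_zone(0x0, 9)
is false, which is the MFN 0 case. If a boundary (a NUMA node boundary,
say) could sit at an arbitrary MFN such as 0x300, this model would not
cover it.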



