On Wed, 2020-02-05 at 10:22 +0000, Julien Grall wrote: > Hi, > > On 05/02/2020 09:50, David Woodhouse wrote: > > On Tue, 2020-02-04 at 15:37 +0000, George Dunlap wrote: > > > At very least it's more robust this way; the algorithm is also less > > > "magic". We just had a long discussion this morning trying to re-create > > > the logic for why "Remove MFN 0" was sufficient to prevent this issue, > > > and even then David wasn't sure it was correct at first. > > > > Right. So the real reason I'm staring hard at this is because I can't > > convince myself there aren't places where memory allocated by the boot > > allocator is later freed with free_xenheap_pages(). > > > > We have a few pieces of code which decide whether to use the boot > > allocator vs. heap based on system_state >= SYS_STATE_boot, and *if* > > the rule is "thou shalt not allocate with boot allocator and free > > later" then it's *extremely* fragile and probably being violated — > > especially because it actually *works* most of the time, except in some > > esoteric corner cases like MFN#0, boot allocations which cross > > zones/regions, etc. > > > > So because we want to make that *more* likely by allowing vmap() to > > happen earlier, I'd like to clean things up by addressing those corner > > cases and making it unconditionally OK to free boot-allocated pages > > into the heap. > > > > I think might be as simple as checking for (first_pg)->count_info == 0 > > in free_xenheap_pages(). That's quick enough, and if the count_info is > > zero then I think it does indicate a boot-allocated page, because pages > > from alloc_xenheap_pages() would have PGC_xen_heap set? > > > > It would suffice just to pass such pages to init_heap_pages() instead > > of directly to free_heap_pages(), I think. Julien? > > > > The straw man version of that looks a bit like this... > > > > --- a/xen/common/page_alloc.c > > +++ b/xen/common/page_alloc.c > > @@ -2304,6 +2304,12 @@ void free_xenheap_pages(void *v, unsigned int order) > > > > pg = virt_to_page(v); > > > > + /* Pages from the boot allocator need to pass through init_heap_pages() */ > > + if ( unlikely(!pg->count_info) ) > > Note that there is two versions of free_xenheap_pages(). This one only > cover the case where the domheap and xenheap are the same. > > But you can't use the same trick when xenheap is separated (like on > Arm32) because PGC_xen_heap is not set. So you would be calling > init_heap_pages() everytime. Right. We'd want to set PGC_xen_heap there too on the corresponding pages. > However, I don't like the idea of relying on count_info == 0. Indeed, > there are valid case where count_info == 0 because it means the page is > inuse (PCC_state_inuse). For xenheap pages, not just domheap pages? > It might be best if we introduce a new page state that would be the > default value set in the frametable. ... and which is easily set with memset() so it's fast, which I think you considered since you were suggesting all 0xFF, but which we haven't explciitly stated out loud. The other thing to note is that it's easy to make a default state for pages which *weren't* given out by the boot allocator, but we don't have a simple way to do anything to the pages which *were* given out by the boot allocator, since we don't track those and — fairly much by definition — we don't have the frametable yet when we start the boot allocator.