On Wed, 24 Nov 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko
>
> The main reason for this change is that unpopulated-alloc
> code cannot be used in its current form on Arm, but there
> is a desire to reuse it to avoid wasting real RAM pages
> for the grant/foreign mappings.
>
> The problem is that system "iomem_resource" is used for
> the address space allocation, but the really unallocated
> space can't be figured out precisely by the domain on Arm
> without hypervisor involvement. For example, not all device
> I/O regions are known by the time domain starts creating
> grant/foreign mappings. And following the advice from
> "iomem_resource" we might end up reusing these regions by
> mistake. So, the hypervisor which maintains the P2M for
> the domain is in the best position to provide unused regions
> of guest physical address space which could be safely used
> to create grant/foreign mappings.
>
> Introduce new helper arch_xen_unpopulated_init() whose purpose
> is to create a specific Xen resource based on the memory regions
> provided by the hypervisor to be used as unused space for Xen
> scratch pages. If arch doesn't define arch_xen_unpopulated_init()
> the default "iomem_resource" will be used.
>
> Update the arguments list of allocate_resource() in fill_list()
> to always allocate a region from the hotpluggable range
> (maximum possible addressable physical memory range for which
> the linear mapping could be created). If arch doesn't define
> arch_get_mappable_range() the default range (0,-1) will be used.
>
> The behaviour on x86 won't be changed by current patch as both
> arch_xen_unpopulated_init() and arch_get_mappable_range()
> are not implemented for it.
>
> Also fall back to allocating xenballooned pages (balloon out RAM
> pages) if we do not have any suitable resource to work with
> (target_resource is invalid) and as a result we won't be able
> to provide unpopulated pages on request.
>
> Signed-off-by: Oleksandr Tyshchenko

Reviewed-by: Stefano Stabellini

> ---
> Please note the following:
> for V3 arch_xen_unpopulated_init() was moved to init() as was agreed
> and gained __init specifier. So the target_resource is initialized there.
>
> With current patch series applied if CONFIG_XEN_UNPOPULATED_ALLOC
> is enabled:
>
> 1. On Arm, under normal circumstances, the xen_alloc_unpopulated_pages()
> won't be called "before" arch_xen_unpopulated_init(). It will only be
> called "before" when either ACPI is in use or something wrong happened
> with DT (and we failed to read xen_grant_frames), so we fall back to
> xen_xlate_map_ballooned_pages() in arm/xen/enlighten.c:xen_guest_init(),
> please see "arm/xen: Switch to use gnttab_setup_auto_xlat_frames() for DT"
> for details. But in that case, I think, it doesn't matter much whether
> xen_alloc_unpopulated_pages() is called "before" or "after" target_resource
> initialization, as we don't have extended regions in place the target_resource
> will remain invalid even after initialization, so xen_alloc_ballooned_pages()
> will be used in both scenarios.
>
> 2. On x86, I am not quite sure which modes use unpopulated-alloc (PVH?),
> but it looks like xen_alloc_unpopulated_pages() can (and will) be called
> "before" arch_xen_unpopulated_init().
> At least, I see that xen_xlate_map_ballooned_pages() is called in
> x86/xen/grant-table.c:xen_pvh_gnttab_setup(). According to the initcall
> levels for both xen_pvh_gnttab_setup() and init() I expect the former
> to be called earlier.
> If it is true, the sentence in the commit description which mentions
> that "behaviour on x86 is not changed" is not precise. I don't think
> it would be correct to fall back to xen_alloc_ballooned_pages() just
> because we haven't initialized target_resource yet (on x86 it is just
> assigning it iomem_resource), at least this doesn't look like an expected
> behaviour and would unlikely be welcome.
>
> I am wondering whether it would be better to move arch_xen_unpopulated_init()
> to a dedicated init() marked with an appropriate initcall level (early_initcall?)
> to make sure it will always be called *before* xen_xlate_map_ballooned_pages().
> What do you think?
>
> Changes RFC -> V2:
>    - new patch, instead of
>      "[RFC PATCH 2/2] xen/unpopulated-alloc: Query hypervisor to provide unallocated space"
>
> Changes V2 -> V3:
>    - update patch description and comments in code
>    - modify arch_xen_unpopulated_init() to pass target_resource as an argument
>      and update default helper to assign iomem_resource to it, also drop
>      xen_resource as it will be located in arch code in the future
>    - allocate region from hotpluggable range instead of hardcoded range (0,-1)
>      in fill_list()
>    - use %pR specifier in error message
>    - do not call unpopulated_init() at runtime from xen_alloc_unpopulated_pages(),
>      drop an extra helper and call arch_xen_unpopulated_init() directly from __init()
>    - include linux/ioport.h instead of forward declaration of struct resource
>    - replace insert_resource() with request_resource() in fill_list()
>    - add __init specifier to arch_xen_unpopulated_init()
> ---
>  drivers/xen/unpopulated-alloc.c | 83 +++++++++++++++++++++++++++++++++++++----
>  include/xen/xen.h               |  2 +
>  2 files changed, 78 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
> index a03dc5b..07d3578 100644
> --- a/drivers/xen/unpopulated-alloc.c
> +++ b/drivers/xen/unpopulated-alloc.c
> @@ -8,6 +8,7 @@
>
>  #include
>
> +#include
>  #include
>  #include
>
> @@ -15,13 +16,29 @@ static DEFINE_MUTEX(list_lock);
>  static struct page *page_list;
>  static unsigned int list_count;
>
> +static struct resource *target_resource;
> +
> +/*
> + * If arch is not happy with system "iomem_resource" being used for
> + * the region allocation it can provide its own view by creating a specific
> + * Xen resource with unused regions of guest physical address space provided
> + * by the hypervisor.
> + */
> +int __weak __init arch_xen_unpopulated_init(struct resource **res)
> +{
> +	*res = &iomem_resource;
> +
> +	return 0;
> +}
> +
>  static int fill_list(unsigned int nr_pages)
>  {
>  	struct dev_pagemap *pgmap;
> -	struct resource *res;
> +	struct resource *res, *tmp_res = NULL;
>  	void *vaddr;
>  	unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
> -	int ret = -ENOMEM;
> +	struct range mhp_range;
> +	int ret;
>
>  	res = kzalloc(sizeof(*res), GFP_KERNEL);
>  	if (!res)
> @@ -30,14 +47,40 @@ static int fill_list(unsigned int nr_pages)
>  	res->name = "Xen scratch";
>  	res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>
> -	ret = allocate_resource(&iomem_resource, res,
> -				alloc_pages * PAGE_SIZE, 0, -1,
> +	mhp_range = mhp_get_pluggable_range(true);
> +
> +	ret = allocate_resource(target_resource, res,
> +				alloc_pages * PAGE_SIZE, mhp_range.start, mhp_range.end,
>  				PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
>  	if (ret < 0) {
>  		pr_err("Cannot allocate new IOMEM resource\n");
>  		goto err_resource;
>  	}
>
> +	/*
> +	 * Reserve the region previously allocated from Xen resource to avoid
> +	 * re-using it by someone else.
> +	 */
> +	if (target_resource != &iomem_resource) {
> +		tmp_res = kzalloc(sizeof(*tmp_res), GFP_KERNEL);
> +		if (!tmp_res) {
> +			ret = -ENOMEM;
> +			goto err_insert;
> +		}
> +
> +		tmp_res->name = res->name;
> +		tmp_res->start = res->start;
> +		tmp_res->end = res->end;
> +		tmp_res->flags = res->flags;
> +
> +		ret = request_resource(&iomem_resource, tmp_res);
> +		if (ret < 0) {
> +			pr_err("Cannot request resource %pR (%d)\n", tmp_res, ret);
> +			kfree(tmp_res);
> +			goto err_insert;
> +		}
> +	}
> +
>  	pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
>  	if (!pgmap) {
>  		ret = -ENOMEM;
> @@ -95,6 +138,11 @@ static int fill_list(unsigned int nr_pages)
>  err_memremap:
>  	kfree(pgmap);
>  err_pgmap:
> +	if (tmp_res) {
> +		release_resource(tmp_res);
> +		kfree(tmp_res);
> +	}
> +err_insert:
>  	release_resource(res);
>  err_resource:
>  	kfree(res);
> @@ -112,6 +160,14 @@ int xen_alloc_unpopulated_pages(unsigned int nr_pages, struct page **pages)
>  	unsigned int i;
>  	int ret = 0;
>
> +	/*
> +	 * Fall back to default behavior if we do not have any suitable resource
> +	 * to allocate required region from and as a result we won't be able to
> +	 * construct pages.
> +	 */
> +	if (!target_resource)
> +		return xen_alloc_ballooned_pages(nr_pages, pages);
> +
>  	mutex_lock(&list_lock);
>  	if (list_count < nr_pages) {
>  		ret = fill_list(nr_pages - list_count);
> @@ -159,6 +215,11 @@ void xen_free_unpopulated_pages(unsigned int nr_pages, struct page **pages)
>  {
>  	unsigned int i;
>
> +	if (!target_resource) {
> +		xen_free_ballooned_pages(nr_pages, pages);
> +		return;
> +	}
> +
>  	mutex_lock(&list_lock);
>  	for (i = 0; i < nr_pages; i++) {
>  		pages[i]->zone_device_data = page_list;
> @@ -169,9 +230,11 @@ void xen_free_unpopulated_pages(unsigned int nr_pages, struct page **pages)
>  }
>  EXPORT_SYMBOL(xen_free_unpopulated_pages);
>
> -#ifdef CONFIG_XEN_PV
>  static int __init init(void)
>  {
> +	int ret;
> +
> +#ifdef CONFIG_XEN_PV
>  	unsigned int i;
>
>  	if (!xen_domain())
> @@ -196,8 +259,14 @@ static int __init init(void)
>  			list_count++;
>  		}
>  	}
> +#endif
>
> -	return 0;
> +	ret = arch_xen_unpopulated_init(&target_resource);
> +	if (ret) {
> +		pr_err("xen:unpopulated: Cannot initialize target resource\n");
> +		target_resource = NULL;
> +	}
> +
> +	return ret;
>  }
>  subsys_initcall(init);
> -#endif
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index 86c5b37..a99bab8 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -55,6 +55,8 @@ extern u64 xen_saved_max_mem_size;
>  #ifdef CONFIG_XEN_UNPOPULATED_ALLOC
>  int xen_alloc_unpopulated_pages(unsigned int nr_pages, struct page **pages);
>  void xen_free_unpopulated_pages(unsigned int nr_pages, struct page **pages);
> +#include
> +int arch_xen_unpopulated_init(struct resource **res);
>  #else
>  #include
>  static inline int xen_alloc_unpopulated_pages(unsigned int nr_pages,
> --
> 2.7.4
>
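
For readers following the ordering question in the notes above, here is a minimal userspace sketch of the fallback decision, not the kernel code itself. It assumes a stand-in struct resource, and the names arch_init_default(), arch_init_failing(), choose_alloc_path() and unpopulated_init() are hypothetical helpers introduced only for illustration. It models just two things from the patch: target_resource stays NULL until init runs (or is reset to NULL when the arch hook fails), and the allocator picks the ballooned path whenever it is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct resource. */
struct resource {
	const char *name;
};

static struct resource iomem_resource = { "iomem" };

/* NULL until init succeeds, mirroring target_resource in the patch. */
static struct resource *target_resource;

/* Models the __weak default: hand back the system-wide resource. */
static int arch_init_default(struct resource **res)
{
	*res = &iomem_resource;
	return 0;
}

/* Models an arch hook that fails and leaves *res untouched. */
static int arch_init_failing(struct resource **res)
{
	(void)res;
	return -1;
}

enum alloc_path { ALLOC_BALLOONED, ALLOC_UNPOPULATED };

/* Same check as at the top of xen_alloc_unpopulated_pages(). */
static enum alloc_path choose_alloc_path(void)
{
	return target_resource ? ALLOC_UNPOPULATED : ALLOC_BALLOONED;
}

/* Same shape as the tail of init(): reset the pointer on failure. */
static int unpopulated_init(int (*arch_init)(struct resource **))
{
	int ret;

	assert(arch_init != NULL);
	ret = arch_init(&target_resource);
	if (ret)
		target_resource = NULL;
	return ret;
}
```

In this model, calling choose_alloc_path() before unpopulated_init() selects the ballooned path even when the default hook would later succeed, which is the x86 ordering concern raised in note 2. Running the hook from an earlier initcall level, as proposed above, would close that window.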