From: Oleksandr <olekstysh@gmail.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien@xen.org>,
xen-devel@lists.xenproject.org,
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
Henry Wang <Henry.Wang@arm.com>,
Bertrand Marquis <bertrand.marquis@arm.com>,
Wei Chen <Wei.Chen@arm.com>
Subject: Re: [PATCH V2 2/3] xen/arm: Add handling of extended regions for Dom0
Date: Thu, 23 Sep 2021 20:44:30 +0300 [thread overview]
Message-ID: <28503e09-44c3-f623-bb8d-8778bb94225f@gmail.com> (raw)
In-Reply-To: <alpine.DEB.2.21.2109230931110.17979@sstabellini-ThinkPad-T480s>
On 23.09.21 19:38, Stefano Stabellini wrote:
Hi Stefano
> On Thu, 23 Sep 2021, Oleksandr wrote:
>> On 18.09.21 19:59, Oleksandr wrote:
>>>>>> +#define EXT_REGION_END 0x80003fffffffULL
>>>>>> +
>>>>>> +static int __init find_unallocated_memory(const struct kernel_info
>>>>>> *kinfo,
>>>>>> + struct meminfo
>>>>>> *ext_regions)
>>>>>> +{
>>>>>> + const struct meminfo *assign_mem = &kinfo->mem;
>>>>>> + struct rangeset *unalloc_mem;
>>>>>> + paddr_t start, end;
>>>>>> + unsigned int i;
>>>>>> + int res;
>>>>> We technically already know which range of memory is unused. This is
>>>>> pretty much any region in the freelist of the page allocator. So how
>>>>> about walking the freelist instead?
>>>> ok, I will investigate the page allocator code (right now I have no
>>>> understanding of how to do that). BTW, I have just grepped "freelist"
>>>> through the code and all page context related appearances are in x86 code
>>>> only.
>>>>
>>>>> The advantage is we don't need to worry about modifying the function
>>>>> when adding new memory type.
>>>>>
>>>>> One disavantage is this will not cover *all* the unused memory as this
>>>>> is doing. But I think this is an acceptable downside.
>>> I did some investigations and create test patch. Although, I am not 100%
>>> sure this is exactly what you meant, but I will provide results anyway.
>>>
>>> 1. Below the extended regions (unallocated memory, regions >=64MB )
>>> calculated by my initial method (bootinfo.mem - kinfo->mem -
>>> bootinfo.reserved_mem - kinfo->gnttab):
>>>
>>> (XEN) Extended region 0: 0x48000000->0x54000000
>>> (XEN) Extended region 1: 0x57000000->0x60000000
>>> (XEN) Extended region 2: 0x70000000->0x78000000
>>> (XEN) Extended region 3: 0x78200000->0xc0000000
>>> (XEN) Extended region 4: 0x500000000->0x580000000
>>> (XEN) Extended region 5: 0x600000000->0x680000000
>>> (XEN) Extended region 6: 0x700000000->0x780000000
>>>
>>> 2. Below the extended regions (unallocated memory, regions >=64MB)
>>> calculated by new method (free memory in page allocator):
>>>
>>> (XEN) Extended region 0: 0x48000000->0x54000000
>>> (XEN) Extended region 1: 0x58000000->0x60000000
>>> (XEN) Extended region 2: 0x70000000->0x78000000
>>> (XEN) Extended region 3: 0x78200000->0x84000000
>>> (XEN) Extended region 4: 0x86000000->0x8a000000
>>> (XEN) Extended region 5: 0x8c200000->0xc0000000
>>> (XEN) Extended region 6: 0x500000000->0x580000000
>>> (XEN) Extended region 7: 0x600000000->0x680000000
>>> (XEN) Extended region 8: 0x700000000->0x765e00000
>>>
>>> Some thoughts regarding that.
>>>
>>> 1. A few ranges below 4GB are absent in resulting extended regions. I
>>> assume, this is because of the modules:
>>>
>>> (XEN) Checking for initrd in /chosen
>>> (XEN) Initrd 0000000084000040-0000000085effc48
>>> (XEN) RAM: 0000000048000000 - 00000000bfffffff
>>> (XEN) RAM: 0000000500000000 - 000000057fffffff
>>> (XEN) RAM: 0000000600000000 - 000000067fffffff
>>> (XEN) RAM: 0000000700000000 - 000000077fffffff
>>> (XEN)
>>> (XEN) MODULE[0]: 0000000078080000 - 00000000781d74c8 Xen
>>> (XEN) MODULE[1]: 0000000057fe7000 - 0000000057ffd080 Device Tree
>>> (XEN) MODULE[2]: 0000000084000040 - 0000000085effc48 Ramdisk
>>> (XEN) MODULE[3]: 000000008a000000 - 000000008c000000 Kernel
>>> (XEN) MODULE[4]: 000000008c000000 - 000000008c010000 XSM
>>> (XEN) RESVD[0]: 0000000084000040 - 0000000085effc48
>>> (XEN) RESVD[1]: 0000000054000000 - 0000000056ffffff
>>>
>>> 2. Also, it worth mentioning that relatively large chunk (~417MB) of memory
>>> above 4GB is absent (to be precise, at the end of last RAM bank), which I
>>> assume, used for Xen internals.
>>> We could really use it for extended regions.
>>> Below free regions in the heap (for last RAM bank) just in case:
>>>
>>> (XEN) heap[node=0][zone=23][order=5] 0x00000765ec0000-0x00000765ee0000
>>> (XEN) heap[node=0][zone=23][order=6] 0x00000765e80000-0x00000765ec0000
>>> (XEN) heap[node=0][zone=23][order=7] 0x00000765e00000-0x00000765e80000
>>> (XEN) heap[node=0][zone=23][order=9] 0x00000765c00000-0x00000765e00000
>>> (XEN) heap[node=0][zone=23][order=10] 0x00000765800000-0x00000765c00000
>>> (XEN) heap[node=0][zone=23][order=11] 0x00000765000000-0x00000765800000
>>> (XEN) heap[node=0][zone=23][order=12] 0x00000764000000-0x00000765000000
>>> (XEN) heap[node=0][zone=23][order=14] 0x00000760000000-0x00000764000000
>>> (XEN) heap[node=0][zone=23][order=17] 0x00000740000000-0x00000760000000
>>> (XEN) heap[node=0][zone=23][order=18] 0x00000540000000-0x00000580000000
>>> (XEN) heap[node=0][zone=23][order=18] 0x00000500000000-0x00000540000000
>>> (XEN) heap[node=0][zone=23][order=18] 0x00000640000000-0x00000680000000
>>> (XEN) heap[node=0][zone=23][order=18] 0x00000600000000-0x00000640000000
>>> (XEN) heap[node=0][zone=23][order=18] 0x00000700000000-0x00000740000000
>>>
>>> Yes, you already pointed out this disadvantage, so if it is an acceptable
>>> downside, I am absolutely OK.
>>>
>>>
>>> 3. Common code updates. There is a question how to properly make a
>>> connection between common allocator internals and Arm's code for creating
>>> DT. I didn’t come up with anything better
>>> than creating for_each_avail_page() for invoking a callback with page and
>>> its order.
>>>
>>> **********
>>>
>>> Below the proposed changes on top of the initial patch, would this be
>>> acceptable in general?
>>>
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index 523eb19..1e58fc5 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -753,16 +753,33 @@ static int __init add_ext_regions(unsigned long s,
>>> unsigned long e, void *data)
>>> return 0;
>>> }
>>>
>>> +static int __init add_unalloc_mem(struct page_info *page, unsigned int
>>> order,
>>> + void *data)
>>> +{
>>> + struct rangeset *unalloc_mem = data;
>>> + paddr_t start, end;
>>> + int res;
>>> +
>>> + start = page_to_maddr(page);
>>> + end = start + pfn_to_paddr(1UL << order);
>>> + res = rangeset_add_range(unalloc_mem, start, end - 1);
>>> + if ( res )
>>> + {
>>> + printk(XENLOG_ERR "Failed to add: %#"PRIx64"->%#"PRIx64"\n",
>>> + start, end);
>>> + return res;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> #define EXT_REGION_START 0x40000000ULL
>>> #define EXT_REGION_END 0x80003fffffffULL
>>>
>>> -static int __init find_unallocated_memory(const struct kernel_info *kinfo,
>>> - struct meminfo *ext_regions)
>>> +static int __init find_unallocated_memory(struct meminfo *ext_regions)
>>> {
>>> - const struct meminfo *assign_mem = &kinfo->mem;
>>> struct rangeset *unalloc_mem;
>>> paddr_t start, end;
>>> - unsigned int i;
>>> int res;
>>>
>>> dt_dprintk("Find unallocated memory for extended regions\n");
>>> @@ -771,59 +788,9 @@ static int __init find_unallocated_memory(const struct
>>> kernel_info *kinfo,
>>> if ( !unalloc_mem )
>>> return -ENOMEM;
>>>
>>> - /* Start with all available RAM */
>>> - for ( i = 0; i < bootinfo.mem.nr_banks; i++ )
>>> - {
>>> - start = bootinfo.mem.bank[i].start;
>>> - end = bootinfo.mem.bank[i].start + bootinfo.mem.bank[i].size;
>>> - res = rangeset_add_range(unalloc_mem, start, end - 1);
>>> - if ( res )
>>> - {
>>> - printk(XENLOG_ERR "Failed to add: %#"PRIx64"->%#"PRIx64"\n",
>>> - start, end);
>>> - goto out;
>>> - }
>>> - }
>>> -
>>> - /* Remove RAM assigned to Dom0 */
>>> - for ( i = 0; i < assign_mem->nr_banks; i++ )
>>> - {
>>> - start = assign_mem->bank[i].start;
>>> - end = assign_mem->bank[i].start + assign_mem->bank[i].size;
>>> - res = rangeset_remove_range(unalloc_mem, start, end - 1);
>>> - if ( res )
>>> - {
>>> - printk(XENLOG_ERR "Failed to remove: %#"PRIx64"->%#"PRIx64"\n",
>>> - start, end);
>>> - goto out;
>>> - }
>>> - }
>>> -
>>> - /* Remove reserved-memory regions */
>>> - for ( i = 0; i < bootinfo.reserved_mem.nr_banks; i++ )
>>> - {
>>> - start = bootinfo.reserved_mem.bank[i].start;
>>> - end = bootinfo.reserved_mem.bank[i].start +
>>> - bootinfo.reserved_mem.bank[i].size;
>>> - res = rangeset_remove_range(unalloc_mem, start, end - 1);
>>> - if ( res )
>>> - {
>>> - printk(XENLOG_ERR "Failed to remove: %#"PRIx64"->%#"PRIx64"\n",
>>> - start, end);
>>> - goto out;
>>> - }
>>> - }
>>> -
>>> - /* Remove grant table region */
>>> - start = kinfo->gnttab_start;
>>> - end = kinfo->gnttab_start + kinfo->gnttab_size;
>>> - res = rangeset_remove_range(unalloc_mem, start, end - 1);
>>> + res = for_each_avail_page(add_unalloc_mem, unalloc_mem);
>>> if ( res )
>>> - {
>>> - printk(XENLOG_ERR "Failed to remove: %#"PRIx64"->%#"PRIx64"\n",
>>> - start, end);
>>> goto out;
>>> - }
>>>
>>> start = EXT_REGION_START;
>>> end = min((1ULL << p2m_ipa_bits) - 1, EXT_REGION_END);
>>> @@ -840,8 +807,7 @@ out:
>>> return res;
>>> }
>>>
>>> -static int __init find_memory_holes(const struct kernel_info *kinfo,
>>> - struct meminfo *ext_regions)
>>> +static int __init find_memory_holes(struct meminfo *ext_regions)
>>> {
>>> struct dt_device_node *np;
>>> struct rangeset *mem_holes;
>>> @@ -961,9 +927,9 @@ static int __init make_hypervisor_node(struct domain *d,
>>> else
>>> {
>>> if ( !is_iommu_enabled(d) )
>>> - res = find_unallocated_memory(kinfo, ext_regions);
>>> + res = find_unallocated_memory(ext_regions);
>>> else
>>> - res = find_memory_holes(kinfo, ext_regions);
>>> + res = find_memory_holes(ext_regions);
>>>
>>> if ( res )
>>> printk(XENLOG_WARNING "Failed to allocate extended regions\n");
>>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>>> index 8fad139..7cd1020 100644
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -1572,6 +1572,40 @@ static int reserve_heap_page(struct page_info *pg)
>>>
>>> }
>>>
>>> +/* TODO heap_lock? */
>>> +int for_each_avail_page(int (*cb)(struct page_info *, unsigned int, void
>>> *),
>>> + void *data)
>>> +{
>>> + unsigned int node, zone, order;
>>> + int ret;
>>> +
>>> + for ( node = 0; node < MAX_NUMNODES; node++ )
>>> + {
>>> + if ( !avail[node] )
>>> + continue;
>>> +
>>> + for ( zone = 0; zone < NR_ZONES; zone++ )
>>> + {
>>> + for ( order = 0; order <= MAX_ORDER; order++ )
>>> + {
>>> + struct page_info *head, *tmp;
>>> +
>>> + if ( page_list_empty(&heap(node, zone, order)) )
>>> + continue;
>>> +
>>> + page_list_for_each_safe ( head, tmp, &heap(node, zone,
>>> order) )
>>> + {
>>> + ret = cb(head, order, data);
>>> + if ( ret )
>>> + return ret;
>>> + }
>>> + }
>>> + }
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> int offline_page(mfn_t mfn, int broken, uint32_t *status)
>>> {
>>> unsigned long old_info = 0;
>>> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
>>> index 667f9da..64dd3e2 100644
>>> --- a/xen/include/xen/mm.h
>>> +++ b/xen/include/xen/mm.h
>>> @@ -123,6 +123,9 @@ unsigned int online_page(mfn_t mfn, uint32_t *status);
>>> int offline_page(mfn_t mfn, int broken, uint32_t *status);
>>> int query_page_offline(mfn_t mfn, uint32_t *status);
>>>
>>> +int for_each_avail_page(int (*cb)(struct page_info *, unsigned int, void
>>> *),
>>> + void *data);
>>> +
>>> void heap_init_late(void);
>>>
>>> int assign_pages(
>>
>> I am sorry, but may I please clarify regarding that? Whether we will go this
>> new direction (free memory in page allocator) or leave things as they are
>> (bootinfo.mem - kinfo->mem - bootinfo.reserved_mem - kinfo->gnttab). This is
>> only one still unclear moment to me in current patch before preparing V3.
> I think both approaches are fine. Your original approach leads to better
> results in terms of extended regions but the difference is not drastic.
> The original approach requires more code (bad) but probably less CPU
> cycles (good).
>
> Personally I am fine either way but as Julien was the one to provide
> feedback on this it would be best to get his opinion.
>
> But in the meantime I think it is OK to send a v3 so that we can review
> the rest.
OK, thank you for the clarification.
I am also fine either way, I just wanted to know which one to pick.
Anyway, I think I will be able to make updates later on.
--
Regards,
Oleksandr Tyshchenko
next prev parent reply other threads:[~2021-09-23 17:44 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-10 18:18 [PATCH V2 0/3] Add handling of extended regions (safe ranges) on Arm (Was "xen/memory: Introduce a hypercall to provide unallocated space") Oleksandr Tyshchenko
2021-09-10 18:18 ` [PATCH V2 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo Oleksandr Tyshchenko
2021-09-16 14:49 ` Jan Beulich
2021-09-16 15:43 ` Oleksandr
2021-09-16 15:47 ` Jan Beulich
2021-09-16 16:05 ` Oleksandr
2021-09-10 18:18 ` [PATCH V2 2/3] xen/arm: Add handling of extended regions for Dom0 Oleksandr Tyshchenko
2021-09-14 0:55 ` Stefano Stabellini
2021-09-15 19:10 ` Oleksandr
2021-09-15 21:21 ` Stefano Stabellini
2021-09-16 20:57 ` Oleksandr
2021-09-16 21:30 ` Stefano Stabellini
2021-09-17 7:28 ` Oleksandr
2021-09-17 14:08 ` Oleksandr
2021-09-17 15:52 ` Julien Grall
2021-09-17 20:13 ` Oleksandr
2021-09-17 15:48 ` Julien Grall
2021-09-17 19:51 ` Oleksandr
2021-09-17 21:56 ` Stefano Stabellini
2021-09-17 22:37 ` Stefano Stabellini
2021-09-19 14:34 ` Julien Grall
2021-09-19 20:18 ` Oleksandr
2021-09-20 23:21 ` Stefano Stabellini
2021-09-21 18:14 ` Oleksandr
2021-09-21 22:00 ` Stefano Stabellini
2021-09-22 18:25 ` Oleksandr
2021-09-22 20:50 ` Stefano Stabellini
2021-09-23 10:10 ` Oleksandr
2021-09-20 23:55 ` Stefano Stabellini
2021-09-21 19:43 ` Oleksandr
2021-09-22 18:18 ` Oleksandr
2021-09-22 21:05 ` Stefano Stabellini
2021-09-23 10:11 ` Oleksandr
2021-09-18 16:59 ` Oleksandr
2021-09-23 10:41 ` Oleksandr
2021-09-23 16:38 ` Stefano Stabellini
2021-09-23 17:44 ` Oleksandr [this message]
2021-09-19 14:00 ` Julien Grall
2021-09-19 17:59 ` Oleksandr
2021-09-10 18:18 ` [PATCH V2 3/3] libxl/arm: Add handling of extended regions for DomU Oleksandr Tyshchenko
2021-09-16 22:35 ` Stefano Stabellini
2021-09-20 20:07 ` Oleksandr
2021-09-21 17:35 ` Oleksandr
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=28503e09-44c3-f623-bb8d-8778bb94225f@gmail.com \
--to=olekstysh@gmail.com \
--cc=Henry.Wang@arm.com \
--cc=Volodymyr_Babchuk@epam.com \
--cc=Wei.Chen@arm.com \
--cc=bertrand.marquis@arm.com \
--cc=julien@xen.org \
--cc=oleksandr_tyshchenko@epam.com \
--cc=sstabellini@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).