* [PATCH v3 0/3] Cleanup and fixups for vmemmap handling
@ 2021-02-04 13:43 Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 1/3] x86/vmemmap: Drop handling of 4K unaligned vmemmap range Oscar Salvador
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Oscar Salvador @ 2021-02-04 13:43 UTC (permalink / raw)
To: akpm
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H . Peter Anvin, Michal Hocko, linux-mm, linux-kernel,
Oscar Salvador
Hi,
This series contains cleanups to remove dead code that handles
unaligned cases for 4K and 1GB pages (patch#1 and patch#2) when
removing the vmemmap range, and a fix (patch#3) to handle the case
when two vmemmap ranges intersect a PMD.
More details can be found in the respective changelogs.
v2 -> v3:
- Make sure we do not clear the PUD entry in case
we are not removing the whole range.
- Add Reviewed-by
v1 -> v2:
- Remove dead code in remove_pud_table as well
- Addressed feedback by David
- Place the vmemmap functions that take care of unaligned PMDs
within CONFIG_SPARSEMEM_VMEMMAP
Oscar Salvador (3):
x86/vmemmap: Drop handling of 4K unaligned vmemmap range
x86/vmemmap: Drop handling of 1GB vmemmap ranges
x86/vmemmap: Handle unpopulated sub-pmd ranges
arch/x86/mm/init_64.c | 189 ++++++++++++++++++++++++++----------------
1 file changed, 118 insertions(+), 71 deletions(-)
--
2.26.2
* [PATCH v3 1/3] x86/vmemmap: Drop handling of 4K unaligned vmemmap range
2021-02-04 13:43 [PATCH v3 0/3] Cleanup and fixups for vmemmap handling Oscar Salvador
@ 2021-02-04 13:43 ` Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges Oscar Salvador
2 siblings, 0 replies; 7+ messages in thread
From: Oscar Salvador @ 2021-02-04 13:43 UTC (permalink / raw)
To: akpm
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H . Peter Anvin, Michal Hocko, linux-mm, linux-kernel,
Oscar Salvador
remove_pte_table() is prepared to handle the case where either the
start or the end of the range is not PAGE_SIZE aligned.
This cannot actually happen:
__populate_section_memmap enforces the range to be PMD aligned,
so as long as the size of a struct page remains a multiple of 8,
the vmemmap range will be PAGE_SIZE aligned.
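(To make the arithmetic explicit, assuming 4K base pages and 2 MiB PMDs:
a PMD-aligned memory range covers a multiple of 512 base pages, so its
vmemmap spans a multiple of 512 * sizeof(struct page) bytes; as long as
sizeof(struct page) is a multiple of 8, that is a multiple of
512 * 8 = 4096 bytes, i.e. PAGE_SIZE aligned.)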
Drop the dead code and place a VM_BUG_ON in vmemmap_{populate,free}
to catch nasty cases.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
arch/x86/mm/init_64.c | 48 ++++++++++++-------------------------------
1 file changed, 13 insertions(+), 35 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b5a3fa4033d3..b0e1d215c83e 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -962,7 +962,6 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
{
unsigned long next, pages = 0;
pte_t *pte;
- void *page_addr;
phys_addr_t phys_addr;
pte = pte_start + pte_index(addr);
@@ -983,42 +982,15 @@ remove_pte_table(pte_t *pte_start, unsigned long addr, unsigned long end,
if (phys_addr < (phys_addr_t)0x40000000)
return;
- if (PAGE_ALIGNED(addr) && PAGE_ALIGNED(next)) {
- /*
- * Do not free direct mapping pages since they were
- * freed when offlining, or simplely not in use.
- */
- if (!direct)
- free_pagetable(pte_page(*pte), 0);
-
- spin_lock(&init_mm.page_table_lock);
- pte_clear(&init_mm, addr, pte);
- spin_unlock(&init_mm.page_table_lock);
+ if (!direct)
+ free_pagetable(pte_page(*pte), 0);
- /* For non-direct mapping, pages means nothing. */
- pages++;
- } else {
- /*
- * If we are here, we are freeing vmemmap pages since
- * direct mapped memory ranges to be freed are aligned.
- *
- * If we are not removing the whole page, it means
- * other page structs in this page are being used and
- * we canot remove them. So fill the unused page_structs
- * with 0xFD, and remove the page when it is wholly
- * filled with 0xFD.
- */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pte_page(*pte));
- if (!memchr_inv(page_addr, PAGE_INUSE, PAGE_SIZE)) {
- free_pagetable(pte_page(*pte), 0);
+ spin_lock(&init_mm.page_table_lock);
+ pte_clear(&init_mm, addr, pte);
+ spin_unlock(&init_mm.page_table_lock);
- spin_lock(&init_mm.page_table_lock);
- pte_clear(&init_mm, addr, pte);
- spin_unlock(&init_mm.page_table_lock);
- }
- }
+ /* For non-direct mapping, pages means nothing. */
+ pages++;
}
/* Call free_pte_table() in remove_pmd_table(). */
@@ -1197,6 +1169,9 @@ remove_pagetable(unsigned long start, unsigned long end, bool direct,
void __ref vmemmap_free(unsigned long start, unsigned long end,
struct vmem_altmap *altmap)
{
+ VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+ VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
remove_pagetable(start, end, false, altmap);
}
@@ -1556,6 +1531,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
{
int err;
+ VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
+ VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
+
if (end - start < PAGES_PER_SECTION * sizeof(struct page))
err = vmemmap_populate_basepages(start, end, node, NULL);
else if (boot_cpu_has(X86_FEATURE_PSE))
--
2.26.2
* [PATCH v3 2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges
2021-02-04 13:43 [PATCH v3 0/3] Cleanup and fixups for vmemmap handling Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 1/3] x86/vmemmap: Drop handling of 4K unaligned vmemmap range Oscar Salvador
@ 2021-02-04 13:43 ` Oscar Salvador
2021-02-04 16:06 ` David Hildenbrand
2021-02-04 13:43 ` [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges Oscar Salvador
2 siblings, 1 reply; 7+ messages in thread
From: Oscar Salvador @ 2021-02-04 13:43 UTC (permalink / raw)
To: akpm
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H . Peter Anvin, Michal Hocko, linux-mm, linux-kernel,
Oscar Salvador
We never get to allocate 1GB pages when mapping the vmemmap range:
vmemmap mappings are only ever created with 4K or 2MB pages.
Drop the dead code both for the aligned and unaligned cases and leave
only the direct map handling.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
---
arch/x86/mm/init_64.c | 35 +++++++----------------------------
1 file changed, 7 insertions(+), 28 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b0e1d215c83e..9ecb3c488ac8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
unsigned long next, pages = 0;
pmd_t *pmd_base;
pud_t *pud;
- void *page_addr;
pud = pud_start + pud_index(addr);
for (; addr < end; addr = next, pud++) {
@@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
if (!pud_present(*pud))
continue;
- if (pud_large(*pud)) {
- if (IS_ALIGNED(addr, PUD_SIZE) &&
- IS_ALIGNED(next, PUD_SIZE)) {
- if (!direct)
- free_pagetable(pud_page(*pud),
- get_order(PUD_SIZE));
-
- spin_lock(&init_mm.page_table_lock);
- pud_clear(pud);
- spin_unlock(&init_mm.page_table_lock);
- pages++;
- } else {
- /* If here, we are freeing vmemmap pages. */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pud_page(*pud));
- if (!memchr_inv(page_addr, PAGE_INUSE,
- PUD_SIZE)) {
- free_pagetable(pud_page(*pud),
- get_order(PUD_SIZE));
-
- spin_lock(&init_mm.page_table_lock);
- pud_clear(pud);
- spin_unlock(&init_mm.page_table_lock);
- }
- }
-
+ if (pud_large(*pud) &&
+ IS_ALIGNED(addr, PUD_SIZE) &&
+ IS_ALIGNED(next, PUD_SIZE)) {
+ spin_lock(&init_mm.page_table_lock);
+ pud_clear(pud);
+ spin_unlock(&init_mm.page_table_lock);
+ pages++;
continue;
}
--
2.26.2
* [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges
2021-02-04 13:43 [PATCH v3 0/3] Cleanup and fixups for vmemmap handling Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 1/3] x86/vmemmap: Drop handling of 4K unaligned vmemmap range Oscar Salvador
2021-02-04 13:43 ` [PATCH v3 2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges Oscar Salvador
@ 2021-02-04 13:43 ` Oscar Salvador
2021-02-05 8:26 ` David Hildenbrand
2021-02-05 18:05 ` kernel test robot
2 siblings, 2 replies; 7+ messages in thread
From: Oscar Salvador @ 2021-02-04 13:43 UTC (permalink / raw)
To: akpm
Cc: David Hildenbrand, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
H . Peter Anvin, Michal Hocko, linux-mm, linux-kernel,
Oscar Salvador
When the size of a memory section's vmemmap is not a multiple of 2MB
(i.e. when sizeof(struct page) is not a multiple of 64 bytes), sections
no longer span whole PMDs, so populating them leaves parts of the PMD
unused.
Because of this, PMDs will be left behind when depopulating sections,
since remove_pmd_table() thinks that those unused parts are still in
use.
Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv()
will do the right thing and will let us free the PMD when the last user
of it is gone.
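(As a hypothetical example: if sizeof(struct page) were 56 bytes, a
128 MiB section would need 32768 * 56 bytes = 1792 KiB of vmemmap, so
the memmaps of two neighboring sections end up sharing a 2 MiB PMD.
Marking the tail of the first section's memmap PAGE_UNUSED on removal
is what lets remove_pmd_table() notice, once the second section is gone
as well, that the whole PMD can be freed.)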
This patch is based on a similar patch by David Hildenbrand:
https://lore.kernel.org/linux-mm/20200722094558.9828-9-david@redhat.com/
https://lore.kernel.org/linux-mm/20200722094558.9828-10-david@redhat.com/
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
arch/x86/mm/init_64.c | 106 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 98 insertions(+), 8 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9ecb3c488ac8..7e8de63f02b3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -871,7 +871,93 @@ int arch_add_memory(int nid, u64 start, u64 size,
return add_pages(nid, start_pfn, nr_pages, params);
}
-#define PAGE_INUSE 0xFD
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#define PAGE_UNUSED 0xFD
+
+/*
+ * The unused vmemmap range, which was not yet memset(PAGE_UNUSED) ranges
+ * from unused_pmd_start to next PMD_SIZE boundary.
+ */
+static unsigned long unused_pmd_start __meminitdata;
+
+static void __meminit vmemmap_flush_unused_pmd(void)
+{
+ if (!unused_pmd_start)
+ return;
+ /*
+ * Clears (unused_pmd_start, PMD_END]
+ */
+ memset((void *)unused_pmd_start, PAGE_UNUSED,
+ ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
+ unused_pmd_start = 0;
+}
+
+/* Returns true if the PMD is completely unused and thus it can be freed */
+static bool __meminit vmemmap_unuse_sub_pmd(unsigned long addr, unsigned long end)
+{
+ unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
+
+ vmemmap_flush_unused_pmd();
+ memset((void *)addr, PAGE_UNUSED, end - addr);
+
+ return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
+}
+
+static void __meminit __vmemmap_use_sub_pmd(unsigned long start)
+{
+ /*
+ * As we expect to add in the same granularity as we remove, it's
+ * sufficient to mark only some piece used to block the memmap page from
+ * getting removed when removing some other adjacent memmap (just in
+ * case the first memmap never gets initialized e.g., because the memory
+ * block never gets onlined).
+ */
+ memset((void *)start, 0, sizeof(struct page));
+}
+
+static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
+{
+ /*
+ * We only optimize if the new used range directly follows the
+ * previously unused range (esp., when populating consecutive sections).
+ */
+ if (unused_pmd_start == start) {
+ if (likely(IS_ALIGNED(end, PMD_SIZE)))
+ unused_pmd_start = 0;
+ else
+ unused_pmd_start = end;
+ return;
+ }
+
+ vmemmap_flush_unused_pmd();
+ __vmemmap_use_sub_pmd(start);
+}
+
+static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
+{
+ vmemmap_flush_unused_pmd();
+
+ /*
+ * Could be our memmap page is filled with PAGE_UNUSED already from a
+ * previous remove.
+ */
+ __vmemmap_use_sub_pmd(start);
+
+ /*
+ * Mark the unused parts of the new memmap range
+ */
+ if (!IS_ALIGNED(start, PMD_SIZE))
+ memset((void *)start, PAGE_UNUSED,
+ start - ALIGN_DOWN(start, PMD_SIZE));
+ /*
+ * We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
+ * consecutive sections. Remember for the last added PMD the last
+ * unused range in the populated PMD.
+ */
+ if (!IS_ALIGNED(end, PMD_SIZE))
+ unused_pmd_start = end;
+}
+#endif
static void __meminit free_pagetable(struct page *page, int order)
{
@@ -1006,7 +1092,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
unsigned long next, pages = 0;
pte_t *pte_base;
pmd_t *pmd;
- void *page_addr;
pmd = pmd_start + pmd_index(addr);
for (; addr < end; addr = next, pmd++) {
@@ -1027,12 +1112,11 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
spin_unlock(&init_mm.page_table_lock);
pages++;
} else {
- /* If here, we are freeing vmemmap pages. */
- memset((void *)addr, PAGE_INUSE, next - addr);
-
- page_addr = page_address(pmd_page(*pmd));
- if (!memchr_inv(page_addr, PAGE_INUSE,
- PMD_SIZE)) {
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+ /*
+ * Free the PMD if the whole range is unused.
+ */
+ if (vmemmap_unuse_sub_pmd(addr, next)) {
free_hugepage_table(pmd_page(*pmd),
altmap);
@@ -1040,6 +1124,7 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
pmd_clear(pmd);
spin_unlock(&init_mm.page_table_lock);
}
+#endif
}
continue;
@@ -1492,11 +1577,16 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
addr_end = addr + PMD_SIZE;
p_end = p + PMD_SIZE;
+
+ if (!IS_ALIGNED(addr, PMD_SIZE) ||
+ !IS_ALIGNED(next, PMD_SIZE))
+ vmemmap_use_new_sub_pmd(addr, next);
continue;
} else if (altmap)
return -ENOMEM; /* no fallback */
} else if (pmd_large(*pmd)) {
vmemmap_verify((pte_t *)pmd, node, addr, next);
+ vmemmap_use_sub_pmd(addr, next);
continue;
}
if (vmemmap_populate_basepages(addr, next, node, NULL))
--
2.26.2
* Re: [PATCH v3 2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges
2021-02-04 13:43 ` [PATCH v3 2/3] x86/vmemmap: Drop handling of 1GB vmemmap ranges Oscar Salvador
@ 2021-02-04 16:06 ` David Hildenbrand
0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-04 16:06 UTC (permalink / raw)
To: Oscar Salvador, akpm
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin, Michal Hocko,
linux-mm, linux-kernel
On 04.02.21 14:43, Oscar Salvador wrote:
> We never get to allocate 1GB pages when mapping the vmemmap range.
> Drop the dead code both for the aligned and unaligned cases and leave
> only the direct map handling.
>
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Suggested-by: David Hildenbrand <david@redhat.com>
> ---
> arch/x86/mm/init_64.c | 35 +++++++----------------------------
> 1 file changed, 7 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index b0e1d215c83e..9ecb3c488ac8 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1062,7 +1062,6 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
> unsigned long next, pages = 0;
> pmd_t *pmd_base;
> pud_t *pud;
> - void *page_addr;
>
> pud = pud_start + pud_index(addr);
> for (; addr < end; addr = next, pud++) {
> @@ -1071,33 +1070,13 @@ remove_pud_table(pud_t *pud_start, unsigned long addr, unsigned long end,
> if (!pud_present(*pud))
> continue;
>
> - if (pud_large(*pud)) {
> - if (IS_ALIGNED(addr, PUD_SIZE) &&
> - IS_ALIGNED(next, PUD_SIZE)) {
> - if (!direct)
> - free_pagetable(pud_page(*pud),
> - get_order(PUD_SIZE));
> -
> - spin_lock(&init_mm.page_table_lock);
> - pud_clear(pud);
> - spin_unlock(&init_mm.page_table_lock);
> - pages++;
> - } else {
> - /* If here, we are freeing vmemmap pages. */
> - memset((void *)addr, PAGE_INUSE, next - addr);
> -
> - page_addr = page_address(pud_page(*pud));
> - if (!memchr_inv(page_addr, PAGE_INUSE,
> - PUD_SIZE)) {
> - free_pagetable(pud_page(*pud),
> - get_order(PUD_SIZE));
> -
> - spin_lock(&init_mm.page_table_lock);
> - pud_clear(pud);
> - spin_unlock(&init_mm.page_table_lock);
> - }
> - }
> -
> + if (pud_large(*pud) &&
> + IS_ALIGNED(addr, PUD_SIZE) &&
> + IS_ALIGNED(next, PUD_SIZE)) {
> + spin_lock(&init_mm.page_table_lock);
> + pud_clear(pud);
> + spin_unlock(&init_mm.page_table_lock);
> + pages++;
> continue;
> }
>
>
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
* Re: [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges
2021-02-04 13:43 ` [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges Oscar Salvador
@ 2021-02-05 8:26 ` David Hildenbrand
2021-02-05 18:05 ` kernel test robot
1 sibling, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2021-02-05 8:26 UTC (permalink / raw)
To: Oscar Salvador, akpm
Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin, Michal Hocko,
linux-mm, linux-kernel
On 04.02.21 14:43, Oscar Salvador wrote:
> When the size of a struct page is not multiple of 2MB, sections do
> not span a PMD anymore and so when populating them some parts of the
> PMD will remain unused.
> Because of this, PMDs will be left behind when depopulating sections
> since remove_pmd_table() thinks that those unused parts are still in
> use.
>
> Fix this by marking the unused parts with PAGE_UNUSED, so memchr_inv()
> will do the right thing and will let us free the PMD when the last user
> of it is gone.
>
> This patch is based on a similar patch by David Hildenbrand:
>
> https://lore.kernel.org/linux-mm/20200722094558.9828-9-david@redhat.com/
> https://lore.kernel.org/linux-mm/20200722094558.9828-10-david@redhat.com/
>
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
> arch/x86/mm/init_64.c | 106 ++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 98 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 9ecb3c488ac8..7e8de63f02b3 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -871,7 +871,93 @@ int arch_add_memory(int nid, u64 start, u64 size,
> return add_pages(nid, start_pfn, nr_pages, params);
> }
>
> -#define PAGE_INUSE 0xFD
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +#define PAGE_UNUSED 0xFD
> +
> +/*
> + * The unused vmemmap range, which was not yet memset(PAGE_UNUSED) ranges
> + * from unused_pmd_start to next PMD_SIZE boundary.
> + */
> +static unsigned long unused_pmd_start __meminitdata;
> +
> +static void __meminit vmemmap_flush_unused_pmd(void)
> +{
> + if (!unused_pmd_start)
> + return;
> + /*
> + * Clears (unused_pmd_start, PMD_END]
> + */
> + memset((void *)unused_pmd_start, PAGE_UNUSED,
> + ALIGN(unused_pmd_start, PMD_SIZE) - unused_pmd_start);
> + unused_pmd_start = 0;
> +}
> +
> +/* Returns true if the PMD is completely unused and thus it can be freed */
> +static bool __meminit vmemmap_unuse_sub_pmd(unsigned long addr, unsigned long end)
> +{
> + unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
> +
> + vmemmap_flush_unused_pmd();
> + memset((void *)addr, PAGE_UNUSED, end - addr);
> +
> + return !memchr_inv((void *)start, PAGE_UNUSED, PMD_SIZE);
> +}
> +
> +static void __meminit __vmemmap_use_sub_pmd(unsigned long start)
> +{
> + /*
> + * As we expect to add in the same granularity as we remove, it's
> + * sufficient to mark only some piece used to block the memmap page from
> + * getting removed when removing some other adjacent memmap (just in
> + * case the first memmap never gets initialized e.g., because the memory
> + * block never gets onlined).
> + */
> + memset((void *)start, 0, sizeof(struct page));
> +}
> +
> +static void __meminit vmemmap_use_sub_pmd(unsigned long start, unsigned long end)
> +{
> + /*
> + * We only optimize if the new used range directly follows the
> + * previously unused range (esp., when populating consecutive sections).
> + */
> + if (unused_pmd_start == start) {
> + if (likely(IS_ALIGNED(end, PMD_SIZE)))
> + unused_pmd_start = 0;
> + else
> + unused_pmd_start = end;
> + return;
> + }
> +
> + vmemmap_flush_unused_pmd();
> + __vmemmap_use_sub_pmd(start);
> +}
> +
> +static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long end)
> +{
> + vmemmap_flush_unused_pmd();
> +
> + /*
> + * Could be our memmap page is filled with PAGE_UNUSED already from a
> + * previous remove.
> + */
> + __vmemmap_use_sub_pmd(start);
> +
> + /*
> + * Mark the unused parts of the new memmap range
> + */
> + if (!IS_ALIGNED(start, PMD_SIZE))
> + memset((void *)start, PAGE_UNUSED,
> + start - ALIGN_DOWN(start, PMD_SIZE));
> + /*
> + * We want to avoid memset(PAGE_UNUSED) when populating the vmemmap of
> + * consecutive sections. Remember for the last added PMD the last
> + * unused range in the populated PMD.
> + */
> + if (!IS_ALIGNED(end, PMD_SIZE))
> + unused_pmd_start = end;
> +}
> +#endif
>
> static void __meminit free_pagetable(struct page *page, int order)
> {
> @@ -1006,7 +1092,6 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
> unsigned long next, pages = 0;
> pte_t *pte_base;
> pmd_t *pmd;
> - void *page_addr;
>
> pmd = pmd_start + pmd_index(addr);
> for (; addr < end; addr = next, pmd++) {
> @@ -1027,12 +1112,11 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
> spin_unlock(&init_mm.page_table_lock);
> pages++;
> } else {
> - /* If here, we are freeing vmemmap pages. */
> - memset((void *)addr, PAGE_INUSE, next - addr);
> -
> - page_addr = page_address(pmd_page(*pmd));
> - if (!memchr_inv(page_addr, PAGE_INUSE,
> - PMD_SIZE)) {
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> + /*
> + * Free the PMD if the whole range is unused.
> + */
> + if (vmemmap_unuse_sub_pmd(addr, next)) {
> free_hugepage_table(pmd_page(*pmd),
> altmap);
>
> @@ -1040,6 +1124,7 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
> pmd_clear(pmd);
> spin_unlock(&init_mm.page_table_lock);
> }
> +#endif
> }
>
> continue;
> @@ -1492,11 +1577,16 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
>
> addr_end = addr + PMD_SIZE;
> p_end = p + PMD_SIZE;
> +
> + if (!IS_ALIGNED(addr, PMD_SIZE) ||
> + !IS_ALIGNED(next, PMD_SIZE))
> + vmemmap_use_new_sub_pmd(addr, next);
> continue;
> } else if (altmap)
> return -ENOMEM; /* no fallback */
> } else if (pmd_large(*pmd)) {
> vmemmap_verify((pte_t *)pmd, node, addr, next);
> + vmemmap_use_sub_pmd(addr, next);
> continue;
> }
> if (vmemmap_populate_basepages(addr, next, node, NULL))
>
LGTM, thanks
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
* Re: [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges
2021-02-04 13:43 ` [PATCH v3 3/3] x86/vmemmap: Handle unpopulated sub-pmd ranges Oscar Salvador
2021-02-05 8:26 ` David Hildenbrand
@ 2021-02-05 18:05 ` kernel test robot
1 sibling, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-02-05 18:05 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 4578 bytes --]
Hi Oscar,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on tip/x86/mm]
[also build test ERROR on hnaz-linux-mm/master v5.11-rc6 next-20210125]
[cannot apply to luto/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Oscar-Salvador/Cleanup-and-fixups-for-vmemmap-handling/20210204-215019
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 167dcfc08b0b1f964ea95d410aa496fd78adf475
config: x86_64-randconfig-r026-20210205 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# https://github.com/0day-ci/linux/commit/8eed800ab9b1129124ca4af26be911f1ea800339
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Oscar-Salvador/Cleanup-and-fixups-for-vmemmap-handling/20210204-215019
git checkout 8eed800ab9b1129124ca4af26be911f1ea800339
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
>> arch/x86/mm/init_64.c:1583:6: error: implicit declaration of function 'vmemmap_use_new_sub_pmd' [-Werror,-Wimplicit-function-declaration]
vmemmap_use_new_sub_pmd(addr, next);
^
>> arch/x86/mm/init_64.c:1589:4: error: implicit declaration of function 'vmemmap_use_sub_pmd' [-Werror,-Wimplicit-function-declaration]
vmemmap_use_sub_pmd(addr, next);
^
2 errors generated.
vim +/vmemmap_use_new_sub_pmd +1583 arch/x86/mm/init_64.c
1530
1531 static int __meminit vmemmap_populate_hugepages(unsigned long start,
1532 unsigned long end, int node, struct vmem_altmap *altmap)
1533 {
1534 unsigned long addr;
1535 unsigned long next;
1536 pgd_t *pgd;
1537 p4d_t *p4d;
1538 pud_t *pud;
1539 pmd_t *pmd;
1540
1541 for (addr = start; addr < end; addr = next) {
1542 next = pmd_addr_end(addr, end);
1543
1544 pgd = vmemmap_pgd_populate(addr, node);
1545 if (!pgd)
1546 return -ENOMEM;
1547
1548 p4d = vmemmap_p4d_populate(pgd, addr, node);
1549 if (!p4d)
1550 return -ENOMEM;
1551
1552 pud = vmemmap_pud_populate(p4d, addr, node);
1553 if (!pud)
1554 return -ENOMEM;
1555
1556 pmd = pmd_offset(pud, addr);
1557 if (pmd_none(*pmd)) {
1558 void *p;
1559
1560 p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
1561 if (p) {
1562 pte_t entry;
1563
1564 entry = pfn_pte(__pa(p) >> PAGE_SHIFT,
1565 PAGE_KERNEL_LARGE);
1566 set_pmd(pmd, __pmd(pte_val(entry)));
1567
1568 /* check to see if we have contiguous blocks */
1569 if (p_end != p || node_start != node) {
1570 if (p_start)
1571 pr_debug(" [%lx-%lx] PMD -> [%p-%p] on node %d\n",
1572 addr_start, addr_end-1, p_start, p_end-1, node_start);
1573 addr_start = addr;
1574 node_start = node;
1575 p_start = p;
1576 }
1577
1578 addr_end = addr + PMD_SIZE;
1579 p_end = p + PMD_SIZE;
1580
1581 if (!IS_ALIGNED(addr, PMD_SIZE) ||
1582 !IS_ALIGNED(next, PMD_SIZE))
> 1583 vmemmap_use_new_sub_pmd(addr, next);
1584 continue;
1585 } else if (altmap)
1586 return -ENOMEM; /* no fallback */
1587 } else if (pmd_large(*pmd)) {
1588 vmemmap_verify((pte_t *)pmd, node, addr, next);
> 1589 vmemmap_use_sub_pmd(addr, next);
1590 continue;
1591 }
1592 if (vmemmap_populate_basepages(addr, next, node, NULL))
1593 return -ENOMEM;
1594 }
1595 return 0;
1596 }
1597
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 36101 bytes --]