* [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
@ 2017-02-15 20:58 Nicolai Stange
  2017-02-15 20:58 ` [RFC 1/3] sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range Nicolai Stange
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Nicolai Stange @ 2017-02-15 20:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: Andrew Morton, linux-mm, Nicolai Stange

Hi Dan,

your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
populate_section_memmap()") seems to cause some issues with respect to
Kasan initialization on x86.

This is because Kasan's initialization (ab)uses the arch-provided
vmemmap_populate().

The first one is a boot failure, see [1/3]. The commit before the
aforementioned one works fine.

The second one, i.e. [2/3], is something that caught my eye while browsing
the source and I verified that this is indeed an issue by printk'ing and
dumping the page tables.

The third one is excessive warnings from vmemmap_verify() due to Kasan's
NUMA_NO_NODE page populations.


I'll be travelling the next two days and certainly not be able to respond
or polish these patches any further. Furthermore, the next merge window is
close. So please, take these three patches as bug reports only, meant to
illustrate the issues. Feel free to use, change and adopt them however
you deem best.

That being said,
- [2/3] will break arm64 due to the current lack of a pmd_large().
- Maybe it's easier and better to restore former behaviour by letting
  Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
  directly rather than vmemmap_populate(). This would require x86_64
  implying X86_FEATURE_PSE though. I'm not sure whether this holds,
  especially since the vmemmap_populate() in arch/x86/mm/init_64.c
  explicitly checks for it.

Thanks,

Nicolai

Nicolai Stange (3):
  sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range
  sparse-vmemmap: make vmemmap_populate_basepages() skip HP mapped
    ranges
  sparse-vmemmap: let vmemmap_verify() ignore NUMA_NO_NODE requests

 mm/sparse-vmemmap.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

-- 
2.11.1


* [RFC 1/3] sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range
  2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
@ 2017-02-15 20:58 ` Nicolai Stange
  2017-02-15 20:58 ` [RFC 2/3] sparse-vmemmap: make vmemmap_populate_basepages() skip HP mapped ranges Nicolai Stange
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Nicolai Stange @ 2017-02-15 20:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: Andrew Morton, linux-mm, Nicolai Stange

vmemmap_populate_basepages() takes two memory addresses, start and end,
and attempts to populate the page range covering them.

This is done by means of a

  for (addr = start; addr < end; addr += PAGE_SIZE) {
     ...
  }

loop, which misses the last necessary page whenever

  start % PAGE_SIZE > end % PAGE_SIZE.

On x86, Kasan's initialization in arch/x86/mm/kasan_init_64.c (ab)uses
the arch-provided vmemmap_populate() for shadow memory population.
The start and end addresses passed aren't necessarily page-aligned.

With commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
populate_section_memmap()"), the x86 specific vmemmap_populate() sometimes
uses the aforementioned vmemmap_populate_basepages(). This results in
unpopulated shadow memory:

  BUG: unable to handle kernel paging request at ffffed0017b4d000
  IP: memset_erms+0x9/0x10
  [...]
  Call Trace:
   ? kasan_free_pages+0x50/0x60
   free_hot_cold_page+0x382/0x9e0
   [...]
   __free_pages+0xe8/0x100
   [...]
   __free_pages_bootmem+0x1c9/0x202
   ? page_alloc_init_late+0x3a/0x3a
   ? kmemleak_free_part+0x42/0x150
   free_bootmem_late+0x5f/0x7d
   efi_free_boot_services+0x10d/0x233
   [...]

Fix this by making vmemmap_populate_basepages() round the start argument
down to a multiple of PAGE_SIZE such that the above condition can never
hold.
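
To make the failure mode and the fix concrete, here is a minimal standalone
sketch (plain userspace C with hypothetical addresses, not kernel code)
contrasting the unaligned loop with the rounded-down start used below:

  #include <stdio.h>

  #define PAGE_SIZE 0x1000UL

  int main(void)
  {
  	unsigned long start = 0x1f00, end = 0x2100, addr;

  	/*
  	 * Broken case: start % PAGE_SIZE > end % PAGE_SIZE, so only the
  	 * page at 0x1000 gets visited and the one at 0x2000 is missed.
  	 */
  	for (addr = start; addr < end; addr += PAGE_SIZE)
  		printf("populate page 0x%lx\n", addr & ~(PAGE_SIZE - 1));

  	/* fixed case: rounding start down visits 0x1000 and 0x2000 */
  	for (addr = start & ~(PAGE_SIZE - 1); addr < end; addr += PAGE_SIZE)
  		printf("populate page 0x%lx\n", addr);

  	return 0;
  }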

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 mm/sparse-vmemmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 8679d4a81b98..d45bd2714a2b 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -223,7 +223,7 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
 int __meminit vmemmap_populate_basepages(unsigned long start,
 					 unsigned long end, int node)
 {
-	unsigned long addr = start;
+	unsigned long addr = start & ~(PAGE_SIZE - 1);
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
-- 
2.11.1


* [RFC 2/3] sparse-vmemmap: make vmemmap_populate_basepages() skip HP mapped ranges
  2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
  2017-02-15 20:58 ` [RFC 1/3] sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range Nicolai Stange
@ 2017-02-15 20:58 ` Nicolai Stange
  2017-02-15 20:58 ` [RFC 3/3] sparse-vmemmap: let vmemmap_verify() ignore NUMA_NO_NODE requests Nicolai Stange
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Nicolai Stange @ 2017-02-15 20:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: Andrew Morton, linux-mm, Nicolai Stange

WARNING: this will break at least arm64 due to the lack of a pmd_large()!!!

While x86's vmemmap_populate_hugepages() checks whether the range to
populate has already been covered in part by conventional pages and falls
back to vmemmap_populate_basepages() if so, the converse is not true:
vmemmap_populate_basepages() will happily allocate conventional pages for
regions already covered by a hugepage and write the corresponding PTEs
into that hugepage, treating it as if it were a PTE page table. At best,
this results in those conventional pages getting leaked.

Such a situation does exist: the initialization code in
arch/x86/mm/kasan_init_64.c calls into vmemmap_populate().
Since commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
populate_section_memmap()"), the latter invokes either
vmemmap_populate_basepages() or vmemmap_populate_hugepages(), depending on
the requested region's size. vmemmap_populate_basepages() invocations
on regions already covered by a hugepage have actually been observed in
this context.

Make vmemmap_populate_basepages() skip regions already covered by
hugepages.
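
As a quick sanity check of the skip arithmetic used in the diff below, a
standalone sketch (plain C, hypothetical address, 2M PMD_SIZE assumed,
not kernel code):

  #include <stdio.h>

  #define PMD_SIZE 0x200000UL

  int main(void)
  {
  	unsigned long addr = 0x3f5000;
  	/* round down to the covering hugepage, then step past it */
  	unsigned long next = (addr & ~(PMD_SIZE - 1)) + PMD_SIZE;

  	printf("0x%lx -> 0x%lx\n", addr, next);	/* 0x3f5000 -> 0x400000 */
  	return 0;
  }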

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 mm/sparse-vmemmap.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index d45bd2714a2b..f08872b58e48 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -224,12 +224,13 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
 					 unsigned long end, int node)
 {
 	unsigned long addr = start & ~(PAGE_SIZE - 1);
+	unsigned long next;
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
-	for (; addr < end; addr += PAGE_SIZE) {
+	for (; addr < end; addr = next) {
 		pgd = vmemmap_pgd_populate(addr, node);
 		if (!pgd)
 			return -ENOMEM;
@@ -239,10 +240,16 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
 		pmd = vmemmap_pmd_populate(pud, addr, node);
 		if (!pmd)
 			return -ENOMEM;
-		pte = vmemmap_pte_populate(pmd, addr, node);
-		if (!pte)
-			return -ENOMEM;
-		vmemmap_verify(pte, node, addr, addr + PAGE_SIZE);
+		if (!pmd_large(*pmd)) {
+			pte = vmemmap_pte_populate(pmd, addr, node);
+			if (!pte)
+				return -ENOMEM;
+			next = addr + PAGE_SIZE;
+		} else {
+			pte = (pte_t *)pmd;
+			next = (addr & ~(PMD_SIZE - 1)) + PMD_SIZE;
+		}
+		vmemmap_verify(pte, node, addr, next);
 	}
 
 	return 0;
-- 
2.11.1


* [RFC 3/3] sparse-vmemmap: let vmemmap_verify() ignore NUMA_NO_NODE requests
  2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
  2017-02-15 20:58 ` [RFC 1/3] sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range Nicolai Stange
  2017-02-15 20:58 ` [RFC 2/3] sparse-vmemmap: make vmemmap_populate_basepages() skip HP mapped ranges Nicolai Stange
@ 2017-02-15 20:58 ` Nicolai Stange
  2017-02-15 21:10 ` [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Andrew Morton
  2017-02-25 19:03 ` Dan Williams
  4 siblings, 0 replies; 12+ messages in thread
From: Nicolai Stange @ 2017-02-15 20:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: Andrew Morton, linux-mm, Nicolai Stange

On x86, Kasan's initialization in arch/x86/mm/kasan_init_64.c calls
vmemmap_populate() and thus, since commit 7b79d10a2d64 ("mm: convert
kmalloc_section_memmap() to populate_section_memmap()"),
vmemmap_populate_basepages() with a node value of NUMA_NO_NODE.

Since a page's actual NUMA node is never equal to NUMA_NO_NODE, this
results in excessive warnings from vmemmap_verify():

  [ffffed00179c6e00-ffffed00179c7dff] potential offnode page_structs
  [ffffed00179c7e00-ffffed00179c8dff] potential offnode page_structs
  [ffffed00179c8e00-ffffed00179c9dff] potential offnode page_structs
  [ffffed00179c9e00-ffffed00179cadff] potential offnode page_structs
  [ffffed00179cae00-ffffed00179cbdff] potential offnode page_structs
  [...]

Make vmemmap_verify() return early if the requested node equals
NUMA_NO_NODE.
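
To illustrate why no such request can pass the locality check, a standalone
sketch with a toy stand-in for node_distance() (plain C, not kernel code;
NUMA_NO_NODE, LOCAL_DISTANCE and REMOTE_DISTANCE carry their kernel values):

  #include <stdio.h>

  #define NUMA_NO_NODE		(-1)
  #define LOCAL_DISTANCE	10
  #define REMOTE_DISTANCE	20

  /* toy model: real nodes are 0..n-1, so -1 can never look local */
  static int node_distance(int from, int to)
  {
  	if (from < 0 || to < 0)
  		return REMOTE_DISTANCE;
  	return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
  }

  int main(void)
  {
  	int actual_node = 0;	/* early_pfn_to_nid() returns a real node */

  	if (node_distance(actual_node, NUMA_NO_NODE) > LOCAL_DISTANCE)
  		printf("potential offnode page_structs\n");	/* always fires */
  	return 0;
  }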

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 mm/sparse-vmemmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index f08872b58e48..e38aaf6c312c 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -165,6 +165,9 @@ void __meminit vmemmap_verify(pte_t *pte, int node,
 	unsigned long pfn = pte_pfn(*pte);
 	int actual_node = early_pfn_to_nid(pfn);
 
+	if (node == NUMA_NO_NODE)
+		return;
+
 	if (node_distance(actual_node, node) > LOCAL_DISTANCE)
 		pr_warn("[%lx-%lx] potential offnode page_structs\n",
 			start, end - 1);
-- 
2.11.1


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
                   ` (2 preceding siblings ...)
  2017-02-15 20:58 ` [RFC 3/3] sparse-vmemmap: let vmemmap_verify() ignore NUMA_NO_NODE requests Nicolai Stange
@ 2017-02-15 21:10 ` Andrew Morton
  2017-02-15 21:26   ` Dan Williams
  2017-02-25 19:03 ` Dan Williams
  4 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2017-02-15 21:10 UTC (permalink / raw)
  To: Nicolai Stange; +Cc: Dan Williams, linux-mm

On Wed, 15 Feb 2017 21:58:23 +0100 Nicolai Stange <nicstange@gmail.com> wrote:

> Hi Dan,
> 
> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
> populate_section_memmap()") seems to cause some issues with respect to
> Kasan initialization on x86.
> 
> This is because Kasan's initialization (ab)uses the arch-provided
> vmemmap_populate().
> 
> The first one is a boot failure, see [1/3]. The commit before the
> aforementioned one works fine.
> 
> The second one, i.e. [2/3], is something that caught my eye while browsing
> the source and I verified that this is indeed an issue by printk'ing and
> dumping the page tables.
> 
> The third one is excessive warnings from vmemmap_verify() due to Kasan's
> NUMA_NO_NODE page populations.

urggggh.

That means these two series:

mm-fix-type-width-of-section-to-from-pfn-conversion-macros.patch
mm-devm_memremap_pages-use-multi-order-radix-for-zone_device-lookups.patch
mm-introduce-struct-mem_section_usage-to-track-partial-population-of-a-section.patch
mm-introduce-common-definitions-for-the-size-and-mask-of-a-section.patch
mm-cleanup-sparse_init_one_section-return-value.patch
mm-track-active-portions-of-a-section-at-boot.patch
mm-track-active-portions-of-a-section-at-boot-fix.patch
mm-track-active-portions-of-a-section-at-boot-fix-fix.patch
mm-fix-register_new_memory-zone-type-detection.patch
mm-convert-kmalloc_section_memmap-to-populate_section_memmap.patch
mm-prepare-for-hot-add-remove-of-sub-section-ranges.patch
mm-support-section-unaligned-zone_device-memory-ranges.patch
mm-support-section-unaligned-zone_device-memory-ranges-fix.patch
mm-support-section-unaligned-zone_device-memory-ranges-fix-2.patch
mm-enable-section-unaligned-devm_memremap_pages.patch
libnvdimm-pfn-dax-stop-padding-pmem-namespaces-to-section-alignment.patch

and

mm-devm_memremap_pages-hold-device_hotplug-lock-over-mem_hotplug_begin-done.patch
mm-validate-device_hotplug-is-held-for-memory-hotplug.patch

aren't mergeable into 4.10 and presumably won't be fixed in time.  I
think I'll drop all the above.


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-15 21:10 ` [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Andrew Morton
@ 2017-02-15 21:26   ` Dan Williams
  2017-02-15 21:54     ` Andrew Morton
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Williams @ 2017-02-15 21:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nicolai Stange, Linux MM

On Wed, Feb 15, 2017 at 1:10 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 15 Feb 2017 21:58:23 +0100 Nicolai Stange <nicstange@gmail.com> wrote:
>
>> Hi Dan,
>>
>> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
>> populate_section_memmap()") seems to cause some issues with respect to
>> Kasan initialization on x86.
>>
>> This is because Kasan's initialization (ab)uses the arch-provided
>> vmemmap_populate().
>>
>> The first one is a boot failure, see [1/3]. The commit before the
>> aforementioned one works fine.
>>
>> The second one, i.e. [2/3], is something that caught my eye while browsing
>> the source and I verified that this is indeed an issue by printk'ing and
>> dumping the page tables.
>>
>> The third one is excessive warnings from vmemmap_verify() due to Kasan's
>> NUMA_NO_NODE page populations.
>
> urggggh.
>
> That means these two series:
>
> mm-fix-type-width-of-section-to-from-pfn-conversion-macros.patch
> mm-devm_memremap_pages-use-multi-order-radix-for-zone_device-lookups.patch
> mm-introduce-struct-mem_section_usage-to-track-partial-population-of-a-section.patch
> mm-introduce-common-definitions-for-the-size-and-mask-of-a-section.patch
> mm-cleanup-sparse_init_one_section-return-value.patch
> mm-track-active-portions-of-a-section-at-boot.patch
> mm-track-active-portions-of-a-section-at-boot-fix.patch
> mm-track-active-portions-of-a-section-at-boot-fix-fix.patch
> mm-fix-register_new_memory-zone-type-detection.patch
> mm-convert-kmalloc_section_memmap-to-populate_section_memmap.patch
> mm-prepare-for-hot-add-remove-of-sub-section-ranges.patch
> mm-support-section-unaligned-zone_device-memory-ranges.patch
> mm-support-section-unaligned-zone_device-memory-ranges-fix.patch
> mm-support-section-unaligned-zone_device-memory-ranges-fix-2.patch
> mm-enable-section-unaligned-devm_memremap_pages.patch
> libnvdimm-pfn-dax-stop-padding-pmem-namespaces-to-section-alignment.patch
>

Yes, let's drop these and try again for 4.12. Thanks for the report
and the debugging, Nicolai!

> and
>
> mm-devm_memremap_pages-hold-device_hotplug-lock-over-mem_hotplug_begin-done.patch
> mm-validate-device_hotplug-is-held-for-memory-hotplug.patch

No, these are separate and are still valid for the merge window.


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-15 21:26   ` Dan Williams
@ 2017-02-15 21:54     ` Andrew Morton
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2017-02-15 21:54 UTC (permalink / raw)
  To: Dan Williams; +Cc: Nicolai Stange, Linux MM

On Wed, 15 Feb 2017 13:26:43 -0800 Dan Williams <dan.j.williams@intel.com> wrote:

> >> The second one, i.e. [2/3], is something that caught my eye while browsing
> >> the source and I verified that this is indeed an issue by printk'ing and
> >> dumping the page tables.
> >>
> >> The third one is excessive warnings from vmemmap_verify() due to Kasan's
> >> NUMA_NO_NODE page populations.
> >
> > urggggh.
> >
> > That means these two series:
> >
> > mm-fix-type-width-of-section-to-from-pfn-conversion-macros.patch
> > mm-devm_memremap_pages-use-multi-order-radix-for-zone_device-lookups.patch
> > mm-introduce-struct-mem_section_usage-to-track-partial-population-of-a-section.patch
> > mm-introduce-common-definitions-for-the-size-and-mask-of-a-section.patch
> > mm-cleanup-sparse_init_one_section-return-value.patch
> > mm-track-active-portions-of-a-section-at-boot.patch
> > mm-track-active-portions-of-a-section-at-boot-fix.patch
> > mm-track-active-portions-of-a-section-at-boot-fix-fix.patch
> > mm-fix-register_new_memory-zone-type-detection.patch
> > mm-convert-kmalloc_section_memmap-to-populate_section_memmap.patch
> > mm-prepare-for-hot-add-remove-of-sub-section-ranges.patch
> > mm-support-section-unaligned-zone_device-memory-ranges.patch
> > mm-support-section-unaligned-zone_device-memory-ranges-fix.patch
> > mm-support-section-unaligned-zone_device-memory-ranges-fix-2.patch
> > mm-enable-section-unaligned-devm_memremap_pages.patch
> > libnvdimm-pfn-dax-stop-padding-pmem-namespaces-to-section-alignment.patch
> >
> 
> Yes, let's drop these and try again for 4.12. Thanks for the report
> and the debugging, Nicolai!

Please don't lose track of

mm-track-active-portions-of-a-section-at-boot-fix.patch
mm-track-active-portions-of-a-section-at-boot-fix-fix.patch
mm-support-section-unaligned-zone_device-memory-ranges-fix.patch
mm-support-section-unaligned-zone_device-memory-ranges-fix-2.patch

> > and
> >
> > mm-devm_memremap_pages-hold-device_hotplug-lock-over-mem_hotplug_begin-done.patch
> > mm-validate-device_hotplug-is-held-for-memory-hotplug.patch
> 
> No, these are separate and are still valid for the merge window.

OK.  A bunch of rejects needed fixing.



* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
                   ` (3 preceding siblings ...)
  2017-02-15 21:10 ` [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Andrew Morton
@ 2017-02-25 19:03 ` Dan Williams
  2017-02-27  9:34   ` Dmitry Vyukov
  2017-03-03 16:08   ` Andrey Ryabinin
  4 siblings, 2 replies; 12+ messages in thread
From: Dan Williams @ 2017-02-25 19:03 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Andrew Morton, Linux MM, Dmitry Vyukov, Alexander Potapenko,
	Andrey Ryabinin

[ adding kasan folks ]

On Wed, Feb 15, 2017 at 12:58 PM, Nicolai Stange <nicstange@gmail.com> wrote:
> Hi Dan,
>
> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
> populate_section_memmap()") seems to cause some issues with respect to
> Kasan initialization on x86.
>
> This is because Kasan's initialization (ab)uses the arch-provided
> vmemmap_populate().
>
> The first one is a boot failure, see [1/3]. The commit before the
> aforementioned one works fine.
>
> The second one, i.e. [2/3], is something that caught my eye while browsing
> the source and I verified that this is indeed an issue by printk'ing and
> dumping the page tables.
>
> The third one is excessive warnings from vmemmap_verify() due to Kasan's
> NUMA_NO_NODE page populations.
>
>
> I'll be travelling the next two days and certainly not be able to respond
> or polish these patches any further. Furthermore, the next merge window is
> close. So please, take these three patches as bug reports only, meant to
> illustrate the issues. Feel free to use, change and adopt them however
> you deem best.
>
> That being said,
> - [2/3] will break arm64 due to the current lack of a pmd_large().
> - Maybe it's easier and better to restore former behaviour by letting
>   Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
>   directly rather than vmemmap_populate(). This would require x86_64
>   implying X86_FEATURE_PSE though. I'm not sure whether this holds,
>   especially since the vmemmap_populate() in arch/x86/mm/init_64.c
>   explicitly checks for it.

I think your intuition is correct here, and yes, it is a safe
assumption that x86_64 implies X86_FEATURE_PSE. The following patch
works for me. If there are no objections I'll roll it into the series
and resubmit the sub-section hotplug support after testing on top of
4.11-rc1.

--- patch follows ---

Subject: x86, kasan: clarify kasan's dependency on vmemmap_populate_hugepages()

From: Dan Williams <dan.j.williams@intel.com>

Historically kasan has not been careful about whether vmemmap_populate()
internally allocates a section worth of memmap even if the parameters
call for less.  For example, a request to shadow-map a single page
internally results in mapping the full section (128MB) that contains
that page. Also, kasan has not been careful to handle cases where this
section promotion causes overlaps / overrides of previous calls to
vmemmap_populate().

Before we teach vmemmap_populate() to support sub-section hotplug,
arrange for kasan to explicitly avoid vmemmap_populate_basepages().
This should be functionally equivalent to the current state since
CONFIG_KASAN requires x86_64 (implies PSE) and it does not collide with
sub-section hotplug support since CONFIG_KASAN disables
CONFIG_MEMORY_HOTPLUG.

Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/init_64.c       |    2 +-
 arch/x86/mm/kasan_init_64.c |   30 ++++++++++++++++++++++++++----
 include/linux/mm.h          |    2 ++
 3 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index af85b686a7b0..32e0befcbfe8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1157,7 +1157,7 @@ static long __meminitdata addr_start, addr_end;
 static void __meminitdata *p_start, *p_end;
 static int __meminitdata node_start;
 
-static int __meminit vmemmap_populate_hugepages(unsigned long start,
+int __meminit vmemmap_populate_hugepages(unsigned long start,
 		unsigned long end, int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr;
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 0493c17b8a51..4cfc0fb43af3 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -12,6 +12,25 @@
 extern pgd_t early_level4_pgt[PTRS_PER_PGD];
 extern struct range pfn_mapped[E820_X_MAX];
 
+static int __init kasan_vmemmap_populate(unsigned long start, unsigned long end)
+{
+	/*
+	 * Historically kasan has not been careful about whether
+	 * vmemmap_populate() internally allocates a section worth of memmap
+	 * even if the parameters call for less.  For example, a request to
+	 * shadow-map a single page internally results in mapping the full
+	 * section (128MB) that contains that page.  Also, kasan has not been
+	 * careful to handle cases where this section promotion causes overlaps
+	 * / overrides of previous calls to vmemmap_populate(). Make this
+	 * implicit dependency explicit to avoid interactions with sub-section
+	 * memory hotplug support.
+	 */
+	if (!boot_cpu_has(X86_FEATURE_PSE))
+		return -ENXIO;
+
+	return vmemmap_populate_hugepages(start, end, NUMA_NO_NODE, NULL);
+}
+
 static int __init map_range(struct range *range)
 {
 	unsigned long start;
@@ -25,7 +44,7 @@ static int __init map_range(struct range *range)
 	 * to slightly speed up fastpath. In some rare cases we could cross
 	 * boundary of mapped shadow, so we just map some more here.
 	 */
-	return vmemmap_populate(start, end + 1, NUMA_NO_NODE);
+	return kasan_vmemmap_populate(start, end + 1);
 }
 
 static void __init clear_pgds(unsigned long start,
@@ -89,6 +108,10 @@ void __init kasan_init(void)
 {
 	int i;
 
+	/* should never trigger, x86_64 implies PSE */
+	WARN(!boot_cpu_has(X86_FEATURE_PSE),
+	     "kasan requires page size extensions\n");
+
 #ifdef CONFIG_KASAN_INLINE
 	register_die_notifier(&kasan_die_notifier);
 #endif
@@ -113,9 +136,8 @@ void __init kasan_init(void)
 		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
 		kasan_mem_to_shadow((void *)__START_KERNEL_map));
 
-	vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
-			 (unsigned long)kasan_mem_to_shadow(_end),
-			 NUMA_NO_NODE);
+	kasan_vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
+			       (unsigned long)kasan_mem_to_shadow(_end));
 
 	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
 				   (void *)KASAN_SHADOW_END);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b84615b0f64c..fb3e84aec5c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2331,6 +2331,8 @@ void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
 			       int node);
 int vmemmap_populate(unsigned long start, unsigned long end, int node);
+int vmemmap_populate_hugepages(unsigned long start, unsigned long end, int node,
+		struct vmem_altmap *altmap);
 void vmemmap_populate_print_last(void);
 #ifdef CONFIG_MEMORY_HOTPLUG
 void vmemmap_free(unsigned long start, unsigned long end);


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-25 19:03 ` Dan Williams
@ 2017-02-27  9:34   ` Dmitry Vyukov
  2017-03-03 16:08   ` Andrey Ryabinin
  1 sibling, 0 replies; 12+ messages in thread
From: Dmitry Vyukov @ 2017-02-27  9:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Nicolai Stange, Andrew Morton, Linux MM, Alexander Potapenko,
	Andrey Ryabinin, kasan-dev

On Sat, Feb 25, 2017 at 8:03 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> [ adding kasan folks ]
>
> On Wed, Feb 15, 2017 at 12:58 PM, Nicolai Stange <nicstange@gmail.com> wrote:
>> Hi Dan,
>>
>> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
>> populate_section_memmap()") seems to cause some issues with respect to
>> Kasan initialization on x86.
>>
>> This is because Kasan's initialization (ab)uses the arch-provided
>> vmemmap_populate().
>>
>> The first one is a boot failure, see [1/3]. The commit before the
>> aforementioned one works fine.
>>
>> The second one, i.e. [2/3], is something that caught my eye while browsing
>> the source and I verified that this is indeed an issue by printk'ing and
>> dumping the page tables.
>>
>> The third one is excessive warnings from vmemmap_verify() due to Kasan's
>> NUMA_NO_NODE page populations.
>>
>>
>> I'll be travelling the next two days and certainly not be able to respond
>> or polish these patches any further. Furthermore, the next merge window is
>> close. So please, take these three patches as bug reports only, meant to
>> illustrate the issues. Feel free to use, change and adopt them however
>> you deem best.
>>
>> That being said,
>> - [2/3] will break arm64 due to the current lack of a pmd_large().
>> - Maybe it's easier and better to restore former behaviour by letting
>>   Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
>>   directly rather than vmemmap_populate(). This would require x86_64
>>   implying X86_FEATURE_PSE though. I'm not sure whether this holds,
>>   especially since the vmemmap_populate() in arch/x86/mm/init_64.c
>>   explicitly checks for it.
>
> I think your intuition is correct here, and yes, it is a safe
> assumption that x86_64 implies X86_FEATURE_PSE. The following patch
> works for me. If there are no objections I'll roll it into the series
> and resubmit the sub-section hotplug support after testing on top of
> 4.11-rc1.
>
> --- patch follows ---
>
> Subject: x86, kasan: clarify kasan's dependency on vmemmap_populate_hugepages()
>
> From: Dan Williams <dan.j.williams@intel.com>
>
> Historically kasan has not been careful about whether vmemmap_populate()
> internally allocates a section worth of memmap even if the parameters
> call for less.  For example, a request to shadow-map a single page
> internally results in mapping the full section (128MB) that contains
> that page. Also, kasan has not been careful to handle cases where this
> section promotion causes overlaps / overrides of previous calls to
> vmemmap_populate().
>
> Before we teach vmemmap_populate() to support sub-section hotplug,
> arrange for kasan to explicitly avoid vmemmap_populate_basepages().
> This should be functionally equivalent to the current state since
> CONFIG_KASAN requires x86_64 (implies PSE) and it does not collide with
> sub-section hotplug support since CONFIG_KASAN disables
> CONFIG_MEMORY_HOTPLUG.
>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Reported-by: Nicolai Stange <nicstange@gmail.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/x86/mm/init_64.c       |    2 +-
>  arch/x86/mm/kasan_init_64.c |   30 ++++++++++++++++++++++++++----
>  include/linux/mm.h          |    2 ++
>  3 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index af85b686a7b0..32e0befcbfe8 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1157,7 +1157,7 @@ static long __meminitdata addr_start, addr_end;
>  static void __meminitdata *p_start, *p_end;
>  static int __meminitdata node_start;
>
> -static int __meminit vmemmap_populate_hugepages(unsigned long start,
> +int __meminit vmemmap_populate_hugepages(unsigned long start,
>  		unsigned long end, int node, struct vmem_altmap *altmap)
>  {
>  	unsigned long addr;
> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
> index 0493c17b8a51..4cfc0fb43af3 100644
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -12,6 +12,25 @@
>  extern pgd_t early_level4_pgt[PTRS_PER_PGD];
>  extern struct range pfn_mapped[E820_X_MAX];
>
> +static int __init kasan_vmemmap_populate(unsigned long start, unsigned long end)
> +{
> +	/*
> +	 * Historically kasan has not been careful about whether
> +	 * vmemmap_populate() internally allocates a section worth of memmap
> +	 * even if the parameters call for less.  For example, a request to
> +	 * shadow-map a single page internally results in mapping the full
> +	 * section (128MB) that contains that page.  Also, kasan has not been
> +	 * careful to handle cases where this section promotion causes overlaps
> +	 * / overrides of previous calls to vmemmap_populate(). Make this
> +	 * implicit dependency explicit to avoid interactions with sub-section
> +	 * memory hotplug support.
> +	 */
> +	if (!boot_cpu_has(X86_FEATURE_PSE))
> +		return -ENXIO;
> +
> +	return vmemmap_populate_hugepages(start, end, NUMA_NO_NODE, NULL);
> +}
> +
>  static int __init map_range(struct range *range)
>  {
>  	unsigned long start;
> @@ -25,7 +44,7 @@ static int __init map_range(struct range *range)
>  	 * to slightly speed up fastpath. In some rare cases we could cross
>  	 * boundary of mapped shadow, so we just map some more here.
>  	 */
> -	return vmemmap_populate(start, end + 1, NUMA_NO_NODE);
> +	return kasan_vmemmap_populate(start, end + 1);
>  }
>
>  static void __init clear_pgds(unsigned long start,
> @@ -89,6 +108,10 @@ void __init kasan_init(void)
>  {
>  	int i;
>
> +	/* should never trigger, x86_64 implies PSE */
> +	WARN(!boot_cpu_has(X86_FEATURE_PSE),
> +	     "kasan requires page size extensions\n");
> +
>  #ifdef CONFIG_KASAN_INLINE
>  	register_die_notifier(&kasan_die_notifier);
>  #endif
> @@ -113,9 +136,8 @@ void __init kasan_init(void)
>  		kasan_mem_to_shadow((void *)PAGE_OFFSET + MAXMEM),
>  		kasan_mem_to_shadow((void *)__START_KERNEL_map));
>
> -	vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
> -			 (unsigned long)kasan_mem_to_shadow(_end),
> -			 NUMA_NO_NODE);
> +	kasan_vmemmap_populate((unsigned long)kasan_mem_to_shadow(_stext),
> +			       (unsigned long)kasan_mem_to_shadow(_end));
>
>  	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
>  				   (void *)KASAN_SHADOW_END);
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b84615b0f64c..fb3e84aec5c4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2331,6 +2331,8 @@ void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
>  int vmemmap_populate_basepages(unsigned long start, unsigned long end,
>  			       int node);
>  int vmemmap_populate(unsigned long start, unsigned long end, int node);
> +int vmemmap_populate_hugepages(unsigned long start, unsigned long end, int node,
> +		struct vmem_altmap *altmap);
>  void vmemmap_populate_print_last(void);
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  void vmemmap_free(unsigned long start, unsigned long end);


+kasan-dev

Andrey, do you mind looking at this?

What is the manifestation of the problem? I have kasan bots on the tips of
upstream/mmotm/linux-next and they seem to be working.

Re the added comment: is it true that we are wasting up to 128MB per
region? We have some small ones (like text). So is it something to fix
in the future?

Thanks


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-02-25 19:03 ` Dan Williams
  2017-02-27  9:34   ` Dmitry Vyukov
@ 2017-03-03 16:08   ` Andrey Ryabinin
  2017-03-10  0:58     ` Dan Williams
  1 sibling, 1 reply; 12+ messages in thread
From: Andrey Ryabinin @ 2017-03-03 16:08 UTC (permalink / raw)
  To: Dan Williams, Nicolai Stange
  Cc: Andrew Morton, Linux MM, Dmitry Vyukov, Alexander Potapenko

On 02/25/2017 10:03 PM, Dan Williams wrote:
> [ adding kasan folks ]
> 
> On Wed, Feb 15, 2017 at 12:58 PM, Nicolai Stange <nicstange@gmail.com> wrote:
>> Hi Dan,
>>
>> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
>> populate_section_memmap()") seems to cause some issues with respect to
>> Kasan initialization on x86.
>>
>> This is because Kasan's initialization (ab)uses the arch-provided
>> vmemmap_populate().
>>
>> The first one is a boot failure, see [1/3]. The commit before the
>> aforementioned one works fine.
>>
>> The second one, i.e. [2/3], is something that caught my eye while browsing
>> the source and I verified that this is indeed an issue by printk'ing and
>> dumping the page tables.
>>
>> The third one is excessive warnings from vmemmap_verify() due to Kasan's
>> NUMA_NO_NODE page populations.
>>
>>
>> I'll be travelling the next two days and certainly not be able to respond
>> or polish these patches any further. Furthermore, the next merge window is
>> close. So please, take these three patches as bug reports only, meant to
>> illustrate the issues. Feel free to use, change and adopt them however
>> you deem best.
>>
>> That being said,
>> - [2/3] will break arm64 due to the current lack of a pmd_large().
>> - Maybe it's easier and better to restore former behaviour by letting
>>   Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
>>   directly rather than vmemmap_populate(). This would require x86_64
>>   implying X86_FEATURE_PSE though. I'm not sure whether this holds,
>>   especially since the vmemmap_populate() in arch/x86/mm/init_64.c
>>   explicitly checks for it.
> 
> I think your intuition is correct here, and yes, it is a safe
> assumption that x86_64 implies X86_FEATURE_PSE. The following patch
> works for me. If there are no objections I'll roll it into the series
> and resubmit the sub-section hotplug support after testing on top of
> 4.11-rc1.
> 

Perhaps it would be better to get rid of vmemmap use in the kasan code
altogether and have a separate function that populates the kasan shadow.
kasan is abusing an API designed for something else. We already had bugs
on arm64 (see commit 2776e0e8ef683) because of that, and now this one on
x86_64. I can cook patches and send them next week.




* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-03-03 16:08   ` Andrey Ryabinin
@ 2017-03-10  0:58     ` Dan Williams
  2017-03-10  8:46       ` Andrey Ryabinin
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Williams @ 2017-03-10  0:58 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Nicolai Stange, Andrew Morton, Linux MM, Dmitry Vyukov,
	Alexander Potapenko

On Fri, Mar 3, 2017 at 8:08 AM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
> On 02/25/2017 10:03 PM, Dan Williams wrote:
>> [ adding kasan folks ]
>>
>> On Wed, Feb 15, 2017 at 12:58 PM, Nicolai Stange <nicstange@gmail.com> wrote:
>>> Hi Dan,
>>>
>>> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
>>> populate_section_memmap()") seems to cause some issues with respect to
>>> Kasan initialization on x86.
>>>
>>> This is because Kasan's initialization (ab)uses the arch-provided
>>> vmemmap_populate().
>>>
>>> The first one is a boot failure, see [1/3]. The commit before the
>>> aforementioned one works fine.
>>>
>>> The second one, i.e. [2/3], is something that caught my eye while browsing
>>> the source and I verified that this is indeed an issue by printk'ing and
>>> dumping the page tables.
>>>
>>> The third one is excessive warnings from vmemmap_verify() due to Kasan's
>>> NUMA_NO_NODE page populations.
>>>
>>>
>>> I'll be travelling the next two days and certainly not be able to respond
>>> or polish these patches any further. Furthermore, the next merge window is
>>> close. So please, take these three patches as bug reports only, meant to
>>> illustrate the issues. Feel free to use, change and adopt them however
>>> you deem best.
>>>
>>> That being said,
>>> - [2/3] will break arm64 due to the current lack of a pmd_large().
>>> - Maybe it's easier and better to restore former behaviour by letting
>>>   Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
>>>   directly rather than vmemmap_populate(). This would require x86_64
>>>   implying X86_FEATURE_PSE though. I'm not sure whether this holds,
>>>   especially since the vmemmap_populate() in arch/x86/mm/init_64.c
>>>   explicitly checks for it.
>>
>> I think your intuition is correct here, and yes, it is a safe
>> assumption that x86_64 implies X86_FEATURE_PSE. The following patch
>> works for me. If there are no objections I'll roll it into the series
>> and resubmit the sub-section hotplug support after testing on top of
>> 4.11-rc1.
>>
>
> Perhaps it would be better to get rid of vmemmap use in the kasan code
> altogether and have a separate function that populates the kasan shadow.
> kasan is abusing an API designed for something else. We already had bugs
> on arm64 (see commit 2776e0e8ef683) because of that, and now this one on
> x86_64. I can cook patches and send them next week.
>

Any concerns with proceeding with the conversion to explicit
vmemmap_populate_hugepages() calls in the meantime? That allows me to
unblock the sub-section hotplug patches and kasan can move away from
vmemmap_populate() on its own schedule.


* Re: [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86
  2017-03-10  0:58     ` Dan Williams
@ 2017-03-10  8:46       ` Andrey Ryabinin
  0 siblings, 0 replies; 12+ messages in thread
From: Andrey Ryabinin @ 2017-03-10  8:46 UTC (permalink / raw)
  To: Dan Williams
  Cc: Nicolai Stange, Andrew Morton, Linux MM, Dmitry Vyukov,
	Alexander Potapenko

On 03/10/2017 03:58 AM, Dan Williams wrote:
> On Fri, Mar 3, 2017 at 8:08 AM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote:
>> On 02/25/2017 10:03 PM, Dan Williams wrote:
>>> [ adding kasan folks ]
>>>
>>> On Wed, Feb 15, 2017 at 12:58 PM, Nicolai Stange <nicstange@gmail.com> wrote:
>>>> Hi Dan,
>>>>
>>>> your recent commit 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to
>>>> populate_section_memmap()") seems to cause some issues with respect to
>>>> Kasan initialization on x86.
>>>>
>>>> This is because Kasan's initialization (ab)uses the arch-provided
>>>> vmemmap_populate().
>>>>
>>>> The first one is a boot failure, see [1/3]. The commit before the
>>>> aforementioned one works fine.
>>>>
>>>> The second one, i.e. [2/3], is something that caught my eye while browsing
>>>> the source and I verified that this is indeed an issue by printk'ing and
>>>> dumping the page tables.
>>>>
>>>> The third one is excessive warnings from vmemmap_verify() due to Kasan's
>>>> NUMA_NO_NODE page populations.
>>>>
>>>>
>>>> I'll be travelling the next two days and certainly not be able to respond
>>>> or polish these patches any further. Furthermore, the next merge window is
>>>> close. So please, take these three patches as bug reports only, meant to
>>>> illustrate the issues. Feel free to use, change and adopt them however
>>>> you deem best.
>>>>
>>>> That being said,
>>>> - [2/3] will break arm64 due to the current lack of a pmd_large().
>>>> - Maybe it's easier and better to restore former behaviour by letting
>>>>   Kasan's shadow initialization on x86 use vmemmap_populate_hugepages()
>>>>   directly rather than vmemmap_populate(). This would require x86_64
>>>>   implying X86_FEATURE_PSE though. I'm not sure whether this holds,
>>>>   especially since the vmemmap_populate() in arch/x86/mm/init_64.c
>>>>   explicitly checks for it.
>>>
>>> I think your intuition is correct here, and yes, it is a safe
>>> assumption that x86_64 implies X86_FEATURE_PSE. The following patch
>>> works for me. If there are no objections I'll roll it into the series
>>> and resubmit the sub-section hotplug support after testing on top of
>>> 4.11-rc1.
>>>
>>
>> Perhaps it would be better to get rid of vmemmap use in the kasan code
>> altogether and have a separate function that populates the kasan shadow.
>> kasan is abusing an API designed for something else. We already had bugs
>> on arm64 (see commit 2776e0e8ef683) because of that, and now this one on
>> x86_64. I can cook patches and send them next week.
>>
> 
> Any concerns with proceeding with the conversion to explicit
> vmemmap_populate_hugepages() calls in the meantime? That allows me to
> unblock the sub-section hotplug patches and kasan can move away from
> vmemmap_populate() on its own schedule.

No objections.
vmemmap_populate_hugepages() seems like the best way to go for now given that
my patches will cause additional conflict with 5-level page tables.


Thread overview: 12 messages
2017-02-15 20:58 [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Nicolai Stange
2017-02-15 20:58 ` [RFC 1/3] sparse-vmemmap: let vmemmap_populate_basepages() cover the whole range Nicolai Stange
2017-02-15 20:58 ` [RFC 2/3] sparse-vmemmap: make vmemmap_populate_basepages() skip HP mapped ranges Nicolai Stange
2017-02-15 20:58 ` [RFC 3/3] sparse-vmemmap: let vmemmap_verify() ignore NUMA_NO_NODE requests Nicolai Stange
2017-02-15 21:10 ` [RFC 0/3] Regressions due to 7b79d10a2d64 ("mm: convert kmalloc_section_memmap() to populate_section_memmap()") and Kasan initialization on x86 Andrew Morton
2017-02-15 21:26   ` Dan Williams
2017-02-15 21:54     ` Andrew Morton
2017-02-25 19:03 ` Dan Williams
2017-02-27  9:34   ` Dmitry Vyukov
2017-03-03 16:08   ` Andrey Ryabinin
2017-03-10  0:58     ` Dan Williams
2017-03-10  8:46       ` Andrey Ryabinin
