* [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-25 10:15 ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen,
	David Hildenbrand, David Rientjes, Edgecombe, Rick P,
	H. Peter Anvin, Heiko Carstens, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Len Brown, Michael Ellerman, Mike Rapoport,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

During a recent discussion about KVM protected memory, David raised a concern
about the usage of __kernel_map_pages() outside of DEBUG_PAGEALLOC scope [1].

Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP it is
possible for __kernel_map_pages() to fail, but since this function returns
void, the failure goes unnoticed.

Moreover, the semantics of __kernel_map_pages() are inconsistent across
architectures: some guard this function with #ifdef DEBUG_PAGEALLOC, some
refuse to update the direct map if page allocation debugging is disabled at
run time, and some allow modifying the direct map regardless of
DEBUG_PAGEALLOC settings.
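
To illustrate, here is a simplified sketch of the three patterns; this is not
code from any particular architecture, and arch_change_direct_map() is a
made-up placeholder for whatever mechanism an architecture uses to update its
direct map entries. Each architecture effectively picks one of these mutually
exclusive variants:

/* Variant A: only built when CONFIG_DEBUG_PAGEALLOC is enabled */
#ifdef CONFIG_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	arch_change_direct_map(page, numpages, enable);
}
#endif

/* Variant B: silently does nothing if debug_pagealloc is off at run time */
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	if (!debug_pagealloc_enabled())
		return;		/* void return: callers never see this */
	arch_change_direct_map(page, numpages, enable);
}

/* Variant C: always updates the direct map, whatever DEBUG_PAGEALLOC says */
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	arch_change_direct_map(page, numpages, enable);
}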

This set straightens this out by restoring the dependency of
__kernel_map_pages() on DEBUG_PAGEALLOC and updating the call sites
accordingly.
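
For reference, patch 1 introduces the following helper in include/linux/mm.h
(quoted from the diff below), which combines the guard and the call so the
direct map is only touched when page allocation debugging is actually
enabled:

static inline void debug_pagealloc_map_pages(struct page *page,
					     int numpages, int enable)
{
	if (debug_pagealloc_enabled_static())
		__kernel_map_pages(page, numpages, enable);
}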

[1] https://lore.kernel.org/lkml/2759b4bf-e1e3-d006-7d86-78a40348269d@redhat.com

Mike Rapoport (4):
  mm: introduce debug_pagealloc_map_pages() helper
  PM: hibernate: improve robustness of mapping pages in the direct map
  arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOC
  arch, mm: make kernel_page_present() always available

 arch/Kconfig                        |  3 +++
 arch/arm64/Kconfig                  |  4 +---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/arm64/mm/pageattr.c            |  6 +++--
 arch/powerpc/Kconfig                |  5 +----
 arch/riscv/Kconfig                  |  4 +---
 arch/riscv/include/asm/pgtable.h    |  2 --
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 31 +++++++++++++++++++++++++
 arch/s390/Kconfig                   |  4 +---
 arch/sparc/Kconfig                  |  4 +---
 arch/x86/Kconfig                    |  4 +---
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  4 ++--
 include/linux/mm.h                  | 35 +++++++++++++----------------
 include/linux/set_memory.h          |  5 +++++
 kernel/power/snapshot.c             | 24 ++++++++++++++++++--
 mm/memory_hotplug.c                 |  3 +--
 mm/page_alloc.c                     |  6 ++---
 mm/slab.c                           |  8 +++----
 20 files changed, 97 insertions(+), 58 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 219+ messages in thread

* [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
  2020-10-25 10:15 ` Mike Rapoport
                     ` (2 preceding siblings ...)
  (?)
@ 2020-10-25 10:15   ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen,
	David Hildenbrand, David Rientjes, Edgecombe, Rick P,
	H. Peter Anvin, Heiko Carstens, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Len Brown, Michael Ellerman, Mike Rapoport,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

From: Mike Rapoport <rppt@linux.ibm.com>

When CONFIG_DEBUG_PAGEALLOC is enabled, pages are unmapped from the kernel
direct mapping after free_pages(). The pages then need to be mapped back
before they can be used. These mapping operations use __kernel_map_pages()
guarded with debug_pagealloc_enabled().

The only place that calls __kernel_map_pages() without checking whether
DEBUG_PAGEALLOC is enabled is the hibernation code, which presumes that this
function is available when ARCH_HAS_SET_DIRECT_MAP is set. However, on arm64
__kernel_map_pages() bails out when DEBUG_PAGEALLOC is not enabled, while
set_direct_map_invalid_noflush() may still render some pages not present in
the direct map, so the hibernation code will not be able to save such pages.

To make the interaction between page allocation debugging and hibernation
more robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
has to be made more explicit.

Start by combining the guard condition and the call to __kernel_map_pages()
into a single debug_pagealloc_map_pages() function, to emphasize that
__kernel_map_pages() should not be called without DEBUG_PAGEALLOC, and use
this new function to map/unmap pages when page allocation debugging is
enabled.

As the only remaining user of kernel_map_pages() is the hibernation code,
move that function into kernel/power/snapshot.c, closer to its caller.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/linux/mm.h      | 16 +++++++---------
 kernel/power/snapshot.c | 11 +++++++++++
 mm/memory_hotplug.c     |  3 +--
 mm/page_alloc.c         |  6 ++----
 mm/slab.c               |  8 +++-----
 5 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef360fe70aaf..14e397f3752c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
 #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
-/*
- * When called in DEBUG_PAGEALLOC context, the call should most likely be
- * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
- */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void debug_pagealloc_map_pages(struct page *page,
+					     int numpages, int enable)
 {
-	__kernel_map_pages(page, numpages, enable);
+	if (debug_pagealloc_enabled_static())
+		__kernel_map_pages(page, numpages, enable);
 }
+
 #ifdef CONFIG_HIBERNATION
 extern bool kernel_page_present(struct page *page);
 #endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
+static inline void debug_pagealloc_map_pages(struct page *page,
+					     int numpages, int enable) {}
 #ifdef CONFIG_HIBERNATION
 static inline bool kernel_page_present(struct page *page) { return true; }
 #endif	/* CONFIG_HIBERNATION */
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 46b1804c1ddf..fa499466f645 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
 static inline void hibernate_restore_unprotect_page(void *page_address) {}
 #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
 
+#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable)
+{
+	__kernel_map_pages(page, numpages, enable);
+}
+#else
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable) {}
+#endif
+
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
 static void swsusp_unset_page_forbidden(struct page *);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b44d4c7ba73b..e2b6043a4428 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -614,8 +614,7 @@ void generic_online_page(struct page *page, unsigned int order)
 	 * so we should map it first. This is better than introducing a special
 	 * case in page freeing fast path.
 	 */
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 1);
+	debug_pagealloc_map_pages(page, 1 << order, 1);
 	__free_pages_core(page, order);
 	totalram_pages_add(1UL << order);
 #ifdef CONFIG_HIGHMEM
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..9a66a1ff9193 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1272,8 +1272,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	 */
 	arch_free_page(page, order);
 
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 0);
+	debug_pagealloc_map_pages(page, 1 << order, 0);
 
 	kasan_free_nondeferred_pages(page, order);
 
@@ -2270,8 +2269,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	set_page_refcounted(page);
 
 	arch_alloc_page(page, order);
-	if (debug_pagealloc_enabled_static())
-		kernel_map_pages(page, 1 << order, 1);
+	debug_pagealloc_map_pages(page, 1 << order, 1);
 	kasan_alloc_pages(page, order);
 	kernel_poison_pages(page, 1 << order, 1);
 	set_page_owner(page, order, gfp_flags);
diff --git a/mm/slab.c b/mm/slab.c
index b1113561b98b..340db0ce74c4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1431,10 +1431,8 @@ static bool is_debug_pagealloc_cache(struct kmem_cache *cachep)
 #ifdef CONFIG_DEBUG_PAGEALLOC
 static void slab_kernel_map(struct kmem_cache *cachep, void *objp, int map)
 {
-	if (!is_debug_pagealloc_cache(cachep))
-		return;
-
-	kernel_map_pages(virt_to_page(objp), cachep->size / PAGE_SIZE, map);
+	debug_pagealloc_map_pages(virt_to_page(objp),
+				  cachep->size / PAGE_SIZE, map);
 }
 
 #else
@@ -2062,7 +2060,7 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
 
 #if DEBUG
 	/*
-	 * If we're going to use the generic kernel_map_pages()
+	 * If we're going to use the generic debug_pagealloc_map_pages()
 	 * poisoning, then it's going to smash the contents of
 	 * the redzone and userword anyhow, so switch them off.
 	 */
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-25 10:15 ` Mike Rapoport
                     ` (2 preceding siblings ...)
  (?)
@ 2020-10-25 10:15   ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen,
	David Hildenbrand, David Rientjes, Edgecombe, Rick P,
	H. Peter Anvin, Heiko Carstens, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Len Brown, Michael Ellerman, Mike Rapoport,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

From: Mike Rapoport <rppt@linux.ibm.com>

When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled, a page may not
be present in the direct map and has to be explicitly mapped before it can
be copied.

On arm64 it is possible that a page was removed from the direct map using
set_direct_map_invalid_noflush(), but __kernel_map_pages() will refuse to
map this page back if DEBUG_PAGEALLOC is disabled.

Explicitly use set_direct_map_{default,invalid}_noflush() for the
ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for the
DEBUG_PAGEALLOC case.

While at it, rename kernel_map_pages() to hibernate_map_page() and drop the
numpages parameter.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 kernel/power/snapshot.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index fa499466f645..ecb7b32ce77c 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,16 +76,25 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
 static inline void hibernate_restore_unprotect_page(void *page_address) {}
 #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void hibernate_map_page(struct page *page, int enable)
 {
-	__kernel_map_pages(page, numpages, enable);
+	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
+		unsigned long addr = (unsigned long)page_address(page);
+		int ret;
+
+		if (enable)
+			ret = set_direct_map_default_noflush(page);
+		else
+			ret = set_direct_map_invalid_noflush(page);
+
+		if (WARN_ON(ret))
+			return;
+
+		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	} else {
+		debug_pagealloc_map_pages(page, 1, enable);
+	}
 }
-#else
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
-#endif
 
 static int swsusp_page_is_free(struct page *);
 static void swsusp_set_page_forbidden(struct page *);
@@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct page *s_page)
 	if (kernel_page_present(s_page)) {
 		do_copy_page(dst, page_address(s_page));
 	} else {
-		kernel_map_pages(s_page, 1, 1);
+		hibernate_map_page(s_page, 1);
 		do_copy_page(dst, page_address(s_page));
-		kernel_map_pages(s_page, 1, 0);
+		hibernate_map_page(s_page, 0);
 	}
 }
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
@ 2020-10-25 10:15   ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen,
	David Hildenbrand, David Rientjes, Edgecombe, Rick P,
	H. Peter Anvin, Heiko Carstens, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Len Brown, Michael Ellerman, Mike Rapoport,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

From: Mike Rapoport <rppt@linux.ibm.com>

The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never
fail. With this assumption it wouldn't be safe to allow general usage of
this function.

Moreover, some architectures that implement __kernel_map_pages() have this
function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap
pages when page allocation debugging is disabled at runtime.

As all the users of __kernel_map_pages() have been converted to use
debug_pagealloc_map_pages(), it is safe to make __kernel_map_pages()
available only when DEBUG_PAGEALLOC is set.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/Kconfig                     |  3 +++
 arch/arm64/Kconfig               |  4 +---
 arch/arm64/mm/pageattr.c         |  6 ++++--
 arch/powerpc/Kconfig             |  5 +----
 arch/riscv/Kconfig               |  4 +---
 arch/riscv/include/asm/pgtable.h |  2 --
 arch/riscv/mm/pageattr.c         |  2 ++
 arch/s390/Kconfig                |  4 +---
 arch/sparc/Kconfig               |  4 +---
 arch/x86/Kconfig                 |  4 +---
 arch/x86/mm/pat/set_memory.c     |  2 ++
 include/linux/mm.h               | 10 +++++++---
 12 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..56d4752b6db6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,9 @@ config HAVE_STATIC_CALL_INLINE
 	bool
 	depends on HAVE_STATIC_CALL
 
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+	bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 08fa3a1c50f0..1d4da0843668 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -71,6 +71,7 @@ config ARM64
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_USE_SYM_ANNOTATIONS
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_MEMORY_FAILURE
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
 	select ARCH_SUPPORTS_ATOMIC_RMW
@@ -1004,9 +1005,6 @@ config HOLES_IN_ZONE
 
 source "kernel/Kconfig.hz"
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y
 	select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 1b94f5b82654..18613d8834db 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -178,13 +178,15 @@ int set_direct_map_default_noflush(struct page *page)
 				   PAGE_SIZE, change_page_range, &data);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return;
 
 	set_memory_valid((unsigned long)page_address(page), numpages, enable);
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 /*
  * This function is used to determine if a linear map page has been marked as
@@ -204,7 +206,7 @@ bool kernel_page_present(struct page *page)
 	pte_t *ptep;
 	unsigned long addr = (unsigned long)page_address(page);
 
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return true;
 
 	pgdp = pgd_offset_k(addr);
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..ad8a83f3ddca 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -146,6 +146,7 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC32 || PPC_BOOK3S_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF		if PPC64
 	select ARCH_USE_QUEUED_RWLOCKS		if PPC_QUEUED_SPINLOCKS
@@ -355,10 +356,6 @@ config PPC_OF_PLATFORM_PCI
 	depends on PCI
 	depends on PPC64 # not supported on 32 bits yet
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	depends on PPC32 || PPC_BOOK3S_64
-	def_bool y
-
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
 
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d5e7ca08f22c..c704562ba45e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -14,6 +14,7 @@ config RISCV
 	def_bool y
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
@@ -153,9 +154,6 @@ config ARCH_SELECT_MEMORY_MODEL
 config ARCH_WANT_GENERAL_HUGETLB
 	def_bool y
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config SYS_SUPPORTS_HUGETLBFS
 	depends on MMU
 	def_bool y
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 183f1f4b2ae6..41a72861987c 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -461,8 +461,6 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 #define VMALLOC_START		0
 #define VMALLOC_END		TASK_SIZE
 
-static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {}
-
 #endif /* !CONFIG_MMU */
 
 #define kern_addr_valid(addr)   (1) /* FIXME */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 19fecb362d81..321b09d2e2ea 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -184,6 +184,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return ret;
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (!debug_pagealloc_enabled())
@@ -196,3 +197,4 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 		__set_memory((unsigned long)page_address(page), numpages,
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
+#endif
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 34371539a9b9..0a42d457bff4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -35,9 +35,6 @@ config GENERIC_LOCKBREAK
 config PGSTE
 	def_bool y if KVM
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config AUDIT_ARCH
 	def_bool y
 
@@ -106,6 +103,7 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a6ca135442f9..2c729b8d097a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -88,6 +88,7 @@ config SPARC64
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_ARCH_AUDITSYSCALL
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select HAVE_NMI
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -148,9 +149,6 @@ config GENERIC_ISA_DMA
 	bool
 	default y if SPARC32
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y if SPARC64
-
 config PGTABLE_LEVELS
 	default 4 if 64BIT
 	default 3
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..0db3fb1da70c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -91,6 +91,7 @@ config X86
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -329,9 +330,6 @@ config ZONE_DMA32
 config AUDIT_ARCH
 	def_bool y if X86_64
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 40baa90e74f4..7f248fc45317 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2194,6 +2194,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return __set_pages_p(page, 1);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (PageHighMem(page))
@@ -2225,6 +2226,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 
 	arch_flush_lazy_mmu_mode();
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 14e397f3752c..ab0ef6bd351d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
 	return static_branch_unlikely(&_debug_pagealloc_enabled);
 }
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+#ifdef CONFIG_DEBUG_PAGEALLOC
+/*
+ * To support DEBUG_PAGEALLOC architecture must ensure that
+ * __kernel_map_pages() never fails
+ */
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
 static inline void debug_pagealloc_map_pages(struct page *page,
@@ -2937,13 +2941,13 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 #ifdef CONFIG_HIBERNATION
 extern bool kernel_page_present(struct page *page);
 #endif	/* CONFIG_HIBERNATION */
-#else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
 #ifdef CONFIG_HIBERNATION
 static inline bool kernel_page_present(struct page *page) { return true; }
 #endif	/* CONFIG_HIBERNATION */
-#endif	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
 extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 219+ messages in thread
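The practical effect on call sites can be illustrated with a small
self-contained sketch of the include/linux/mm.h pattern after this patch:
__kernel_map_pages() exists only under CONFIG_DEBUG_PAGEALLOC, while
debug_pagealloc_map_pages() stays available as a no-op stub otherwise, so
callers like the page allocator or slab need no #ifdefs of their own. This
is a userspace mock, not the kernel header, and it omits the runtime
debug_pagealloc switch that the real helper also consults.

/*
 * Mock of the CONFIG_DEBUG_PAGEALLOC-only pattern; build with
 * -DCONFIG_DEBUG_PAGEALLOC to exercise the enabled path.
 */
#include <stdio.h>

struct page {
	unsigned long pfn;
};

#ifdef CONFIG_DEBUG_PAGEALLOC
/* the architecture must guarantee that this never fails */
static void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	printf("%smapping %d page(s) at pfn %lu\n",
	       enable ? "" : "un", numpages, page->pfn);
}

static inline void debug_pagealloc_map_pages(struct page *page,
					     int numpages, int enable)
{
	__kernel_map_pages(page, numpages, enable);
}
#else
/* no-op stub: call sites stay unconditional */
static inline void debug_pagealloc_map_pages(struct page *page,
					     int numpages, int enable)
{
}
#endif

int main(void)
{
	struct page p = { .pfn = 42 };

	/* a page_alloc/slab style call site, identical in both configs */
	debug_pagealloc_map_pages(&p, 1, 0);
	debug_pagealloc_map_pages(&p, 1, 1);
	return 0;
}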

* [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
@ 2020-10-25 10:15   ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Benjamin Herrenschmidt,
	Dave Hansen, linux-mm, Paul Mackerras, Pavel Machek,
	H. Peter Anvin, sparclinux, Christoph Lameter, Will Deacon,
	linux-riscv, linux-s390, Michael Ellerman, x86, Mike Rapoport,
	Christian Borntraeger, Ingo Molnar, Catalin Marinas, Len Brown,
	Albert Ou, Vasily Gorbik, linux-pm, Heiko Carstens,
	David Rientjes, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never
fail. With this assumption it wouldn't be safe to allow general usage of
this function.

Moreover, some architectures that implement __kernel_map_pages() have this
function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap
pages when page allocation debugging is disabled at runtime.

As all the users of __kernel_map_pages() have been converted to use
debug_pagealloc_map_pages(), it is safe to make __kernel_map_pages()
available only when DEBUG_PAGEALLOC is set.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/Kconfig                     |  3 +++
 arch/arm64/Kconfig               |  4 +---
 arch/arm64/mm/pageattr.c         |  6 ++++--
 arch/powerpc/Kconfig             |  5 +----
 arch/riscv/Kconfig               |  4 +---
 arch/riscv/include/asm/pgtable.h |  2 --
 arch/riscv/mm/pageattr.c         |  2 ++
 arch/s390/Kconfig                |  4 +---
 arch/sparc/Kconfig               |  4 +---
 arch/x86/Kconfig                 |  4 +---
 arch/x86/mm/pat/set_memory.c     |  2 ++
 include/linux/mm.h               | 10 +++++++---
 12 files changed, 24 insertions(+), 26 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..56d4752b6db6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,9 @@ config HAVE_STATIC_CALL_INLINE
 	bool
 	depends on HAVE_STATIC_CALL
 
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+	bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 08fa3a1c50f0..1d4da0843668 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -71,6 +71,7 @@ config ARM64
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_USE_SYM_ANNOTATIONS
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_MEMORY_FAILURE
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
 	select ARCH_SUPPORTS_ATOMIC_RMW
@@ -1004,9 +1005,6 @@ config HOLES_IN_ZONE
 
 source "kernel/Kconfig.hz"
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config ARCH_SPARSEMEM_ENABLE
 	def_bool y
 	select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 1b94f5b82654..18613d8834db 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -178,13 +178,15 @@ int set_direct_map_default_noflush(struct page *page)
 				   PAGE_SIZE, change_page_range, &data);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return;
 
 	set_memory_valid((unsigned long)page_address(page), numpages, enable);
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 /*
  * This function is used to determine if a linear map page has been marked as
@@ -204,7 +206,7 @@ bool kernel_page_present(struct page *page)
 	pte_t *ptep;
 	unsigned long addr = (unsigned long)page_address(page);
 
-	if (!debug_pagealloc_enabled() && !rodata_full)
+	if (!rodata_full)
 		return true;
 
 	pgdp = pgd_offset_k(addr);
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..ad8a83f3ddca 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -146,6 +146,7 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_OPTIONAL_KERNEL_RWX		if ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC32 || PPC_BOOK3S_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF		if PPC64
 	select ARCH_USE_QUEUED_RWLOCKS		if PPC_QUEUED_SPINLOCKS
@@ -355,10 +356,6 @@ config PPC_OF_PLATFORM_PCI
 	depends on PCI
 	depends on PPC64 # not supported on 32 bits yet
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	depends on PPC32 || PPC_BOOK3S_64
-	def_bool y
-
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
 
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d5e7ca08f22c..c704562ba45e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -14,6 +14,7 @@ config RISCV
 	def_bool y
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
@@ -153,9 +154,6 @@ config ARCH_SELECT_MEMORY_MODEL
 config ARCH_WANT_GENERAL_HUGETLB
 	def_bool y
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config SYS_SUPPORTS_HUGETLBFS
 	depends on MMU
 	def_bool y
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 183f1f4b2ae6..41a72861987c 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -461,8 +461,6 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 #define VMALLOC_START		0
 #define VMALLOC_END		TASK_SIZE
 
-static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {}
-
 #endif /* !CONFIG_MMU */
 
 #define kern_addr_valid(addr)   (1) /* FIXME */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 19fecb362d81..321b09d2e2ea 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -184,6 +184,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return ret;
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (!debug_pagealloc_enabled())
@@ -196,3 +197,4 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 		__set_memory((unsigned long)page_address(page), numpages,
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
+#endif
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 34371539a9b9..0a42d457bff4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -35,9 +35,6 @@ config GENERIC_LOCKBREAK
 config PGSTE
 	def_bool y if KVM
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config AUDIT_ARCH
 	def_bool y
 
@@ -106,6 +103,7 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a6ca135442f9..2c729b8d097a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -88,6 +88,7 @@ config SPARC64
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_ARCH_AUDITSYSCALL
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select HAVE_NMI
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -148,9 +149,6 @@ config GENERIC_ISA_DMA
 	bool
 	default y if SPARC32
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y if SPARC64
-
 config PGTABLE_LEVELS
 	default 4 if 64BIT
 	default 3
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..0db3fb1da70c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -91,6 +91,7 @@ config X86
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_QUEUED_RWLOCKS
@@ -329,9 +330,6 @@ config ZONE_DMA32
 config AUDIT_ARCH
 	def_bool y if X86_64
 
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
-	def_bool y
-
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 40baa90e74f4..7f248fc45317 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2194,6 +2194,7 @@ int set_direct_map_default_noflush(struct page *page)
 	return __set_pages_p(page, 1);
 }
 
+#ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
 	if (PageHighMem(page))
@@ -2225,6 +2226,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 
 	arch_flush_lazy_mmu_mode();
 }
+#endif /* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 14e397f3752c..ab0ef6bd351d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
 	return static_branch_unlikely(&_debug_pagealloc_enabled);
 }
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+#ifdef CONFIG_DEBUG_PAGEALLOC
+/*
+ * To support DEBUG_PAGEALLOC architecture must ensure that
+ * __kernel_map_pages() never fails
+ */
 extern void __kernel_map_pages(struct page *page, int numpages, int enable);
 
 static inline void debug_pagealloc_map_pages(struct page *page,
@@ -2937,13 +2941,13 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 #ifdef CONFIG_HIBERNATION
 extern bool kernel_page_present(struct page *page);
 #endif	/* CONFIG_HIBERNATION */
-#else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
 #ifdef CONFIG_HIBERNATION
 static inline bool kernel_page_present(struct page *page) { return true; }
 #endif	/* CONFIG_HIBERNATION */
-#endif	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
 extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm);
-- 
2.28.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 4/4] arch, mm: make kernel_page_present() always available
  2020-10-25 10:15 ` Mike Rapoport
                     ` (2 preceding siblings ...)
  (?)
@ 2020-10-25 10:15   ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen,
	David Hildenbrand, David Rientjes, Edgecombe, Rick P,
	H. Peter Anvin, Heiko Carstens, Ingo Molnar, Joonsoo Kim,
	Kirill A. Shutemov, Len Brown, Michael Ellerman, Mike Rapoport,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be part of the set_memory API.
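
As an illustration, a sketch of how the hibernation code relies on
kernel_page_present() (this follows the kernel/power/snapshot.c code quoted
elsewhere in this series and is simplified here, not new code from this
patch):

static void safe_copy_page(void *dst, struct page *s_page)
{
	/* Copy directly when the page is present in the direct map... */
	if (kernel_page_present(s_page)) {
		do_copy_page(dst, page_address(s_page));
	} else {
		/* ...otherwise map it temporarily around the copy. */
		hibernate_map_page(s_page, 1);
		do_copy_page(dst, page_address(s_page));
		hibernate_map_page(s_page, 0);
	}
}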

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 4/4] arch, mm: make kernel_page_present() always available
@ 2020-10-25 10:15   ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Benjamin Herrenschmidt,
	Dave Hansen, linux-mm, Paul Mackerras, Pavel Machek,
	H. Peter Anvin, sparclinux, Christoph Lameter, Will Deacon,
	linux-riscv, linux-s390, Michael Ellerman, x86, Mike Rapoport,
	Christian Borntraeger, Ingo Molnar, Catalin Marinas, Len Brown,
	Albert Ou, Vasily Gorbik, linux-pm, Heiko Carstens,
	David Rientjes, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be part of the set_memory API.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0

^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 4/4] arch, mm: make kernel_page_present() always available
@ 2020-10-25 10:15   ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Benjamin Herrenschmidt,
	Dave Hansen, linux-mm, Paul Mackerras, Pavel Machek,
	H. Peter Anvin, sparclinux, Christoph Lameter, Will Deacon,
	linux-riscv, linux-s390, Michael Ellerman, x86, Mike Rapoport,
	Christian Borntraeger, Ingo Molnar, Catalin Marinas, Len Brown,
	Albert Ou, Vasily Gorbik, linux-pm, Heiko Carstens,
	David Rientjes, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be part of the set_memory API.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 4/4] arch, mm: make kernel_page_present() always available
@ 2020-10-25 10:15   ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
	Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
	Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be part of the set_memory API.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 219+ messages in thread

* [PATCH 4/4] arch, mm: make kernel_page_present() always available
@ 2020-10-25 10:15   ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Peter Zijlstra, Benjamin Herrenschmidt,
	Dave Hansen, linux-mm, Paul Mackerras, Pavel Machek,
	H. Peter Anvin, sparclinux, Christoph Lameter, Will Deacon,
	linux-riscv, linux-s390, Michael Ellerman, x86, Mike Rapoport,
	Christian Borntraeger, Ingo Molnar, Catalin Marinas, Len Brown,
	Albert Ou, Vasily Gorbik, linux-pm, Heiko Carstens,
	David Rientjes, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

For architectures that enable ARCH_HAS_SET_MEMORY, having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.

Add a RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be part of the set_memory API.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/cacheflush.h |  1 +
 arch/riscv/include/asm/set_memory.h |  1 +
 arch/riscv/mm/pageattr.c            | 29 +++++++++++++++++++++++++++++
 arch/x86/include/asm/set_memory.h   |  1 +
 arch/x86/mm/pat/set_memory.c        |  2 --
 include/linux/mm.h                  |  7 -------
 include/linux/set_memory.h          |  5 +++++
 7 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #include <asm-generic/cacheflush.h>
 
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 			     __pgprot(0), __pgprot(_PAGE_PRESENT));
 }
 #endif
+
+bool kernel_page_present(struct page *page)
+{
+	unsigned long addr = (unsigned long)page_address(page);
+	pgd_t *pgd;
+	pud_t *pud;
+	p4d_t *p4d;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	pgd = pgd_offset_k(addr);
+	if (!pgd_present(*pgd))
+		return false;
+
+	p4d = p4d_offset(pgd, addr);
+	if (!p4d_present(*p4d))
+		return false;
+
+	pud = pud_offset(p4d, addr);
+	if (!pud_present(*pud))
+		return false;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		return false;
+
+	pte = pte_offset_kernel(pmd, addr);
+	return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
 
 int set_direct_map_invalid_noflush(struct page *page);
 int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
 
 extern int kernel_set_to_readonly;
 
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
-#ifdef CONFIG_HIBERNATION
 bool kernel_page_present(struct page *page)
 {
 	unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
 	pte = lookup_address((unsigned long)page_address(page), &level);
 	return (pte_val(*pte) & _PAGE_PRESENT);
 }
-#endif /* CONFIG_HIBERNATION */
 
 int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
 				   unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
 	if (debug_pagealloc_enabled_static())
 		__kernel_map_pages(page, numpages, enable);
 }
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif	/* CONFIG_HIBERNATION */
 #else	/* CONFIG_DEBUG_PAGEALLOC */
 static inline void debug_pagealloc_map_pages(struct page *page,
 					     int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif	/* CONFIG_HIBERNATION */
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
 
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
 {
 	return 0;
 }
+
+static inline bool kernel_page_present(struct page *page)
+{
+	return true;
+}
 #endif
 
 #ifndef set_mce_nospec
-- 
2.28.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-25 10:15   ` Mike Rapoport
                       ` (3 preceding siblings ...)
  (?)
@ 2020-10-26  0:38     ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  0:38 UTC (permalink / raw)
  To: rppt, akpm
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> be
> not present in the direct map and has to be explicitly mapped before
> it
> could be copied.
> 
> On arm64 it is possible that a page would be removed from the direct
> map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.

It looks to me that arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true, how does this happen?

> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
> 
> While on that, rename kernel_map_pages() to hibernate_map_page() and
> drop
> numpages parameter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void
> *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;
> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);
> +	}
>  }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>  
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> page *s_page)
>  	if (kernel_page_present(s_page)) {
>  		do_copy_page(dst, page_address(s_page));
>  	} else {
> -		kernel_map_pages(s_page, 1, 1);
> +		hibernate_map_page(s_page, 1);
>  		do_copy_page(dst, page_address(s_page));
> -		kernel_map_pages(s_page, 1, 0);
> +		hibernate_map_page(s_page, 0);
>  	}
>  }
>  

If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARN's inside of __kernel_map_pages() if they fail and then
have a warning for the debug page alloc cases as well? Since logic
around both expects them not to fail.
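
A rough sketch of that alternative, purely for illustration (hypothetical
code, not something posted in this series): warn inside the arch
implementation at the point where the direct map update fails, so every
caller benefits:

void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	int i;

	for (i = 0; i < numpages; i++) {
		int ret = enable ?
			set_direct_map_default_noflush(page + i) :
			set_direct_map_invalid_noflush(page + i);

		/* Hypothetical: warn where the failure actually happens. */
		WARN_ON(ret);
	}
}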



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-26  0:38     ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  0:38 UTC (permalink / raw)
  To: rppt, akpm
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> be
> not present in the direct map and has to be explicitly mapped before
> it
> could be copied.
> 
> On arm64 it is possible that a page would be removed from the direct
> map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.

It looks to me that arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true, how does this happen?

> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
> 
> While on that, rename kernel_map_pages() to hibernate_map_page() and
> drop
> numpages parameter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void
> *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;
> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);
> +	}
>  }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>  
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> page *s_page)
>  	if (kernel_page_present(s_page)) {
>  		do_copy_page(dst, page_address(s_page));
>  	} else {
> -		kernel_map_pages(s_page, 1, 1);
> +		hibernate_map_page(s_page, 1);
>  		do_copy_page(dst, page_address(s_page));
> -		kernel_map_pages(s_page, 1, 0);
> +		hibernate_map_page(s_page, 0);
>  	}
>  }
>  

If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARN's inside of __kernel_map_pages() if they fail and then
have a warning for the debug page alloc cases as well? Since logic
around both expects them not to fail.



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-26  0:38     ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  0:38 UTC (permalink / raw)
  To: rppt, akpm
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, linux-arm-kernel, rjw, linux-kernel, penberg, palmer,
	iamjoonsoo.kim, linuxppc-dev, davem

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> be
> not present in the direct map and has to be explicitly mapped before
> it
> could be copied.
> 
> On arm64 it is possible that a page would be removed from the direct
> map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.

It looks to me that arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true, how does this happen?

> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
> 
> While on that, rename kernel_map_pages() to hibernate_map_page() and
> drop
> numpages parameter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void
> *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;
> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);
> +	}
>  }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>  
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> page *s_page)
>  	if (kernel_page_present(s_page)) {
>  		do_copy_page(dst, page_address(s_page));
>  	} else {
> -		kernel_map_pages(s_page, 1, 1);
> +		hibernate_map_page(s_page, 1);
>  		do_copy_page(dst, page_address(s_page));
> -		kernel_map_pages(s_page, 1, 0);
> +		hibernate_map_page(s_page, 0);
>  	}
>  }
>  

If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARN's inside of __kernel_map_pages() if they fail and then
have a warning for the debug page alloc cases as well? Since logic
around both expects them not to fail.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-26  0:38     ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  0:38 UTC (permalink / raw)
  To: rppt, akpm
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, linux-arm-kernel, rjw, linux-kernel, penberg, palmer,
	iamjoonsoo.kim, linuxppc-dev, davem

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> be
> not present in the direct map and has to be explicitly mapped before
> it
> could be copied.
> 
> On arm64 it is possible that a page would be removed from the direct
> map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.

It looks to me that arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true, how does this happen?

> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
> 
> While on that, rename kernel_map_pages() to hibernate_map_page() and
> drop
> numpages parameter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void
> *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +		unsigned long addr = (unsigned long)page_address(page);
> +		int ret;
> +
> +		if (enable)
> +			ret = set_direct_map_default_noflush(page);
> +		else
> +			ret = set_direct_map_invalid_noflush(page);
> +
> +		if (WARN_ON(ret))
> +			return;
> +
> +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	} else {
> +		debug_pagealloc_map_pages(page, 1, enable);
> +	}
>  }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>  
>  static int swsusp_page_is_free(struct page *);
>  static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> page *s_page)
>  	if (kernel_page_present(s_page)) {
>  		do_copy_page(dst, page_address(s_page));
>  	} else {
> -		kernel_map_pages(s_page, 1, 1);
> +		hibernate_map_page(s_page, 1);
>  		do_copy_page(dst, page_address(s_page));
> -		kernel_map_pages(s_page, 1, 0);
> +		hibernate_map_page(s_page, 0);
>  	}
>  }
>  

If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARN's inside of __kernel_map_pages() if they fail and then
have a warning for the debug page alloc cases as well? Since logic
around both expects them not to fail.


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 4/4] arch, mm: make kernel_page_present() always available
  2020-10-25 10:15   ` Mike Rapoport
                       ` (3 preceding siblings ...)
  (?)
@ 2020-10-26  0:54     ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  0:54 UTC (permalink / raw)
  To: rppt, akpm
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> index 7f248fc45317..16f878c26667 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int
> numpages, int enable)
>  }
>  #endif /* CONFIG_DEBUG_PAGEALLOC */
>  
> -#ifdef CONFIG_HIBERNATION
>  bool kernel_page_present(struct page *page)
>  {
>         unsigned int level;
> @@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
>         pte = lookup_address((unsigned long)page_address(page),
> &level);
>         return (pte_val(*pte) & _PAGE_PRESENT);
>  }
> -#endif /* CONFIG_HIBERNATION */

This is only used by hibernate today, right? It makes sense that it
should return a correct answer if someone starts to use it without
looking too closely at the header. But could we just remove the default
static inline return-true implementation and let the linker fail if
someone starts to use it outside hibernate? Then we could leave it
compiled out until then.

Also it looks like riscv does not have ARCH_HIBERNATION_POSSIBLE so the
new function added here couldn't be used yet. You could also just let
the linker catch it if riscv ever enables hibernate?
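
To make the first suggestion concrete, roughly (a sketch of the assumed
header shape, not the actual patch):

/*
 * Keep only the declaration and drop the "static inline ... { return
 * true; }" stub, so a new user on a configuration without an
 * architecture implementation fails at link time instead of silently
 * getting "true".
 */
bool kernel_page_present(struct page *page);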

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-25 10:15 ` Mike Rapoport
                     ` (3 preceding siblings ...)
  (?)
@ 2020-10-26  1:13   ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26  1:13 UTC (permalink / raw)
  To: rppt, akpm
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP
> it is
> possible that __kernel_map_pages() would fail, but since this
> function is
> void, the failure will go unnoticed.

Could you elaborate on how this could happen? Do you mean during
runtime today or if something new was introduced?


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-26  1:13   ` Edgecombe, Rick P
                       ` (3 preceding siblings ...)
  (?)
@ 2020-10-26  9:05     ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26  9:05 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: akpm, david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > it is
> > possible that __kernel_map_pages() would fail, but since this
> > function is
> > void, the failure will go unnoticed.
> 
> Could you elaborate on how this could happen? Do you mean during
> runtime today or if something new was introduced?

A failure in __kernel_map_pages() may happen today, for instance on x86
when the kernel is built with DEBUG_PAGEALLOC.

	__kernel_map_pages(page, 1, 0);

will need to split, say, a 2M page, and during the split the allocation
of a new page table page could fail.
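
In a simplified form (an illustration only, not the actual x86 code
path), the split has to allocate a fresh page table page, and that
allocation can fail:

	/* sketch: a large-page split needs a new PTE page */
	static int split_large_page_sketch(void)
	{
		pte_t *ptes = (pte_t *)get_zeroed_page(GFP_ATOMIC);

		if (!ptes)
			return -ENOMEM; /* lost: __kernel_map_pages() is void */

		/* copy the 2M mapping into 512 4K entries and install them */
		return 0;
	}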

Currently, the only user of __kernel_map_pages() outside DEBUG_PAGEALLOC
is hibernation, but I think it would be safer to entirely prevent usage
of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-26  0:38     ` Edgecombe, Rick P
                         ` (3 preceding siblings ...)
  (?)
@ 2020-10-26  9:15       ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26  9:15 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: akpm, david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P wrote:
> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> > be
> > not present in the direct map and has to be explicitly mapped before
> > it
> > could be copied.
> > 
> > On arm64 it is possible that a page would be removed from the direct
> > map
> > using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> > refuse
> > to map this page back if DEBUG_PAGEALLOC is disabled.
> 
> It looks to me that arm64 __kernel_map_pages() will still attempt to
> map it if rodata_full is true, how does this happen?

Unless I misread the code, arm64 requires both rodata_full and
debug_pagealloc_enabled() to be true for __kernel_map_pages() to do
anything.
But the rodata_full condition applies to set_direct_map_*_noflush() as
well, so with !rodata_full the linear map won't ever be changed.

> > Explicitly use set_direct_map_{default,invalid}_noflush() for
> > ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> > DEBUG_PAGEALLOC case.
> > 
> > While on that, rename kernel_map_pages() to hibernate_map_page() and
> > drop
> > numpages parameter.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
> >  1 file changed, 19 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index fa499466f645..ecb7b32ce77c 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -76,16 +76,25 @@ static inline void
> > hibernate_restore_protect_page(void *page_address) {}
> >  static inline void hibernate_restore_unprotect_page(void
> > *page_address) {}
> >  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> >  
> > -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable)
> > +static inline void hibernate_map_page(struct page *page, int enable)
> >  {
> > -	__kernel_map_pages(page, numpages, enable);
> > +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> > +		unsigned long addr = (unsigned long)page_address(page);
> > +		int ret;
> > +
> > +		if (enable)
> > +			ret = set_direct_map_default_noflush(page);
> > +		else
> > +			ret = set_direct_map_invalid_noflush(page);
> > +
> > +		if (WARN_ON(ret))
> > +			return;
> > +
> > +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +	} else {
> > +		debug_pagealloc_map_pages(page, 1, enable);
> > +	}
> >  }
> > -#else
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable) {}
> > -#endif
> >  
> >  static int swsusp_page_is_free(struct page *);
> >  static void swsusp_set_page_forbidden(struct page *);
> > @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> > page *s_page)
> >  	if (kernel_page_present(s_page)) {
> >  		do_copy_page(dst, page_address(s_page));
> >  	} else {
> > -		kernel_map_pages(s_page, 1, 1);
> > +		hibernate_map_page(s_page, 1);
> >  		do_copy_page(dst, page_address(s_page));
> > -		kernel_map_pages(s_page, 1, 0);
> > +		hibernate_map_page(s_page, 0);
> >  	}
> >  }
> >  
> 
> If somehow a page was unmapped such that
> set_direct_map_default_noflush() would fail, then this code introduces
> a WARN, but it will still try to read the unmapped page. Why not just
> have the WARNs inside __kernel_map_pages() when it fails, and add a
> warning for the debug page alloc case as well, since the logic around
> both expects these calls not to fail?

The intention of this series is to disallow usage of
__kernel_map_pages() when DEBUG_PAGEALLOC=n. I'll update this patch to
better handle possible errors, but I still want to keep WARN in the
caller.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-26  9:15       ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26  9:15 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, akpm, linux-arm-kernel, rjw, linux-kernel, penberg, palmer,
	iamjoonsoo.kim, linuxppc-dev, davem

On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P wrote:
> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> > be
> > not present in the direct map and has to be explicitly mapped before
> > it
> > could be copied.
> > 
> > On arm64 it is possible that a page would be removed from the direct
> > map
> > using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> > refuse
> > to map this page back if DEBUG_PAGEALLOC is disabled.
> 
> It looks to me that arm64 __kernel_map_pages() will still attempt to
> map it if rodata_full is true, how does this happen?

Unless I misread the code, arm64 requires both rodata_full and
debug_pagealloc_enabled() to be true for __kernel_map_pages() to do
anything.
But rodata_full condition applies to set_direct_map_*_noflush() as well,
so with !rodata_full the linear map won't be ever changed.

> > Explicitly use set_direct_map_{default,invalid}_noflush() for
> > ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> > DEBUG_PAGEALLOC case.
> > 
> > While on that, rename kernel_map_pages() to hibernate_map_page() and
> > drop
> > numpages parameter.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >  kernel/power/snapshot.c | 29 +++++++++++++++++++----------
> >  1 file changed, 19 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index fa499466f645..ecb7b32ce77c 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -76,16 +76,25 @@ static inline void
> > hibernate_restore_protect_page(void *page_address) {}
> >  static inline void hibernate_restore_unprotect_page(void
> > *page_address) {}
> >  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> >  
> > -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable)
> > +static inline void hibernate_map_page(struct page *page, int enable)
> >  {
> > -	__kernel_map_pages(page, numpages, enable);
> > +	if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> > +		unsigned long addr = (unsigned long)page_address(page);
> > +		int ret;
> > +
> > +		if (enable)
> > +			ret = set_direct_map_default_noflush(page);
> > +		else
> > +			ret = set_direct_map_invalid_noflush(page);
> > +
> > +		if (WARN_ON(ret))
> > +			return;
> > +
> > +		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +	} else {
> > +		debug_pagealloc_map_pages(page, 1, enable);
> > +	}
> >  }
> > -#else
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable) {}
> > -#endif
> >  
> >  static int swsusp_page_is_free(struct page *);
> >  static void swsusp_set_page_forbidden(struct page *);
> > @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> > page *s_page)
> >  	if (kernel_page_present(s_page)) {
> >  		do_copy_page(dst, page_address(s_page));
> >  	} else {
> > -		kernel_map_pages(s_page, 1, 1);
> > +		hibernate_map_page(s_page, 1);
> >  		do_copy_page(dst, page_address(s_page));
> > -		kernel_map_pages(s_page, 1, 0);
> > +		hibernate_map_page(s_page, 0);
> >  	}
> >  }
> >  
> 
> If somehow a page was unmapped such that
> set_direct_map_default_noflush() would fail, then this code introduces
> a WARN, but it will still try to read the unmapped page. Why not just
> have the WARN's inside of __kernel_map_pages() if they fail and then
> have a warning for the debug page alloc cases as well? Since logic
> around both expects them not to fail.

The intention of this series is to disallow usage of
__kernel_map_pages() when DEBUG_PAGEALLOC=n. I'll update this patch to
better handle possible errors, but I still want to keep WARN in the
caller.

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 4/4] arch, mm: make kernel_page_present() always available
  2020-10-26  0:54     ` Edgecombe, Rick P
@ 2020-10-26  9:31       ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26  9:31 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: akpm, david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, will, penberg, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe, luto,
	davem, linux-mm, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, Oct 26, 2020 at 12:54:01AM +0000, Edgecombe, Rick P wrote:
> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > index 7f248fc45317..16f878c26667 100644
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
> >  }
> >  #endif /* CONFIG_DEBUG_PAGEALLOC */
> >  
> > -#ifdef CONFIG_HIBERNATION
> >  bool kernel_page_present(struct page *page)
> >  {
> >         unsigned int level;
> > @@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
> >         pte = lookup_address((unsigned long)page_address(page), &level);
> >         return (pte_val(*pte) & _PAGE_PRESENT);
> >  }
> > -#endif /* CONFIG_HIBERNATION */
> 
> This is only used by hibernate today right? Makes sense that it should
> return a correct answer if someone starts to use it without looking too
> closely at the header. But could we just remove the default static
> inline return true implementation and let the linker fail if someone
> starts to use it outside hibernate? Then we could leave it compiled out
> until then.

Hmm, I'm not sure I follow you here.
We'd need some stub for architectures that have
ARCH_HIBERNATION_POSSIBLE and do not have
CONFIG_ARCH_HAS_SET_DIRECT_MAP.

I don't see how the kernel would compile for ppc or sparc with
hibernation enabled if I remove the default implementation.
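
Put differently, the fallback only has to report every page as present
when the architecture gives no way to query (or change) the direct map.
A sketch of such a stub (the guard and header placement here are
illustrative, not necessarily what the patch does):

#ifndef CONFIG_ARCH_HAS_SET_DIRECT_MAP
/*
 * Without a way to inspect the direct map, assume the page is mapped.
 */
static inline bool kernel_page_present(struct page *page)
{
	return true;
}
#endif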

> Also it looks like riscv does not have ARCH_HIBERNATION_POSSIBLE so the
> new function added here couldn't be used yet. You could also just let
> the linker catch it if riscv ever enables hibernate?

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
  2020-10-25 10:15   ` Mike Rapoport
@ 2020-10-26 11:05     ` David Hildenbrand
  -1 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:05 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen, David Rientjes,
	Edgecombe, Rick P, H. Peter Anvin, Heiko Carstens, Ingo Molnar,
	Joonsoo Kim, Kirill A. Shutemov, Len Brown, Michael Ellerman,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> kernel direct mapping after free_pages(). The pages then need to be
> mapped back before they can be used. These mapping operations use
> __kernel_map_pages() guarded with debug_pagealloc_enabled().
> 
> The only place that calls __kernel_map_pages() without checking
> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> is set. Still, on arm64, __kernel_map_pages() will bail out when
> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> may render some pages not present in the direct map and hibernation
> code won't be able to save such pages.
> 
> To make page allocation debugging and hibernation interaction more
> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> has to be made more explicit.
> 
> Start with combining the guard condition and the call to 
> __kernel_map_pages() into a single debug_pagealloc_map_pages()
> function to emphasize that __kernel_map_pages() should not be called
> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> when page allocation debug is enabled.
> 
> As the only remaining user of kernel_map_pages() is the hibernation
> code, mode that function into kernel/power/snapshot.c closer to a
> caller.

s/mode/move/

> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  include/linux/mm.h      | 16 +++++++---------
>  kernel/power/snapshot.c | 11 +++++++++++
>  mm/memory_hotplug.c     |  3 +--
>  mm/page_alloc.c         |  6 ++----
>  mm/slab.c               |  8 +++-----
>  5 files changed, 24 insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ef360fe70aaf..14e397f3752c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
>  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
>  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
>  
> -/*
> - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> - */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void debug_pagealloc_map_pages(struct page *page,
> +					     int numpages, int enable)
>  {
> -	__kernel_map_pages(page, numpages, enable);
> +	if (debug_pagealloc_enabled_static())
> +		__kernel_map_pages(page, numpages, enable);
>  }
> +
>  #ifdef CONFIG_HIBERNATION
>  extern bool kernel_page_present(struct page *page);
>  #endif	/* CONFIG_HIBERNATION */
>  #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> +static inline void debug_pagealloc_map_pages(struct page *page,
> +					     int numpages, int enable) {}
>  #ifdef CONFIG_HIBERNATION
>  static inline bool kernel_page_present(struct page *page) { return true; }
>  #endif	/* CONFIG_HIBERNATION */
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 46b1804c1ddf..fa499466f645 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
>  static inline void hibernate_restore_unprotect_page(void *page_address) {}
>  #endif /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>  
> +#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> +static inline void
> +kernel_map_pages(struct page *page, int numpages, int enable)
> +{
> +	__kernel_map_pages(page, numpages, enable);
> +}
> +#else
> +static inline void
> +kernel_map_pages(struct page *page, int numpages, int enable) {}
> +#endif
> +

That change should go into a separate patch.


For the debug_pagealloc_map_pages() parts

Reviewed-by: David Hildenbrand <david@redhat.com>
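
For context, at the call sites the conversion then boils down to folding
the open-coded guard into the helper, roughly (illustrative, not a
verbatim hunk from the patch; the "1 << order" argument is just an
example):

	/* before: every call site repeats the guard */
	if (debug_pagealloc_enabled_static())
		kernel_map_pages(page, 1 << order, 0);

	/* after: the guard lives inside the helper */
	debug_pagealloc_map_pages(page, 1 << order, 0);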


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
@ 2020-10-26 11:05     ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:05 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Peter Zijlstra, Benjamin Herrenschmidt, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390,
	Michael Ellerman, x86, Mike Rapoport, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Len Brown, Albert Ou,
	Vasily Gorbik, linux-pm, Heiko Carstens, David Rientjes,
	Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> kernel direct mapping after free_pages(). The pages than need to be
> mapped back before they could be used. Theese mapping operations use 
> __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> 
> The only place that calls __kernel_map_pages() without checking
> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> is set. Still, on arm64, __kernel_map_pages() will bail out when
> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> may render some pages not present in the direct map and hibernation
> code won't be able to save such pages.
> 
> To make page allocation debugging and hibernation interaction more
> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> has to be made more explicit.
> 
> Start with combining the guard condition and the call to 
> __kernel_map_pages() into a single debug_pagealloc_map_pages()
> function to emphasize that __kernel_map_pages() should not be called
> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> when page allocation debug is enabled.
> 
> As the only remaining user of kernel_map_pages() is the hibernation
> code, mode that function into kernel/power/snapshot.c closer to a
> caller.

s/mode/move/

> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- 
> include/linux/mm.h      | 16 +++++++--------- kernel/power/snapshot.c
> | 11 +++++++++++ mm/memory_hotplug.c     |  3 +-- mm/page_alloc.c
> |  6 ++---- mm/slab.c               |  8 +++----- 5 files changed, 24
> insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h index
> ef360fe70aaf..14e397f3752c 100644 --- a/include/linux/mm.h +++
> b/include/linux/mm.h @@ -2927,21 +2927,19 @@ static inline bool
> debug_pagealloc_enabled_static(void) #if
> defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void
> __kernel_map_pages(struct page *page, int numpages, int enable);
> 
> -/* - * When called in DEBUG_PAGEALLOC context, the call should most
> likely be - * guarded by debug_pagealloc_enabled() or
> debug_pagealloc_enabled_static() - */ -static inline void 
> -kernel_map_pages(struct page *page, int numpages, int enable) 
> +static inline void debug_pagealloc_map_pages(struct page *page, +
> int numpages, int enable) { -	__kernel_map_pages(page, numpages,
> enable); +	if (debug_pagealloc_enabled_static()) +
> __kernel_map_pages(page, numpages, enable); } + #ifdef
> CONFIG_HIBERNATION extern bool kernel_page_present(struct page
> *page); #endif	/* CONFIG_HIBERNATION */ #else	/*
> CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */ -static
> inline void -kernel_map_pages(struct page *page, int numpages, int
> enable) {} +static inline void debug_pagealloc_map_pages(struct page
> *page, +					     int numpages, int enable) {} #ifdef
> CONFIG_HIBERNATION static inline bool kernel_page_present(struct page
> *page) { return true; } #endif	/* CONFIG_HIBERNATION */ diff --git
> a/kernel/power/snapshot.c b/kernel/power/snapshot.c index
> 46b1804c1ddf..fa499466f645 100644 --- a/kernel/power/snapshot.c +++
> b/kernel/power/snapshot.c @@ -76,6 +76,17 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {} static inline
> void hibernate_restore_unprotect_page(void *page_address) {} #endif
> /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> 
> +#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) +static inline void 
> +kernel_map_pages(struct page *page, int numpages, int enable) +{ +
> __kernel_map_pages(page, numpages, enable); +} +#else +static inline
> void +kernel_map_pages(struct page *page, int numpages, int enable)
> {} +#endif +

That change should go into a separate patch.


For the debug_pagealloc_map_pages() parts

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
@ 2020-10-26 11:05     ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:05 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Peter Zijlstra, Benjamin Herrenschmidt, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390,
	Michael Ellerman, x86, Mike Rapoport, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Len Brown, Albert Ou,
	Vasily Gorbik, linux-pm, Heiko Carstens, David Rientjes,
	Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> kernel direct mapping after free_pages(). The pages than need to be
> mapped back before they could be used. Theese mapping operations use 
> __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> 
> The only place that calls __kernel_map_pages() without checking
> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> is set. Still, on arm64, __kernel_map_pages() will bail out when
> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> may render some pages not present in the direct map and hibernation
> code won't be able to save such pages.
> 
> To make page allocation debugging and hibernation interaction more
> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> has to be made more explicit.
> 
> Start with combining the guard condition and the call to 
> __kernel_map_pages() into a single debug_pagealloc_map_pages()
> function to emphasize that __kernel_map_pages() should not be called
> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> when page allocation debug is enabled.
> 
> As the only remaining user of kernel_map_pages() is the hibernation
> code, mode that function into kernel/power/snapshot.c closer to a
> caller.

s/mode/move/

> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- 
> include/linux/mm.h      | 16 +++++++--------- kernel/power/snapshot.c
> | 11 +++++++++++ mm/memory_hotplug.c     |  3 +-- mm/page_alloc.c
> |  6 ++---- mm/slab.c               |  8 +++----- 5 files changed, 24
> insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h index
> ef360fe70aaf..14e397f3752c 100644 --- a/include/linux/mm.h +++
> b/include/linux/mm.h @@ -2927,21 +2927,19 @@ static inline bool
> debug_pagealloc_enabled_static(void) #if
> defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void
> __kernel_map_pages(struct page *page, int numpages, int enable);
> 
> -/* - * When called in DEBUG_PAGEALLOC context, the call should most
> likely be - * guarded by debug_pagealloc_enabled() or
> debug_pagealloc_enabled_static() - */ -static inline void 
> -kernel_map_pages(struct page *page, int numpages, int enable) 
> +static inline void debug_pagealloc_map_pages(struct page *page, +
> int numpages, int enable) { -	__kernel_map_pages(page, numpages,
> enable); +	if (debug_pagealloc_enabled_static()) +
> __kernel_map_pages(page, numpages, enable); } + #ifdef
> CONFIG_HIBERNATION extern bool kernel_page_present(struct page
> *page); #endif	/* CONFIG_HIBERNATION */ #else	/*
> CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */ -static
> inline void -kernel_map_pages(struct page *page, int numpages, int
> enable) {} +static inline void debug_pagealloc_map_pages(struct page
> *page, +					     int numpages, int enable) {} #ifdef
> CONFIG_HIBERNATION static inline bool kernel_page_present(struct page
> *page) { return true; } #endif	/* CONFIG_HIBERNATION */ diff --git
> a/kernel/power/snapshot.c b/kernel/power/snapshot.c index
> 46b1804c1ddf..fa499466f645 100644 --- a/kernel/power/snapshot.c +++
> b/kernel/power/snapshot.c @@ -76,6 +76,17 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {} static inline
> void hibernate_restore_unprotect_page(void *page_address) {} #endif
> /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> 
> +#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) +static inline void 
> +kernel_map_pages(struct page *page, int numpages, int enable) +{ +
> __kernel_map_pages(page, numpages, enable); +} +#else +static inline
> void +kernel_map_pages(struct page *page, int numpages, int enable)
> {} +#endif +

That change should go into a separate patch.


For the debug_pagealloc_map_pages() parts

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
@ 2020-10-26 11:05     ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:05 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Peter Zijlstra, Dave Hansen, linux-mm, Paul Mackerras,
	Pavel Machek, H. Peter Anvin, sparclinux, Christoph Lameter,
	Will Deacon, linux-riscv, linux-s390, x86, Mike Rapoport,
	Christian Borntraeger, Ingo Molnar, Catalin Marinas, Len Brown,
	Albert Ou, Vasily Gorbik, linux-pm, Heiko Carstens,
	David Rientjes, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> kernel direct mapping after free_pages(). The pages than need to be
> mapped back before they could be used. Theese mapping operations use 
> __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> 
> The only place that calls __kernel_map_pages() without checking
> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> is set. Still, on arm64, __kernel_map_pages() will bail out when
> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> may render some pages not present in the direct map and hibernation
> code won't be able to save such pages.
> 
> To make page allocation debugging and hibernation interaction more
> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> has to be made more explicit.
> 
> Start with combining the guard condition and the call to 
> __kernel_map_pages() into a single debug_pagealloc_map_pages()
> function to emphasize that __kernel_map_pages() should not be called
> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> when page allocation debug is enabled.
> 
> As the only remaining user of kernel_map_pages() is the hibernation
> code, mode that function into kernel/power/snapshot.c closer to a
> caller.

s/mode/move/

> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- 
> include/linux/mm.h      | 16 +++++++--------- kernel/power/snapshot.c
> | 11 +++++++++++ mm/memory_hotplug.c     |  3 +-- mm/page_alloc.c
> |  6 ++---- mm/slab.c               |  8 +++----- 5 files changed, 24
> insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h index
> ef360fe70aaf..14e397f3752c 100644 --- a/include/linux/mm.h +++
> b/include/linux/mm.h @@ -2927,21 +2927,19 @@ static inline bool
> debug_pagealloc_enabled_static(void) #if
> defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void
> __kernel_map_pages(struct page *page, int numpages, int enable);
> 
> -/* - * When called in DEBUG_PAGEALLOC context, the call should most
> likely be - * guarded by debug_pagealloc_enabled() or
> debug_pagealloc_enabled_static() - */ -static inline void 
> -kernel_map_pages(struct page *page, int numpages, int enable) 
> +static inline void debug_pagealloc_map_pages(struct page *page, +
> int numpages, int enable) { -	__kernel_map_pages(page, numpages,
> enable); +	if (debug_pagealloc_enabled_static()) +
> __kernel_map_pages(page, numpages, enable); } + #ifdef
> CONFIG_HIBERNATION extern bool kernel_page_present(struct page
> *page); #endif	/* CONFIG_HIBERNATION */ #else	/*
> CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */ -static
> inline void -kernel_map_pages(struct page *page, int numpages, int
> enable) {} +static inline void debug_pagealloc_map_pages(struct page
> *page, +					     int numpages, int enable) {} #ifdef
> CONFIG_HIBERNATION static inline bool kernel_page_present(struct page
> *page) { return true; } #endif	/* CONFIG_HIBERNATION */ diff --git
> a/kernel/power/snapshot.c b/kernel/power/snapshot.c index
> 46b1804c1ddf..fa499466f645 100644 --- a/kernel/power/snapshot.c +++
> b/kernel/power/snapshot.c @@ -76,6 +76,17 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {} static inline
> void hibernate_restore_unprotect_page(void *page_address) {} #endif
> /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> 
> +#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) +static inline void 
> +kernel_map_pages(struct page *page, int numpages, int enable) +{ +
> __kernel_map_pages(page, numpages, enable); +} +#else +static inline
> void +kernel_map_pages(struct page *page, int numpages, int enable)
> {} +#endif +

That change should go into a separate patch.


For the debug_pagealloc_map_pages() parts

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
@ 2020-10-26 11:05     ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:05 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Peter Zijlstra, Benjamin Herrenschmidt, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390,
	Michael Ellerman, x86, Mike Rapoport, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Len Brown, Albert Ou,
	Vasily Gorbik, linux-pm, Heiko Carstens, David Rientjes,
	Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, linux-arm-kernel,
	Rafael J. Wysocki, linux-kernel, Pekka Enberg, Palmer Dabbelt,
	Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev, David S. Miller

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> kernel direct mapping after free_pages(). The pages than need to be
> mapped back before they could be used. Theese mapping operations use 
> __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> 
> The only place that calls __kernel_map_pages() without checking
> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> is set. Still, on arm64, __kernel_map_pages() will bail out when
> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> may render some pages not present in the direct map and hibernation
> code won't be able to save such pages.
> 
> To make page allocation debugging and hibernation interaction more
> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> has to be made more explicit.
> 
> Start with combining the guard condition and the call to 
> __kernel_map_pages() into a single debug_pagealloc_map_pages()
> function to emphasize that __kernel_map_pages() should not be called
> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> when page allocation debug is enabled.
> 
> As the only remaining user of kernel_map_pages() is the hibernation
> code, mode that function into kernel/power/snapshot.c closer to a
> caller.

s/mode/move/

> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- 
> include/linux/mm.h      | 16 +++++++--------- kernel/power/snapshot.c
> | 11 +++++++++++ mm/memory_hotplug.c     |  3 +-- mm/page_alloc.c
> |  6 ++---- mm/slab.c               |  8 +++----- 5 files changed, 24
> insertions(+), 20 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h index
> ef360fe70aaf..14e397f3752c 100644 --- a/include/linux/mm.h +++
> b/include/linux/mm.h @@ -2927,21 +2927,19 @@ static inline bool
> debug_pagealloc_enabled_static(void) #if
> defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void
> __kernel_map_pages(struct page *page, int numpages, int enable);
> 
> -/* - * When called in DEBUG_PAGEALLOC context, the call should most
> likely be - * guarded by debug_pagealloc_enabled() or
> debug_pagealloc_enabled_static() - */ -static inline void 
> -kernel_map_pages(struct page *page, int numpages, int enable) 
> +static inline void debug_pagealloc_map_pages(struct page *page, +
> int numpages, int enable) { -	__kernel_map_pages(page, numpages,
> enable); +	if (debug_pagealloc_enabled_static()) +
> __kernel_map_pages(page, numpages, enable); } + #ifdef
> CONFIG_HIBERNATION extern bool kernel_page_present(struct page
> *page); #endif	/* CONFIG_HIBERNATION */ #else	/*
> CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */ -static
> inline void -kernel_map_pages(struct page *page, int numpages, int
> enable) {} +static inline void debug_pagealloc_map_pages(struct page
> *page, +					     int numpages, int enable) {} #ifdef
> CONFIG_HIBERNATION static inline bool kernel_page_present(struct page
> *page) { return true; } #endif	/* CONFIG_HIBERNATION */ diff --git
> a/kernel/power/snapshot.c b/kernel/power/snapshot.c index
> 46b1804c1ddf..fa499466f645 100644 --- a/kernel/power/snapshot.c +++
> b/kernel/power/snapshot.c @@ -76,6 +76,17 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {} static inline
> void hibernate_restore_unprotect_page(void *page_address) {} #endif
> /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> 
> +#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) +static inline void 
> +kernel_map_pages(struct page *page, int numpages, int enable) +{ +
> __kernel_map_pages(page, numpages, enable); +} +#else +static inline
> void +kernel_map_pages(struct page *page, int numpages, int enable)
> {} +#endif +

That change should go into a separate patch.


For the debug_pagealloc_map_pages() parts

Reviewed-by: David Hildenbrand <david@redhat.com>


-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
  2020-10-26 11:05     ` David Hildenbrand
                         ` (2 preceding siblings ...)
  (?)
@ 2020-10-26 11:54       ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26 11:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Albert Ou, Andy Lutomirski,
	Benjamin Herrenschmidt, Borislav Petkov, Catalin Marinas,
	Christian Borntraeger, Christoph Lameter, David S. Miller,
	Dave Hansen, David Rientjes, Edgecombe, Rick P, H. Peter Anvin,
	Heiko Carstens, Ingo Molnar, Joonsoo Kim, Kirill A. Shutemov,
	Len Brown, Michael Ellerman, Mike Rapoport, Palmer Dabbelt,
	Paul Mackerras, Paul Walmsley, Pavel Machek, Pekka Enberg,
	Peter Zijlstra, Rafael J. Wysocki, Thomas Gleixner,
	Vasily Gorbik, Will Deacon, linux-arm-kernel, linux-kernel,
	linux-mm, linux-pm, linux-riscv, linux-s390, linuxppc-dev,
	sparclinux, x86

On Mon, Oct 26, 2020 at 12:05:13PM +0100, David Hildenbrand wrote:
> On 25.10.20 11:15, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> > kernel direct mapping after free_pages(). The pages than need to be
> > mapped back before they could be used. Theese mapping operations use 
> > __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> > 
> > The only place that calls __kernel_map_pages() without checking
> > whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> > presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> > is set. Still, on arm64, __kernel_map_pages() will bail out when
> > DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> > may render some pages not present in the direct map and hibernation
> > code won't be able to save such pages.
> > 
> > To make page allocation debugging and hibernation interaction more
> > robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> > has to be made more explicit.
> > 
> > Start with combining the guard condition and the call to 
> > __kernel_map_pages() into a single debug_pagealloc_map_pages()
> > function to emphasize that __kernel_map_pages() should not be called
> > without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> > when page allocation debug is enabled.
> > 
> > As the only remaining user of kernel_map_pages() is the hibernation
> > code, mode that function into kernel/power/snapshot.c closer to a
> > caller.
> 
> s/mode/move/
> 
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> --- 
> > include/linux/mm.h      | 16 +++++++--------- kernel/power/snapshot.c
> > | 11 +++++++++++ mm/memory_hotplug.c     |  3 +-- mm/page_alloc.c
> > |  6 ++---- mm/slab.c               |  8 +++----- 5 files changed, 24
> > insertions(+), 20 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h index
> > ef360fe70aaf..14e397f3752c 100644 --- a/include/linux/mm.h +++
> > b/include/linux/mm.h @@ -2927,21 +2927,19 @@ static inline bool
> > debug_pagealloc_enabled_static(void) #if
> > defined(CONFIG_DEBUG_PAGEALLOC) ||
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void
> > __kernel_map_pages(struct page *page, int numpages, int enable);
> > 
> > -/* - * When called in DEBUG_PAGEALLOC context, the call should most
> > likely be - * guarded by debug_pagealloc_enabled() or
> > debug_pagealloc_enabled_static() - */ -static inline void 
> > -kernel_map_pages(struct page *page, int numpages, int enable) 
> > +static inline void debug_pagealloc_map_pages(struct page *page, +
> > int numpages, int enable) { -	__kernel_map_pages(page, numpages,
> > enable); +	if (debug_pagealloc_enabled_static()) +
> > __kernel_map_pages(page, numpages, enable); } + #ifdef
> > CONFIG_HIBERNATION extern bool kernel_page_present(struct page
> > *page); #endif	/* CONFIG_HIBERNATION */ #else	/*
> > CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */ -static
> > inline void -kernel_map_pages(struct page *page, int numpages, int
> > enable) {} +static inline void debug_pagealloc_map_pages(struct page
> > *page, +					     int numpages, int enable) {} #ifdef
> > CONFIG_HIBERNATION static inline bool kernel_page_present(struct page
> > *page) { return true; } #endif	/* CONFIG_HIBERNATION */ diff --git
> > a/kernel/power/snapshot.c b/kernel/power/snapshot.c index
> > 46b1804c1ddf..fa499466f645 100644 --- a/kernel/power/snapshot.c +++
> > b/kernel/power/snapshot.c @@ -76,6 +76,17 @@ static inline void
> > hibernate_restore_protect_page(void *page_address) {} static inline
> > void hibernate_restore_unprotect_page(void *page_address) {} #endif
> > /* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> > 
> > +#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> > defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) +static inline void 
> > +kernel_map_pages(struct page *page, int numpages, int enable) +{ +
> > __kernel_map_pages(page, numpages, enable); +} +#else +static inline
> > void +kernel_map_pages(struct page *page, int numpages, int enable)
> > {} +#endif +
> 
> That change should go into a separate patch.
 
Hmm, I beleive you refer to moving kernel_map_pages() to snapshot.c,
right?

> For the debug_pagealloc_map_pages() parts
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
 
Thanks!

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
@ 2020-10-26 11:54       ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-26 11:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Peter Zijlstra, Benjamin Herrenschmidt, Dave Hansen, linux-mm,
	Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
	Christoph Lameter, Will Deacon, linux-riscv, linux-s390,
	Michael Ellerman, x86, Mike Rapoport, Christian Borntraeger,
	Ingo Molnar, Catalin Marinas, Len Brown, Albert Ou,
	Vasily Gorbik, linux-pm, Heiko Carstens, David Rientjes,
	Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Kirill A. Shutemov, Thomas Gleixner, Joonsoo Kim,
	linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
	Palmer Dabbelt, Andrew Morton, Edgecombe, Rick P, linuxppc-dev,
	David S. Miller

On Mon, Oct 26, 2020 at 12:05:13PM +0100, David Hildenbrand wrote:
> On 25.10.20 11:15, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> > 
> > When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
> > kernel direct mapping after free_pages(). The pages than need to be
> > mapped back before they could be used. Theese mapping operations use 
> > __kernel_map_pages() guarded with with debug_pagealloc_enabled().
> > 
> > The only place that calls __kernel_map_pages() without checking
> > whether DEBUG_PAGEALLOC is enabled is the hibernation code that
> > presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
> > is set. Still, on arm64, __kernel_map_pages() will bail out when
> > DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
> > may render some pages not present in the direct map and hibernation
> > code won't be able to save such pages.
> > 
> > To make page allocation debugging and hibernation interaction more
> > robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
> > has to be made more explicit.
> > 
> > Start with combining the guard condition and the call to 
> > __kernel_map_pages() into a single debug_pagealloc_map_pages()
> > function to emphasize that __kernel_map_pages() should not be called
> > without DEBUG_PAGEALLOC and use this new function to map/unmap pages
> > when page allocation debug is enabled.
> > 
> > As the only remaining user of kernel_map_pages() is the hibernation
> > code, mode that function into kernel/power/snapshot.c closer to a
> > caller.
> 
> s/mode/move/
> 
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > ---
> >  include/linux/mm.h      | 16 +++++++---------
> >  kernel/power/snapshot.c | 11 +++++++++++
> >  mm/memory_hotplug.c     |  3 +--
> >  mm/page_alloc.c         |  6 ++----
> >  mm/slab.c               |  8 +++-----
> >  5 files changed, 24 insertions(+), 20 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index ef360fe70aaf..14e397f3752c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
> >  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> >  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
> >  
> > -/*
> > - * When called in DEBUG_PAGEALLOC context, the call should most likely be
> > - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
> > - */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable)
> > +static inline void debug_pagealloc_map_pages(struct page *page,
> > +					     int numpages, int enable)
> >  {
> > -	__kernel_map_pages(page, numpages, enable);
> > +	if (debug_pagealloc_enabled_static())
> > +		__kernel_map_pages(page, numpages, enable);
> >  }
> > +
> >  #ifdef CONFIG_HIBERNATION
> >  extern bool kernel_page_present(struct page *page);
> >  #endif	/* CONFIG_HIBERNATION */
> >  #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
> > -static inline void
> > -kernel_map_pages(struct page *page, int numpages, int enable) {}
> > +static inline void debug_pagealloc_map_pages(struct page *page,
> > +					     int numpages, int enable) {}
> >  #ifdef CONFIG_HIBERNATION
> >  static inline bool kernel_page_present(struct page *page) { return true; }
> >  #endif	/* CONFIG_HIBERNATION */
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 46b1804c1ddf..fa499466f645 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
> >  static inline void hibernate_restore_unprotect_page(void *page_address) {}
> >  #endif	/* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
> >  
> > +#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> > +static inline void
> > +kernel_map_pages(struct page *page, int numpages, int enable)
> > +{
> > +	__kernel_map_pages(page, numpages, enable);
> > +}
> > +#else
> > +static inline void
> > +kernel_map_pages(struct page *page, int numpages, int enable) {}
> > +#endif
> > +
> 
> That change should go into a separate patch.
 
Hmm, I believe you refer to moving kernel_map_pages() to snapshot.c,
right?

> For the debug_pagealloc_map_pages() parts
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
 
Thanks!

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread
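
For illustration, the call-site conversion implied by the quoted patch is
mechanical. The mm/page_alloc.c and mm/slab.c hunks are not quoted in this
excerpt, so the snippet below is only a sketch of the before/after shape;
the page and order arguments are placeholders, not taken from the actual
hunks:

	/* Before: each caller open-codes the DEBUG_PAGEALLOC guard. */
	if (debug_pagealloc_enabled_static())
		__kernel_map_pages(page, 1 << order, 0);

	/* After: the guard lives inside the new helper. */
	debug_pagealloc_map_pages(page, 1 << order, 0);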

* Re: [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
  2020-10-26 11:54       ` Mike Rapoport
@ 2020-10-26 11:55         ` David Hildenbrand
  -1 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-26 11:55 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Albert Ou, Andy Lutomirski,
	Benjamin Herrenschmidt, Borislav Petkov, Catalin Marinas,
	Christian Borntraeger, Christoph Lameter, David S. Miller,
	Dave Hansen, David Rientjes, Edgecombe, Rick P, H. Peter Anvin,
	Heiko Carstens, Ingo Molnar, Joonsoo Kim, Kirill A. Shutemov,
	Len Brown, Michael Ellerman, Mike Rapoport, Palmer Dabbelt,
	Paul Mackerras, Paul Walmsley, Pavel Machek, Pekka Enberg,
	Peter Zijlstra, Rafael J. Wysocki, Thomas Gleixner,
	Vasily Gorbik, Will Deacon, linux-arm-kernel, linux-kernel,
	linux-mm, linux-pm, linux-riscv, linux-s390, linuxppc-dev,
	sparclinux, x86

On 26.10.20 12:54, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 12:05:13PM +0100, David Hildenbrand wrote:
>> On 25.10.20 11:15, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the
>>> kernel direct mapping after free_pages(). The pages then need to be
>>> mapped back before they can be used. These mapping operations use
>>> __kernel_map_pages() guarded with debug_pagealloc_enabled().
>>>
>>> The only place that calls __kernel_map_pages() without checking
>>> whether DEBUG_PAGEALLOC is enabled is the hibernation code that
>>> presumes availability of this function when ARCH_HAS_SET_DIRECT_MAP
>>> is set. Still, on arm64, __kernel_map_pages() will bail out when
>>> DEBUG_PAGEALLOC is not enabled but set_direct_map_invalid_noflush()
>>> may render some pages not present in the direct map and hibernation
>>> code won't be able to save such pages.
>>>
>>> To make page allocation debugging and hibernation interaction more
>>> robust, the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP
>>> has to be made more explicit.
>>>
>>> Start with combining the guard condition and the call to 
>>> __kernel_map_pages() into a single debug_pagealloc_map_pages()
>>> function to emphasize that __kernel_map_pages() should not be called
>>> without DEBUG_PAGEALLOC and use this new function to map/unmap pages
>>> when page allocation debug is enabled.
>>>
>>> As the only remaining user of kernel_map_pages() is the hibernation
>>> code, mode that function into kernel/power/snapshot.c closer to a
>>> caller.
>>
>> s/mode/move/
>>
>>>
>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>> ---
>>>  include/linux/mm.h      | 16 +++++++---------
>>>  kernel/power/snapshot.c | 11 +++++++++++
>>>  mm/memory_hotplug.c     |  3 +--
>>>  mm/page_alloc.c         |  6 ++----
>>>  mm/slab.c               |  8 +++-----
>>>  5 files changed, 24 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index ef360fe70aaf..14e397f3752c 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
>>>  #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
>>>  extern void __kernel_map_pages(struct page *page, int numpages, int enable);
>>>  
>>> -/*
>>> - * When called in DEBUG_PAGEALLOC context, the call should most likely be
>>> - * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
>>> - */
>>> -static inline void
>>> -kernel_map_pages(struct page *page, int numpages, int enable)
>>> +static inline void debug_pagealloc_map_pages(struct page *page,
>>> +					     int numpages, int enable)
>>>  {
>>> -	__kernel_map_pages(page, numpages, enable);
>>> +	if (debug_pagealloc_enabled_static())
>>> +		__kernel_map_pages(page, numpages, enable);
>>>  }
>>> +
>>>  #ifdef CONFIG_HIBERNATION
>>>  extern bool kernel_page_present(struct page *page);
>>>  #endif	/* CONFIG_HIBERNATION */
>>>  #else	/* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
>>> -static inline void
>>> -kernel_map_pages(struct page *page, int numpages, int enable) {}
>>> +static inline void debug_pagealloc_map_pages(struct page *page,
>>> +					     int numpages, int enable) {}
>>>  #ifdef CONFIG_HIBERNATION
>>>  static inline bool kernel_page_present(struct page *page) { return true; }
>>>  #endif	/* CONFIG_HIBERNATION */
>>> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
>>> index 46b1804c1ddf..fa499466f645 100644
>>> --- a/kernel/power/snapshot.c
>>> +++ b/kernel/power/snapshot.c
>>> @@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
>>>  static inline void hibernate_restore_unprotect_page(void *page_address) {}
>>>  #endif	/* CONFIG_STRICT_KERNEL_RWX  && CONFIG_ARCH_HAS_SET_MEMORY */
>>>  
>>> +#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
>>> +static inline void
>>> +kernel_map_pages(struct page *page, int numpages, int enable)
>>> +{
>>> +	__kernel_map_pages(page, numpages, enable);
>>> +}
>>> +#else
>>> +static inline void
>>> +kernel_map_pages(struct page *page, int numpages, int enable) {}
>>> +#endif
>>> +
>>
>> That change should go into a separate patch.
>  
> Hmm, I believe you refer to moving kernel_map_pages() to snapshot.c,
> right?

Sorry, yes!


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-26  9:05     ` Mike Rapoport
@ 2020-10-26 18:05       ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26 18:05 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is possible that __kernel_map_pages() would fail, but since this
> > > function is void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in __kernel_map_pages() may happen today. For instance, on x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, a 2M page, and during the split an allocation of
> a page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arches, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assumes re-mapping can't
fail.
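
To make that concrete, here is a simplified sketch of why a failed split
stays invisible to callers of the void interface. This is only an
illustration, not the actual arch/x86 implementation:

	void __kernel_map_pages(struct page *page, int numpages, int enable)
	{
		int ret;

		/*
		 * Either direction may have to split a large direct-map
		 * page, and the split can fail to allocate a page table.
		 */
		if (enable)
			ret = set_pages_p(page, numpages);
		else
			ret = set_pages_np(page, numpages);

		/* The prototype is void, so a -ENOMEM here is simply lost. */
		(void)ret;
	}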

> Currently, the only user of __kernel_map_pages() outside DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-26 18:05       ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26 18:05 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-26 18:05       ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26 18:05 UTC (permalink / raw)
  To: rppt
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel,
	penberg, palmer, akpm, linuxppc-dev, davem

T24gTW9uLCAyMDIwLTEwLTI2IGF0IDExOjA1ICswMjAwLCBNaWtlIFJhcG9wb3J0IHdyb3RlOg0K
PiBPbiBNb24sIE9jdCAyNiwgMjAyMCBhdCAwMToxMzo1MkFNICswMDAwLCBFZGdlY29tYmUsIFJp
Y2sgUCB3cm90ZToNCj4gPiBPbiBTdW4sIDIwMjAtMTAtMjUgYXQgMTI6MTUgKzAyMDAsIE1pa2Ug
UmFwb3BvcnQgd3JvdGU6DQo+ID4gPiBJbmRlZWQsIGZvciBhcmNoaXRlY3R1cmVzIHRoYXQgZGVm
aW5lDQo+ID4gPiBDT05GSUdfQVJDSF9IQVNfU0VUX0RJUkVDVF9NQVANCj4gPiA+IGl0IGlzDQo+
ID4gPiBwb3NzaWJsZSB0aGF0IF9fa2VybmVsX21hcF9wYWdlcygpIHdvdWxkIGZhaWwsIGJ1dCBz
aW5jZSB0aGlzDQo+ID4gPiBmdW5jdGlvbiBpcw0KPiA+ID4gdm9pZCwgdGhlIGZhaWx1cmUgd2ls
bCBnbyB1bm5vdGljZWQuDQo+ID4gDQo+ID4gQ291bGQgeW91IGVsYWJvcmF0ZSBvbiBob3cgdGhp
cyBjb3VsZCBoYXBwZW4/IERvIHlvdSBtZWFuIGR1cmluZw0KPiA+IHJ1bnRpbWUgdG9kYXkgb3Ig
aWYgc29tZXRoaW5nIG5ldyB3YXMgaW50cm9kdWNlZD8NCj4gDQo+IEEgZmFpbHVyZSBpbl9fa2Vy
bmVsX21hcF9wYWdlcygpIG1heSBoYXBwZW4gdG9kYXkuIEZvciBpbnN0YW5jZSwgb24NCj4geDg2
DQo+IGlmIHRoZSBrZXJuZWwgaXMgYnVpbHQgd2l0aCBERUJVR19QQUdFQUxMT0MuDQo+IA0KPiAg
ICAgICAgIF9fa2VybmVsX21hcF9wYWdlcyhwYWdlLCAxLCAwKTsNCj4gDQo+IHdpbGwgbmVlZCB0
byBzcGxpdCwgc2F5LCAyTSBwYWdlIGFuZCBkdXJpbmcgdGhlIHNwbGl0IGFuIGFsbG9jYXRpb24N
Cj4gb2YNCj4gcGFnZSB0YWJsZSBjb3VsZCBmYWlsLg0KDQpPbiB4ODYgYXQgbGVhc3QsIERFQlVH
X1BBR0VBTExPQyBleHBlY3RzIHRvIG5ldmVyIGhhdmUgdG8gYnJlYWsgYSBwYWdlDQpvbiB0aGUg
ZGlyZWN0IG1hcCBhbmQgZXZlbiBkaXNhYmxlcyBsb2NraW5nIGluIGNwYSBiZWNhdXNlIGl0IGFz
c3VtZXMNCnRoaXMuIElmIHRoaXMgaXMgaGFwcGVuaW5nIHNvbWVob3cgYW55d2F5IHRoZW4gd2Ug
c2hvdWxkIHByb2JhYmx5IGZpeA0KdGhhdC4gRXZlbiBpZiBpdCdzIGEgZGVidWcgZmVhdHVyZSwg
aXQgd2lsbCBub3QgYmUgYXMgdXNlZnVsIGlmIGl0IGlzDQpjYXVzaW5nIGl0cyBvd24gY3Jhc2hl
cy4NCg0KSSdtIHN0aWxsIHdvbmRlcmluZyBpZiB0aGVyZSBpcyBzb21ldGhpbmcgSSdtIG1pc3Np
bmcgaGVyZS4gSXQgc2VlbXMNCmxpa2UgeW91IGFyZSBzYXlpbmcgdGhlcmUgaXMgYSBidWcgaW4g
c29tZSBhcmNoJ3MsIHNvIGxldCdzIGFkZCBhIFdBUk4NCmluIGNyb3NzLWFyY2ggY29kZSB0byBs
b2cgaXQgYXMgaXQgY3Jhc2hlcy4gQSB3YXJuIGFuZCBtYWtpbmcgdGhpbmdzDQpjbGVhcmVyIHNl
ZW0gbGlrZSBnb29kIGlkZWFzLCBidXQgaWYgdGhlcmUgaXMgYSBidWcgd2Ugc2hvdWxkIGZpeCBp
dC4NClRoZSBjb2RlIGFyb3VuZCB0aGUgY2FsbGVycyBzdGlsbCBmdW5jdGlvbmFsbHkgYXNzdW1l
IHJlLW1hcHBpbmcgY2FuJ3QNCmZhaWwuDQoNCj4gQ3VycmVudGx5LCB0aGUgb25seSB1c2VyIG9m
IF9fa2VybmVsX21hcF9wYWdlcygpIG91dHNpZGUNCj4gREVCVUdfUEFHRUFMTE9DDQo+IGlzIGhp
YmVybmF0aW9uLCBidXQgSSB0aGluayBpdCB3b3VsZCBiZSBzYWZlciB0byBlbnRpcmVseSBwcmV2
ZW50DQo+IHVzYWdlDQo+IG9mIF9fa2VybmVsX21hcF9wYWdlcygpIHdoZW4gREVCVUdfUEFHRUFM
TE9DPW4uDQoNCkkgdG90YWxseSBhZ3JlZSBpdCdzIGVycm9yIHByb25lIEZXSVcuIE9uIHg4Niwg
bXkgbWVudGFsIG1vZGVsIG9mIGhvdw0KaXQgaXMgc3VwcG9zZWQgdG8gd29yayBpczogSWYgYSBw
YWdlIGlzIDRrIGFuZCBOUCBpdCBjYW5ub3QgZmFpbCB0byBiZQ0KcmVtYXBwZWQuIHNldF9kaXJl
Y3RfbWFwX2ludmFsaWRfbm9mbHVzaCgpIHNob3VsZCByZXN1bHQgaW4gNGsgTlANCnBhZ2VzLCBh
bmQgREVCVUdfUEFHRUFMTE9DIHNob3VsZCByZXN1bHQgaW4gYWxsIDRrIHBhZ2VzIG9uIHRoZSBk
aXJlY3QNCm1hcC4gQXJlIHlvdSBzZWVpbmcgdGhpcyB2aW9sYXRlZCBvciBkbyBJIGhhdmUgd3Jv
bmcgYXNzdW1wdGlvbnM/DQoNCkJleW9uZCB3aGF0ZXZlciB5b3UgYXJlIHNlZWluZywgZm9yIHRo
ZSBsYXR0ZXIgY2FzZSBvZiBuZXcgdGhpbmdzDQpnZXR0aW5nIGludHJvZHVjZWQgdG8gYW4gaW50
ZXJmYWNlIHdpdGggaGlkZGVuIGRlcGVuZGVuY2llcy4uLiBBbm90aGVyDQplZGdlIGNhc2UgY291
bGQgYmUgYSBuZXcgY2FsbGVyIHRvIHNldF9tZW1vcnlfbnAoKSBjb3VsZCByZXN1bHQgaW4NCmxh
cmdlIE5QIHBhZ2VzLiBOb25lIG9mIHRoZSBjYWxsZXJzIHRvZGF5IHNob3VsZCBjYXVzZSB0aGlz
IEFGQUlDVCwgYnV0DQppdCdzIG5vdCBncmVhdCB0byByZWx5IG9uIHRoZSBjYWxsZXJzIHRvIGtu
b3cgdGhlc2UgZGV0YWlscy4NCg0KDQo

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-26 18:05       ` Edgecombe, Rick P
  0 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26 18:05 UTC (permalink / raw)
  To: rppt
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel,
	penberg, palmer, akpm, linuxppc-dev, davem

On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > Indeed, for architectures that define
> > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > it is
> > > possible that __kernel_map_pages() would fail, but since this
> > > function is
> > > void, the failure will go unnoticed.
> > 
> > Could you elaborate on how this could happen? Do you mean during
> > runtime today or if something new was introduced?
> 
> A failure in__kernel_map_pages() may happen today. For instance, on
> x86
> if the kernel is built with DEBUG_PAGEALLOC.
> 
>         __kernel_map_pages(page, 1, 0);
> 
> will need to split, say, 2M page and during the split an allocation
> of
> page table could fail.

On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
on the direct map and even disables locking in cpa because it assumes
this. If this is happening somehow anyway then we should probably fix
that. Even if it's a debug feature, it will not be as useful if it is
causing its own crashes.

I'm still wondering if there is something I'm missing here. It seems
like you are saying there is a bug in some arch's, so let's add a WARN
in cross-arch code to log it as it crashes. A warn and making things
clearer seem like good ideas, but if there is a bug we should fix it.
The code around the callers still functionally assume re-mapping can't
fail.

> Currently, the only user of __kernel_map_pages() outside
> DEBUG_PAGEALLOC
> is hibernation, but I think it would be safer to entirely prevent
> usage
> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.

I totally agree it's error prone FWIW. On x86, my mental model of how
it is supposed to work is: If a page is 4k and NP it cannot fail to be
remapped. set_direct_map_invalid_noflush() should result in 4k NP
pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
map. Are you seeing this violated or do I have wrong assumptions?

Beyond whatever you are seeing, for the latter case of new things
getting introduced to an interface with hidden dependencies... Another
edge case could be a new caller to set_memory_np() could result in
large NP pages. None of the callers today should cause this AFAICT, but
it's not great to rely on the callers to know these details.
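
For example, a hypothetical new caller along the lines of

        /* sketch only: set a 2M-aligned, 2M-sized range NP while it is still a large mapping */
        set_memory_np((unsigned long)vaddr, 512);

could leave a large NP page behind, and a later attempt to remap a single
4k page inside that range would then need a split that can fail.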



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-26  9:15       ` Mike Rapoport
                           ` (3 preceding siblings ...)
  (?)
@ 2020-10-26 18:57         ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-26 18:57 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P wrote:
> > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > 
> > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page
> > > may
> > > be
> > > not present in the direct map and has to be explicitly mapped
> > > before
> > > it
> > > could be copied.
> > > 
> > > On arm64 it is possible that a page would be removed from the
> > > direct
> > > map
> > > using set_direct_map_invalid_noflush() but __kernel_map_pages()
> > > will
> > > refuse
> > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > 
> > It looks to me that arm64 __kernel_map_pages() will still attempt
> > to
> > map it if rodata_full is true, how does this happen?
> 
> Unless I misread the code, arm64 requires both rodata_full and
> debug_pagealloc_enabled() to be true for __kernel_map_pages() to do
> anything.
> But rodata_full condition applies to set_direct_map_*_noflush() as
> well,
> so with !rodata_full the linear map won't be ever changed.

Hmm, looks to me that __kernel_map_pages() will only skip it if both
debug pagealloc and rodata_full are false.
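
For reference, the check I'm reading in arch/arm64/mm/pageattr.c is roughly
this (paraphrased sketch, not a verbatim copy of the tree):

        void __kernel_map_pages(struct page *page, int numpages, int enable)
        {
                /* bail out unless DEBUG_PAGEALLOC or rodata_full forced a 4k linear map */
                if (!debug_pagealloc_enabled() && !rodata_full)
                        return;

                set_memory_valid((unsigned long)page_address(page), numpages, enable);
        }

so it only becomes a no-op when both conditions are false.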

But now I'm wondering if maybe we could simplify things by just moving
the hibernate unmapped page logic off of the direct map. On x86,
text_poke() used to use this reserved fixmap pte thing that it could
rely on to remap memory with. If hibernate had some separate pte for
remapping like that, then we wouldn't have any direct map restrictions
caused by it/kernel_map_pages(), and it wouldn't have to worry about
relying on anything else.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-26  9:15       ` Mike Rapoport
                           ` (3 preceding siblings ...)
  (?)
@ 2020-10-27  1:10         ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-27  1:10 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> The intention of this series is to disallow usage of
> __kernel_map_pages() when DEBUG_PAGEALLOC=n. I'll update this patch
> to
> better handle possible errors, but I still want to keep WARN in the
> caller.

Sorry, I missed this snippet at the bottom saying that your intention was
to actually handle errors somehow in the hibernate code. Please ignore my
other comment that assumed you wanted to just add a WARN.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-26 18:05       ` Edgecombe, Rick P
                           ` (3 preceding siblings ...)
  (?)
@ 2020-10-27  8:38         ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-27  8:38 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > Indeed, for architectures that define
> > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > it is
> > > > possible that __kernel_map_pages() would fail, but since this
> > > > function is
> > > > void, the failure will go unnoticed.
> > > 
> > > Could you elaborate on how this could happen? Do you mean during
> > > runtime today or if something new was introduced?
> > 
> > A failure in __kernel_map_pages() may happen today. For instance, on
> > x86
> > if the kernel is built with DEBUG_PAGEALLOC.
> > 
> >         __kernel_map_pages(page, 1, 0);
> > 
> > will need to split, say, 2M page and during the split an allocation
> > of
> > page table could fail.
> 
> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
> on the direct map and even disables locking in cpa because it assumes
> this. If this is happening somehow anyway then we should probably fix
> that. Even if it's a debug feature, it will not be as useful if it is
> causing its own crashes.
> 
> I'm still wondering if there is something I'm missing here. It seems
> like you are saying there is a bug in some arch's, so let's add a WARN
> in cross-arch code to log it as it crashes. A warn and making things
> clearer seem like good ideas, but if there is a bug we should fix it.
> The code around the callers still functionally assume re-mapping can't
> fail.

Oh, I meant the x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
that unmaps the page again in safe_copy_page() will just reset a 4K page to
NP because whatever made it NP in the first place already did the split.

Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
between the map/unmap dance in __vunmap() and safe_copy_page() that may
cause an access to unmapped memory:

__vunmap()
    vm_remove_mappings()
        set_direct_map_invalid()
                                        safe_copy_page()
                                            __kernel_map_pages()
                                                return
                                            do_copy_page() -> fault

This is a theoretical bug, but it is still not nice :)

> > Currently, the only user of __kernel_map_pages() outside
> > DEBUG_PAGEALLOC
> > is hibernation, but I think it would be safer to entirely prevent
> > usage
> > of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
> 
> I totally agree it's error prone FWIW. On x86, my mental model of how
> it is supposed to work is: If a page is 4k and NP it cannot fail to be
> remapped. set_direct_map_invalid_noflush() should result in 4k NP
> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
> map. Are you seeing this violated or do I have wrong assumptions?

You are right, there is a set of assumptions about the remapping of the
direct map pages that make it all work, at least on x86.
But this is very subtle and it's not easy to wrap one's head around
this.

That's why putting __kernel_map_pages() out of "common" use and keeping
it only for DEBUG_PAGEALLOC would make things clearer.
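
Roughly along these lines, just as a sketch (the helper name and exact form
in the actual patches may differ):

        /* only poke the direct map when page allocation debugging is enabled */
        static inline void debug_pagealloc_map_pages(struct page *page,
                                                     int numpages, int enable)
        {
                if (debug_pagealloc_enabled_static())
                        __kernel_map_pages(page, numpages, enable);
        }

so that the "common" code never calls __kernel_map_pages() directly when
DEBUG_PAGEALLOC=n.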

> Beyond whatever you are seeing, for the latter case of new things
> getting introduced to an interface with hidden dependencies... Another
> edge case could be a new caller to set_memory_np() could result in
> large NP pages. None of the callers today should cause this AFAICT, but
> it's not great to rely on the callers to know these details.
 
A caller of set_memory_*() or set_direct_map_*() should expect a failure
and be ready for that. So adding a WARN to safe_copy_page() is the first
step in that direction :)
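
For the hibernation path that could look something like the sketch below
(illustrative only, not the actual patch; hibernate_remap_page() is a
made-up name):

        static int hibernate_remap_page(struct page *page)
        {
                int ret = set_direct_map_default_noflush(page);

                /* make a silent remap failure visible instead of faulting later */
                WARN_ON_ONCE(ret);
                return ret;
        }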

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-27  8:38         ` Mike Rapoport
                             ` (3 preceding siblings ...)
  (?)
@ 2020-10-27  8:46           ` David Hildenbrand
  -1 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-27  8:46 UTC (permalink / raw)
  To: Mike Rapoport, Edgecombe, Rick P
  Cc: cl, gor, hpa, peterz, catalin.marinas, dave.hansen, borntraeger,
	penberg, linux-mm, iamjoonsoo.kim, will, aou, kirill, rientjes,
	rppt, paulus, hca, bp, pavel, sparclinux, akpm, luto, davem, mpe,
	benh, linuxppc-dev, rjw, tglx, linux-riscv, x86, linux-pm,
	linux-arm-kernel, palmer, Brown, Len, mingo, linux-s390,
	linux-kernel, paul.walmsley

On 27.10.20 09:38, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
>>>> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
>>>>> Indeed, for architectures that define
>>>>> CONFIG_ARCH_HAS_SET_DIRECT_MAP
>>>>> it is
>>>>> possible that __kernel_map_pages() would fail, but since this
>>>>> function is
>>>>> void, the failure will go unnoticed.
>>>>
>>>> Could you elaborate on how this could happen? Do you mean during
>>>> runtime today or if something new was introduced?
>>>
>>> A failure in__kernel_map_pages() may happen today. For instance, on
>>> x86
>>> if the kernel is built with DEBUG_PAGEALLOC.
>>>
>>>          __kernel_map_pages(page, 1, 0);
>>>
>>> will need to split, say, 2M page and during the split an allocation
>>> of
>>> page table could fail.
>>
>> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
>> on the direct map and even disables locking in cpa because it assumes
>> this. If this is happening somehow anyway then we should probably fix
>> that. Even if it's a debug feature, it will not be as useful if it is
>> causing its own crashes.
>>
>> I'm still wondering if there is something I'm missing here. It seems
>> like you are saying there is a bug in some arch's, so let's add a WARN
>> in cross-arch code to log it as it crashes. A warn and making things
>> clearer seem like good ideas, but if there is a bug we should fix it.
>> The code around the callers still functionally assume re-mapping can't
>> fail.
> 
> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
> that unmaps pages back in safe_copy_page will just reset a 4K page to
> NP because whatever made it NP at the first place already did the split.
> 
> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
> between map/unmap dance in __vunmap() and safe_copy_page() that may
> cause access to unmapped memory:
> 
> __vunmap()
>      vm_remove_mappings()
>          set_direct_map_invalid()
> 					safe_copy_page()	
> 					    __kernel_map_pages()
> 					    	return
> 					    do_copy_page() -> fault
> 					   	
> This is a theoretical bug, but it is still not nice :) 							
> 
>>> Currently, the only user of __kernel_map_pages() outside
>>> DEBUG_PAGEALLOC
>>> is hibernation, but I think it would be safer to entirely prevent
>>> usage
>>> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
>>
>> I totally agree it's error prone FWIW. On x86, my mental model of how
>> it is supposed to work is: If a page is 4k and NP it cannot fail to be
>> remapped. set_direct_map_invalid_noflush() should result in 4k NP
>> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
>> map. Are you seeing this violated or do I have wrong assumptions?
> 
> You are right, there is a set of assumptions about the remapping of the
> direct map pages that make it all work, at least on x86.
> But this is very subtle and it's not easy to wrap one's head around
> this.
> 
> That's why putting __kernel_map_pages() out of "common" use and
> keep it only for DEBUG_PAGEALLOC would make things clearer.
> 
>> Beyond whatever you are seeing, for the latter case of new things
>> getting introduced to an interface with hidden dependencies... Another
>> edge case could be a new caller to set_memory_np() could result in
>> large NP pages. None of the callers today should cause this AFAICT, but
>> it's not great to rely on the callers to know these details.
>   
> A caller of set_memory_*() or set_direct_map_*() should expect a failure
> and be ready for that. So adding a WARN to safe_copy_page() is the first
> step in that direction :)
> 

I am probably missing something important, but why are we 
saving/restoring the content of pages that were explicitly removed from 
the identity mapping such that nobody will access them?

Pages that are not allocated should contain garbage or be zero 
(init_on_free). That should be easy to handle without ever reading the 
page content.

The other user seems to be vm_remove_mappings(), where we only 
*temporarily* remove the mapping - while hibernating, that code 
shouldn't be active anymore I guess - or we could protect it from happening.

As I expressed in another mail, secretmem pages should rather not be
saved when hibernating - hibernation should rather be disabled.

What am I missing?

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-26 18:57         ` Edgecombe, Rick P
                             ` (3 preceding siblings ...)
  (?)
@ 2020-10-27  8:49           ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-27  8:49 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Mon, Oct 26, 2020 at 06:57:32PM +0000, Edgecombe, Rick P wrote:
> On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P wrote:
> > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > 
> > > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page
> > > > may
> > > > be
> > > > not present in the direct map and has to be explicitly mapped
> > > > before
> > > > it
> > > > could be copied.
> > > > 
> > > > On arm64 it is possible that a page would be removed from the
> > > > direct
> > > > map
> > > > using set_direct_map_invalid_noflush() but __kernel_map_pages()
> > > > will
> > > > refuse
> > > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > > 
> > > It looks to me that arm64 __kernel_map_pages() will still attempt
> > > to
> > > map it if rodata_full is true, how does this happen?
> > 
> > Unless I misread the code, arm64 requires both rodata_full and
> > debug_pagealloc_enabled() to be true for __kernel_map_pages() to do
> > anything.
> > But rodata_full condition applies to set_direct_map_*_noflush() as
> > well,
> > so with !rodata_full the linear map won't be ever changed.
> 
> Hmm, looks to me that __kernel_map_pages() will only skip it if both
> debug pagealloc and rodata_full are false.
> 
> But now I'm wondering if maybe we could simplify things by just moving
> the hibernate unmapped page logic off of the direct map. On x86,
> text_poke() used to use this reserved fixmap pte thing that it could
> rely on to remap memory with. If hibernate had some separate pte for
> remapping like that, then we could not have any direct map restrictions
> caused by it/kernel_map_pages(), and it wouldn't have to worry about
> relying on anything else.

Well, there is map_kernel_range() that can be used by hibernation as
there is no requirement for a particular virtual address, but that would
be quite costly if done for every page.

Maybe we can do something like

	if (kernel_page_present(s_page)) {
		do_copy_page(dst, page_address(s_page));
	} else {
		/* temporarily map the page just for the copy */
		map_kernel_range_noflush((unsigned long)page_address(s_page),
					 PAGE_SIZE, PAGE_KERNEL_RO, &s_page);
		do_copy_page(dst, page_address(s_page));
		unmap_kernel_range_noflush((unsigned long)page_address(s_page),
					   PAGE_SIZE);
	}

But it seems that a prerequisite for changing the way a page is mapped
in safe_copy_page() would be to teach hibernation that a mapping here
may fail.
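
Spelling that out (still only a sketch; the difference from the snippet
above is just that the error from map_kernel_range_noflush() is checked
and propagated instead of ignored):

	static int safe_copy_page(void *dst, struct page *s_page)
	{
		int ret;

		if (kernel_page_present(s_page)) {
			do_copy_page(dst, page_address(s_page));
			return 0;
		}

		ret = map_kernel_range_noflush((unsigned long)page_address(s_page),
					       PAGE_SIZE, PAGE_KERNEL_RO, &s_page);
		if (ret)
			return ret;

		do_copy_page(dst, page_address(s_page));
		unmap_kernel_range_noflush((unsigned long)page_address(s_page),
					   PAGE_SIZE);
		return 0;
	}

and the callers up the copy_data_pages() chain would then have to handle
the error instead of assuming the copy always succeeds.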

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-27  8:46           ` David Hildenbrand
                               ` (3 preceding siblings ...)
  (?)
@ 2020-10-27  9:47             ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-27  9:47 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> On 27.10.20 09:38, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > Indeed, for architectures that define
> > > > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > > > it is
> > > > > > possible that __kernel_map_pages() would fail, but since this
> > > > > > function is
> > > > > > void, the failure will go unnoticed.
> > > > > 
> > > > > Could you elaborate on how this could happen? Do you mean during
> > > > > runtime today or if something new was introduced?
> > > > 
> > > > A failure in__kernel_map_pages() may happen today. For instance, on
> > > > x86
> > > > if the kernel is built with DEBUG_PAGEALLOC.
> > > > 
> > > >          __kernel_map_pages(page, 1, 0);
> > > > 
> > > > will need to split, say, 2M page and during the split an allocation
> > > > of
> > > > page table could fail.
> > > 
> > > On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
> > > on the direct map and even disables locking in cpa because it assumes
> > > this. If this is happening somehow anyway then we should probably fix
> > > that. Even if it's a debug feature, it will not be as useful if it is
> > > causing its own crashes.
> > > 
> > > I'm still wondering if there is something I'm missing here. It seems
> > > like you are saying there is a bug in some arch's, so let's add a WARN
> > > in cross-arch code to log it as it crashes. A warn and making things
> > > clearer seem like good ideas, but if there is a bug we should fix it.
> > > The code around the callers still functionally assume re-mapping can't
> > > fail.
> > 
> > Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
> > that unmaps pages back in safe_copy_page will just reset a 4K page to
> > NP because whatever made it NP at the first place already did the split.
> > 
> > Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
> > between map/unmap dance in __vunmap() and safe_copy_page() that may
> > cause access to unmapped memory:
> > 
> > __vunmap()
> >      vm_remove_mappings()
> >          set_direct_map_invalid()
> > 					safe_copy_page()	
> > 					    __kernel_map_pages()
> > 					    	return
> > 					    do_copy_page() -> fault
> > 					   	
> > This is a theoretical bug, but it is still not nice :) 							
> > 
> > > > Currently, the only user of __kernel_map_pages() outside
> > > > DEBUG_PAGEALLOC
> > > > is hibernation, but I think it would be safer to entirely prevent
> > > > usage
> > > > of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
> > > 
> > > I totally agree it's error prone FWIW. On x86, my mental model of how
> > > it is supposed to work is: If a page is 4k and NP it cannot fail to be
> > > remapped. set_direct_map_invalid_noflush() should result in 4k NP
> > > pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
> > > map. Are you seeing this violated or do I have wrong assumptions?
> > 
> > You are right, there is a set of assumptions about the remapping of the
> > direct map pages that make it all work, at least on x86.
> > But this is very subtle and it's not easy to wrap one's head around
> > this.
> > 
> > That's why putting __kernel_map_pages() out of "common" use and
> > keep it only for DEBUG_PAGEALLOC would make things clearer.
> > 
> > > Beyond whatever you are seeing, for the latter case of new things
> > > getting introduced to an interface with hidden dependencies... Another
> > > edge case could be a new caller to set_memory_np() could result in
> > > large NP pages. None of the callers today should cause this AFAICT, but
> > > it's not great to rely on the callers to know these details.
> > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > step in that direction :)
> > 
> 
> I am probably missing something important, but why are we saving/restoring
> the content of pages that were explicitly removed from the identity mapping
> such that nobody will access them?
> 
> Pages that are not allocated should contain garbage or be zero
> (init_on_free). That should be easy to handle without ever reading the page
> content.

I'm not familiar enough with hibernation to say anything smart here, but
the help text of DEBUG_PAGEALLOC in Kconfig says:

	... this option cannot be enabled in combination with
	hibernation as that would result in incorrect warnings of memory
	corruption after a resume because free pages are not saved to
	the suspend image.

Probably you are right and free pages need to be handled differently,
but it does not seem to be the case now.

> The other user seems to be vm_remove_mappings(), where we only *temporarily*
> remove the mapping - while hibernating, that code shouldn't be active
> anymore I guess - or we could protect it from happening.

Hmm, I _think_ vm_remove_mappings() shouldn't be active while
hibernating, but I'm not 100% sure.

> As I expressed in another mail, secretmem pages should rather not be saved
> when hibernating - hibernation should be rather be disabled.

Agree.

> What am I missing?

I think I miscommunicated the purpose of this set, which was to hide
__kernel_map_pages() under DEBUG_PAGEALLOC and make hibernation use
set_direct_map_*() explicitly, without a major rework of free pages
handling during hibernation.
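
The helper that hides __kernel_map_pages() behind DEBUG_PAGEALLOC can be
as small as this (a sketch of the idea rather than the exact hunk;
debug_pagealloc_enabled_static() is the existing static-key helper):

	static inline void debug_pagealloc_map_pages(struct page *page,
						     int numpages, int enable)
	{
		if (debug_pagealloc_enabled_static())
			__kernel_map_pages(page, numpages, enable);
	}

so the generic mm callers never touch the direct map unless page
allocation debugging is actually enabled.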

Does it help?

> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-27  9:47             ` Mike Rapoport
                                 ` (3 preceding siblings ...)
  (?)
@ 2020-10-27 10:34               ` David Hildenbrand
  -1 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-27 10:34 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On 27.10.20 10:47, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
>> On 27.10.20 09:38, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>>>> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
>>>>> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
>>>>>> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
>>>>>>> Indeed, for architectures that define
>>>>>>> CONFIG_ARCH_HAS_SET_DIRECT_MAP
>>>>>>> it is
>>>>>>> possible that __kernel_map_pages() would fail, but since this
>>>>>>> function is
>>>>>>> void, the failure will go unnoticed.
>>>>>>
>>>>>> Could you elaborate on how this could happen? Do you mean during
>>>>>> runtime today or if something new was introduced?
>>>>>
>>>>> A failure in__kernel_map_pages() may happen today. For instance, on
>>>>> x86
>>>>> if the kernel is built with DEBUG_PAGEALLOC.
>>>>>
>>>>>           __kernel_map_pages(page, 1, 0);
>>>>>
>>>>> will need to split, say, 2M page and during the split an allocation
>>>>> of
>>>>> page table could fail.
>>>>
>>>> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
>>>> on the direct map and even disables locking in cpa because it assumes
>>>> this. If this is happening somehow anyway then we should probably fix
>>>> that. Even if it's a debug feature, it will not be as useful if it is
>>>> causing its own crashes.
>>>>
>>>> I'm still wondering if there is something I'm missing here. It seems
>>>> like you are saying there is a bug in some arch's, so let's add a WARN
>>>> in cross-arch code to log it as it crashes. A warn and making things
>>>> clearer seem like good ideas, but if there is a bug we should fix it.
>>>> The code around the callers still functionally assume re-mapping can't
>>>> fail.
>>>
>>> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
>>> that unmaps pages back in safe_copy_page will just reset a 4K page to
>>> NP because whatever made it NP at the first place already did the split.
>>>
>>> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
>>> between map/unmap dance in __vunmap() and safe_copy_page() that may
>>> cause access to unmapped memory:
>>>
>>> __vunmap()
>>>       vm_remove_mappings()
>>>           set_direct_map_invalid()
>>> 					safe_copy_page()	
>>> 					    __kernel_map_pages()
>>> 					    	return
>>> 					    do_copy_page() -> fault
>>> 					   	
>>> This is a theoretical bug, but it is still not nice :) 							
>>>
>>>>> Currently, the only user of __kernel_map_pages() outside
>>>>> DEBUG_PAGEALLOC
>>>>> is hibernation, but I think it would be safer to entirely prevent
>>>>> usage
>>>>> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
>>>>
>>>> I totally agree it's error prone FWIW. On x86, my mental model of how
>>>> it is supposed to work is: If a page is 4k and NP it cannot fail to be
>>>> remapped. set_direct_map_invalid_noflush() should result in 4k NP
>>>> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
>>>> map. Are you seeing this violated or do I have wrong assumptions?
>>>
>>> You are right, there is a set of assumptions about the remapping of the
>>> direct map pages that make it all work, at least on x86.
>>> But this is very subtle and it's not easy to wrap one's head around
>>> this.
>>>
>>> That's why putting __kernel_map_pages() out of "common" use and
>>> keep it only for DEBUG_PAGEALLOC would make things clearer.
>>>
>>>> Beyond whatever you are seeing, for the latter case of new things
>>>> getting introduced to an interface with hidden dependencies... Another
>>>> edge case could be a new caller to set_memory_np() could result in
>>>> large NP pages. None of the callers today should cause this AFAICT, but
>>>> it's not great to rely on the callers to know these details.
>>> A caller of set_memory_*() or set_direct_map_*() should expect a failure
>>> and be ready for that. So adding a WARN to safe_copy_page() is the first
>>> step in that direction :)
>>>
>>
>> I am probably missing something important, but why are we saving/restoring
>> the content of pages that were explicitly removed from the identity mapping
>> such that nobody will access them?
>>
>> Pages that are not allocated should contain garbage or be zero
>> (init_on_free). That should be easy to handle without ever reading the page
>> content.
> 
> I'm not familiar with hibernation to say anything smart here, but the
> help text of DEBUG_PAGEALLOC in Kconfig says:
> 
> 	... this option cannot be enabled in combination with
> 	hibernation as that would result in incorrect warnings of memory
> 	corruption after a resume because free pages are not saved to
> 	the suspend image.
> 
> Probably you are right and free pages need to be handled differently,
> but it does not seem the case now.
> 
>> The other user seems to be vm_remove_mappings(), where we only *temporarily*
>> remove the mapping - while hibernating, that code shouldn't be active
>> anymore I guess - or we could protect it from happening.
> 
> Hmm, I _think_ vm_remove_mappings() shouldn't be active while
> hibernating, but I'm not 100% sure.
> 
>> As I expressed in another mail, secretmem pages should rather not be saved
>> when hibernating - hibernation should be rather be disabled.
> 
> Agree.
> 
>> What am I missing?
> 
> I think I miscommunicated the purpose of this set, which was to hide
> __kernel_map_pages() under DEBUG_PAGEALLOC and make hibernation use
> set_direct_map_*() explictly without major rework of free pages handling
> during hibernation.
> 
> Does it help?
> 

Heh, as always, once you touch questionable code, people will beg for 
proper cleanups instead :)


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-27 10:34               ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-27 10:34 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On 27.10.20 10:47, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
>> On 27.10.20 09:38, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>>>> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
>>>>> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
>>>>>> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
>>>>>>> Indeed, for architectures that define
>>>>>>> CONFIG_ARCH_HAS_SET_DIRECT_MAP
>>>>>>> it is
>>>>>>> possible that __kernel_map_pages() would fail, but since this
>>>>>>> function is
>>>>>>> void, the failure will go unnoticed.
>>>>>>
>>>>>> Could you elaborate on how this could happen? Do you mean during
>>>>>> runtime today or if something new was introduced?
>>>>>
>>>>> A failure in__kernel_map_pages() may happen today. For instance, on
>>>>> x86
>>>>> if the kernel is built with DEBUG_PAGEALLOC.
>>>>>
>>>>>           __kernel_map_pages(page, 1, 0);
>>>>>
>>>>> will need to split, say, 2M page and during the split an allocation
>>>>> of
>>>>> page table could fail.
>>>>
>>>> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
>>>> on the direct map and even disables locking in cpa because it assumes
>>>> this. If this is happening somehow anyway then we should probably fix
>>>> that. Even if it's a debug feature, it will not be as useful if it is
>>>> causing its own crashes.
>>>>
>>>> I'm still wondering if there is something I'm missing here. It seems
>>>> like you are saying there is a bug in some arch's, so let's add a WARN
>>>> in cross-arch code to log it as it crashes. A warn and making things
>>>> clearer seem like good ideas, but if there is a bug we should fix it.
>>>> The code around the callers still functionally assume re-mapping can't
>>>> fail.
>>>
>>> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
>>> that unmaps pages back in safe_copy_page will just reset a 4K page to
>>> NP because whatever made it NP at the first place already did the split.
>>>
>>> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
>>> between map/unmap dance in __vunmap() and safe_copy_page() that may
>>> cause access to unmapped memory:
>>>
>>> __vunmap()
>>>       vm_remove_mappings()
>>>           set_direct_map_invalid()
>>> 					safe_copy_page()	
>>> 					    __kernel_map_pages()
>>> 					    	return
>>> 					    do_copy_page() -> fault
>>> 					   	
>>> This is a theoretical bug, but it is still not nice :) 							
>>>
>>>>> Currently, the only user of __kernel_map_pages() outside
>>>>> DEBUG_PAGEALLOC
>>>>> is hibernation, but I think it would be safer to entirely prevent
>>>>> usage
>>>>> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
>>>>
>>>> I totally agree it's error prone FWIW. On x86, my mental model of how
>>>> it is supposed to work is: If a page is 4k and NP it cannot fail to be
>>>> remapped. set_direct_map_invalid_noflush() should result in 4k NP
>>>> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
>>>> map. Are you seeing this violated or do I have wrong assumptions?
>>>
>>> You are right, there is a set of assumptions about the remapping of the
>>> direct map pages that make it all work, at least on x86.
>>> But this is very subtle and it's not easy to wrap one's head around
>>> this.
>>>
>>> That's why putting __kernel_map_pages() out of "common" use and
>>> keep it only for DEBUG_PAGEALLOC would make things clearer.
>>>
>>>> Beyond whatever you are seeing, for the latter case of new things
>>>> getting introduced to an interface with hidden dependencies... Another
>>>> edge case could be a new caller to set_memory_np() could result in
>>>> large NP pages. None of the callers today should cause this AFAICT, but
>>>> it's not great to rely on the callers to know these details.
>>> A caller of set_memory_*() or set_direct_map_*() should expect a failure
>>> and be ready for that. So adding a WARN to safe_copy_page() is the first
>>> step in that direction :)
>>>
>>
>> I am probably missing something important, but why are we saving/restoring
>> the content of pages that were explicitly removed from the identity mapping
>> such that nobody will access them?
>>
>> Pages that are not allocated should contain garbage or be zero
>> (init_on_free). That should be easy to handle without ever reading the page
>> content.
> 
> I'm not familiar with hibernation to say anything smart here, but the
> help text of DEBUG_PAGEALLOC in Kconfig says:
> 
> 	... this option cannot be enabled in combination with
> 	hibernation as that would result in incorrect warnings of memory
> 	corruption after a resume because free pages are not saved to
> 	the suspend image.
> 
> Probably you are right and free pages need to be handled differently,
> but it does not seem the case now.
> 
>> The other user seems to be vm_remove_mappings(), where we only *temporarily*
>> remove the mapping - while hibernating, that code shouldn't be active
>> anymore I guess - or we could protect it from happening.
> 
> Hmm, I _think_ vm_remove_mappings() shouldn't be active while
> hibernating, but I'm not 100% sure.
> 
>> As I expressed in another mail, secretmem pages should rather not be saved
>> when hibernating - hibernation should be rather be disabled.
> 
> Agree.
> 
>> What am I missing?
> 
> I think I miscommunicated the purpose of this set, which was to hide
> __kernel_map_pages() under DEBUG_PAGEALLOC and make hibernation use
> set_direct_map_*() explictly without major rework of free pages handling
> during hibernation.
> 
> Does it help?
> 

Heh, as always, once you touch questionable code, people will beg for 
proper cleanups instead :)


-- 
Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-27 10:34               ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-27 10:34 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: benh, peterz, catalin.marinas, dave.hansen, linux-mm, paulus,
	pavel, hpa, sparclinux, cl, will, linux-riscv, linux-s390, mpe,
	x86, rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, Edgecombe, Rick P, linuxppc-dev, davem

On 27.10.20 10:47, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
>> On 27.10.20 09:38, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>>>> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
>>>>> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
>>>>>> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
>>>>>>> Indeed, for architectures that define
>>>>>>> CONFIG_ARCH_HAS_SET_DIRECT_MAP
>>>>>>> it is
>>>>>>> possible that __kernel_map_pages() would fail, but since this
>>>>>>> function is
>>>>>>> void, the failure will go unnoticed.
>>>>>>
>>>>>> Could you elaborate on how this could happen? Do you mean during
>>>>>> runtime today or if something new was introduced?
>>>>>
>>>>> A failure in__kernel_map_pages() may happen today. For instance, on
>>>>> x86
>>>>> if the kernel is built with DEBUG_PAGEALLOC.
>>>>>
>>>>>           __kernel_map_pages(page, 1, 0);
>>>>>
>>>>> will need to split, say, 2M page and during the split an allocation
>>>>> of
>>>>> page table could fail.
>>>>
>>>> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
>>>> on the direct map and even disables locking in cpa because it assumes
>>>> this. If this is happening somehow anyway then we should probably fix
>>>> that. Even if it's a debug feature, it will not be as useful if it is
>>>> causing its own crashes.
>>>>
>>>> I'm still wondering if there is something I'm missing here. It seems
>>>> like you are saying there is a bug in some arch's, so let's add a WARN
>>>> in cross-arch code to log it as it crashes. A warn and making things
>>>> clearer seem like good ideas, but if there is a bug we should fix it.
>>>> The code around the callers still functionally assume re-mapping can't
>>>> fail.
>>>
>>> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
>>> that unmaps pages back in safe_copy_page will just reset a 4K page to
>>> NP because whatever made it NP at the first place already did the split.
>>>
>>> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
>>> between map/unmap dance in __vunmap() and safe_copy_page() that may
>>> cause access to unmapped memory:
>>>
>>> __vunmap()
>>>       vm_remove_mappings()
>>>           set_direct_map_invalid()
>>> 					safe_copy_page()	
>>> 					    __kernel_map_pages()
>>> 					    	return
>>> 					    do_copy_page() -> fault
>>> 					   	
>>> This is a theoretical bug, but it is still not nice :) 							
>>>
>>>>> Currently, the only user of __kernel_map_pages() outside
>>>>> DEBUG_PAGEALLOC
>>>>> is hibernation, but I think it would be safer to entirely prevent
>>>>> usage
>>>>> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
>>>>
>>>> I totally agree it's error prone FWIW. On x86, my mental model of how
>>>> it is supposed to work is: If a page is 4k and NP it cannot fail to be
>>>> remapped. set_direct_map_invalid_noflush() should result in 4k NP
>>>> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
>>>> map. Are you seeing this violated or do I have wrong assumptions?
>>>
>>> You are right, there is a set of assumptions about the remapping of the
>>> direct map pages that make it all work, at least on x86.
>>> But this is very subtle and it's not easy to wrap one's head around
>>> this.
>>>
>>> That's why putting __kernel_map_pages() out of "common" use and
>>> keep it only for DEBUG_PAGEALLOC would make things clearer.
>>>
>>>> Beyond whatever you are seeing, for the latter case of new things
>>>> getting introduced to an interface with hidden dependencies... Another
>>>> edge case could be a new caller to set_memory_np() could result in
>>>> large NP pages. None of the callers today should cause this AFAICT, but
>>>> it's not great to rely on the callers to know these details.
>>> A caller of set_memory_*() or set_direct_map_*() should expect a failure
>>> and be ready for that. So adding a WARN to safe_copy_page() is the first
>>> step in that direction :)
>>>
>>
>> I am probably missing something important, but why are we saving/restoring
>> the content of pages that were explicitly removed from the identity mapping
>> such that nobody will access them?
>>
>> Pages that are not allocated should contain garbage or be zero
>> (init_on_free). That should be easy to handle without ever reading the page
>> content.
> 
> I'm not familiar with hibernation to say anything smart here, but the
> help text of DEBUG_PAGEALLOC in Kconfig says:
> 
> 	... this option cannot be enabled in combination with
> 	hibernation as that would result in incorrect warnings of memory
> 	corruption after a resume because free pages are not saved to
> 	the suspend image.
> 
> Probably you are right and free pages need to be handled differently,
> but it does not seem the case now.
> 
>> The other user seems to be vm_remove_mappings(), where we only *temporarily*
>> remove the mapping - while hibernating, that code shouldn't be active
>> anymore I guess - or we could protect it from happening.
> 
> Hmm, I _think_ vm_remove_mappings() shouldn't be active while
> hibernating, but I'm not 100% sure.
> 
>> As I expressed in another mail, secretmem pages should rather not be saved
>> when hibernating - hibernation should be rather be disabled.
> 
> Agree.
> 
>> What am I missing?
> 
> I think I miscommunicated the purpose of this set, which was to hide
> __kernel_map_pages() under DEBUG_PAGEALLOC and make hibernation use
> set_direct_map_*() explictly without major rework of free pages handling
> during hibernation.
> 
> Does it help?
> 

Heh, as always, once you touch questionable code, people will beg for 
proper cleanups instead :)


-- 
Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-27 10:34               ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-27 10:34 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: benh, peterz, catalin.marinas, dave.hansen, linux-mm, paulus,
	pavel, hpa, sparclinux, cl, will, linux-riscv, linux-s390, mpe,
	x86, rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, Edgecombe, Rick P, linuxppc-dev, davem

On 27.10.20 10:47, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
>> On 27.10.20 09:38, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>>>> On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
>>>>> On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
>>>>>> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
>>>>>>> Indeed, for architectures that define
>>>>>>> CONFIG_ARCH_HAS_SET_DIRECT_MAP
>>>>>>> it is
>>>>>>> possible that __kernel_map_pages() would fail, but since this
>>>>>>> function is
>>>>>>> void, the failure will go unnoticed.
>>>>>>
>>>>>> Could you elaborate on how this could happen? Do you mean during
>>>>>> runtime today or if something new was introduced?
>>>>>
>>>>> A failure in__kernel_map_pages() may happen today. For instance, on
>>>>> x86
>>>>> if the kernel is built with DEBUG_PAGEALLOC.
>>>>>
>>>>>           __kernel_map_pages(page, 1, 0);
>>>>>
>>>>> will need to split, say, 2M page and during the split an allocation
>>>>> of
>>>>> page table could fail.
>>>>
>>>> On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
>>>> on the direct map and even disables locking in cpa because it assumes
>>>> this. If this is happening somehow anyway then we should probably fix
>>>> that. Even if it's a debug feature, it will not be as useful if it is
>>>> causing its own crashes.
>>>>
>>>> I'm still wondering if there is something I'm missing here. It seems
>>>> like you are saying there is a bug in some arch's, so let's add a WARN
>>>> in cross-arch code to log it as it crashes. A warn and making things
>>>> clearer seem like good ideas, but if there is a bug we should fix it.
>>>> The code around the callers still functionally assume re-mapping can't
>>>> fail.
>>>
>>> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
>>> that unmaps pages back in safe_copy_page will just reset a 4K page to
>>> NP because whatever made it NP at the first place already did the split.
>>>
>>> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
>>> between map/unmap dance in __vunmap() and safe_copy_page() that may
>>> cause access to unmapped memory:
>>>
>>> __vunmap()
>>>       vm_remove_mappings()
>>>           set_direct_map_invalid()
>>> 					safe_copy_page()	
>>> 					    __kernel_map_pages()
>>> 					    	return
>>> 					    do_copy_page() -> fault
>>> 					   	
>>> This is a theoretical bug, but it is still not nice :) 							
>>>
>>>>> Currently, the only user of __kernel_map_pages() outside
>>>>> DEBUG_PAGEALLOC
>>>>> is hibernation, but I think it would be safer to entirely prevent
>>>>> usage
>>>>> of __kernel_map_pages() when DEBUG_PAGEALLOC=n.
>>>>
>>>> I totally agree it's error prone FWIW. On x86, my mental model of how
>>>> it is supposed to work is: If a page is 4k and NP it cannot fail to be
>>>> remapped. set_direct_map_invalid_noflush() should result in 4k NP
>>>> pages, and DEBUG_PAGEALLOC should result in all 4k pages on the direct
>>>> map. Are you seeing this violated or do I have wrong assumptions?
>>>
>>> You are right, there is a set of assumptions about the remapping of the
>>> direct map pages that make it all work, at least on x86.
>>> But this is very subtle and it's not easy to wrap one's head around
>>> this.
>>>
>>> That's why putting __kernel_map_pages() out of "common" use and
>>> keep it only for DEBUG_PAGEALLOC would make things clearer.
>>>
>>>> Beyond whatever you are seeing, for the latter case of new things
>>>> getting introduced to an interface with hidden dependencies... Another
>>>> edge case could be a new caller to set_memory_np() could result in
>>>> large NP pages. None of the callers today should cause this AFAICT, but
>>>> it's not great to rely on the callers to know these details.
>>> A caller of set_memory_*() or set_direct_map_*() should expect a failure
>>> and be ready for that. So adding a WARN to safe_copy_page() is the first
>>> step in that direction :)
>>>
>>
>> I am probably missing something important, but why are we saving/restoring
>> the content of pages that were explicitly removed from the identity mapping
>> such that nobody will access them?
>>
>> Pages that are not allocated should contain garbage or be zero
>> (init_on_free). That should be easy to handle without ever reading the page
>> content.
> 
> I'm not familiar with hibernation to say anything smart here, but the
> help text of DEBUG_PAGEALLOC in Kconfig says:
> 
> 	... this option cannot be enabled in combination with
> 	hibernation as that would result in incorrect warnings of memory
> 	corruption after a resume because free pages are not saved to
> 	the suspend image.
> 
> Probably you are right and free pages need to be handled differently,
> but it does not seem to be the case now.
> 
>> The other user seems to be vm_remove_mappings(), where we only *temporarily*
>> remove the mapping - while hibernating, that code shouldn't be active
>> anymore I guess - or we could protect it from happening.
> 
> Hmm, I _think_ vm_remove_mappings() shouldn't be active while
> hibernating, but I'm not 100% sure.
> 
>> As I expressed in another mail, secretmem pages should rather not be saved
>> when hibernating - hibernation should rather be disabled.
> 
> Agree.
> 
>> What am I missing?
> 
> I think I miscommunicated the purpose of this set, which was to hide
> __kernel_map_pages() under DEBUG_PAGEALLOC and make hibernation use
> set_direct_map_*() explicitly, without a major rework of free page handling
> during hibernation.
> 
> Does it help?
> 

Heh, as always, once you touch questionable code, people will beg for 
proper cleanups instead :)


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-27  8:49           ` Mike Rapoport
                               ` (3 preceding siblings ...)
  (?)
@ 2020-10-27 22:44             ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-27 22:44 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, linux-riscv, tglx, rjw,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Tue, 2020-10-27 at 10:49 +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 06:57:32PM +0000, Edgecombe, Rick P wrote:
> > On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P
> > > wrote:
> > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > > 
> > > > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a
> > > > > page
> > > > > may
> > > > > be
> > > > > not present in the direct map and has to be explicitly mapped
> > > > > before
> > > > > it
> > > > > could be copied.
> > > > > 
> > > > > On arm64 it is possible that a page would be removed from the
> > > > > direct
> > > > > map
> > > > > using set_direct_map_invalid_noflush() but
> > > > > __kernel_map_pages()
> > > > > will
> > > > > refuse
> > > > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > > > 
> > > > It looks to me that arm64 __kernel_map_pages() will still
> > > > attempt
> > > > to
> > > > map it if rodata_full is true, how does this happen?
> > > 
> > > Unless I misread the code, arm64 requires both rodata_full and
> > > debug_pagealloc_enabled() to be true for __kernel_map_pages() to
> > > do
> > > anything.
> > > But the rodata_full condition applies to set_direct_map_*_noflush()
> > > as well, so with !rodata_full the linear map won't ever be changed.
> > 
> > Hmm, looks to me that __kernel_map_pages() will only skip it if
> > both
> > debug pagealloc and rodata_full are false.
> > 
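
For reference, the arm64 check the two readings above refer to was
approximately the following (arch/arm64/mm/pageattr.c of that era); it
returns early only when debug_pagealloc and rodata_full are *both* false:

	void __kernel_map_pages(struct page *page, int numpages, int enable)
	{
		if (!debug_pagealloc_enabled() && !rodata_full)
			return;

		set_memory_valid((unsigned long)page_address(page),
				 numpages, enable);
	}
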
> > But now I'm wondering if maybe we could simplify things by just
> > moving
> > the hibernate unmapped page logic off of the direct map. On x86,
> > text_poke() used to use this reserved fixmap pte thing that it
> > could
> > rely on to remap memory with. If hibernate had some separate pte
> > for
> > remapping like that, then we could not have any direct map
> > restrictions
> > caused by it/kernel_map_pages(), and it wouldn't have to worry
> > about
> > relying on anything else.
> 
> Well, there is map_kernel_range() that can be used by hibernation as
> there is no requirement for particular virtual address, but that
> would
> be quite costly if done for every page.
> 
> Maybe we can do something like
> 
> 	if (kernel_page_present(s_page)) {
> 		do_copy_page(dst, page_address(s_page));
> 	} else {
> 		map_kernel_range_noflush((unsigned long)page_address(s_page),
> 					 PAGE_SIZE, PAGE_KERNEL_RO, &s_page);
> 		do_copy_page(dst, page_address(s_page));
> 		unmap_kernel_range_noflush((unsigned long)page_address(s_page),
> 					   PAGE_SIZE);
> 	}
> 
> But it seems that a prerequisite for changing the way a page is
> mapped
> in safe_copy_page() would be to teach hibernation that a mapping here
> may fail.
> 
Yea that is what I meant, the direct map could still be used for mapped
pages.

But for the unmapped case it could have a pre-setup 4k pte for some non
direct map address. Then just change the pte to point to any unmapped
direct map page that was encountered. The point would be to give
hibernate some 4k pte of its own to manipulate so that it can't fail.

Yet another option would be to have hibernate_map_page() just map large
pages if it finds them.

So we could teach hibernate to handle mapping failures, OR we could
change it so it doesn't rely on direct map page sizes in order to
succeed. The latter seems better to me since there isn't a reason why
it should have to fail and the resulting logic might be simpler. Both
seem like improvements in robustness though.
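
A minimal sketch of the first idea (a pte reserved for hibernate up front),
assuming a dedicated fixmap slot were added for it - FIX_HIBERNATE_COPY and
copy_unmapped_page() below are made-up names, nothing like this exists in the
tree today:

	/*
	 * Copy a page that is not present in the direct map by pointing a
	 * pre-reserved fixmap pte at it.  The pte already exists, so this
	 * cannot fail on a page table allocation and does not depend on
	 * the direct map page size.
	 */
	static void copy_unmapped_page(void *dst, struct page *s_page)
	{
		set_fixmap(FIX_HIBERNATE_COPY, page_to_phys(s_page));
		copy_page(dst, (void *)fix_to_virt(FIX_HIBERNATE_COPY));
		clear_fixmap(FIX_HIBERNATE_COPY);
	}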


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-27 22:44             ` Edgecombe, Rick P
                                 ` (3 preceding siblings ...)
  (?)
@ 2020-10-28  9:41               ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28  9:41 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, linux-riscv, tglx, rjw,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Tue, Oct 27, 2020 at 10:44:21PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2020-10-27 at 10:49 +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:57:32PM +0000, Edgecombe, Rick P wrote:
> > > On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P
> > > > wrote:
> > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > > > 
> > > > > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a
> > > > > > page
> > > > > > may
> > > > > > be
> > > > > > not present in the direct map and has to be explicitly mapped
> > > > > > before
> > > > > > it
> > > > > > could be copied.
> > > > > > 
> > > > > > On arm64 it is possible that a page would be removed from the
> > > > > > direct
> > > > > > map
> > > > > > using set_direct_map_invalid_noflush() but
> > > > > > __kernel_map_pages()
> > > > > > will
> > > > > > refuse
> > > > > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > > > > 
> > > > > It looks to me that arm64 __kernel_map_pages() will still
> > > > > attempt
> > > > > to
> > > > > map it if rodata_full is true, how does this happen?
> > > > 
> > > > Unless I misread the code, arm64 requires both rodata_full and
> > > > debug_pagealloc_enabled() to be true for __kernel_map_pages() to
> > > > do
> > > > anything.
> > > > But the rodata_full condition applies to set_direct_map_*_noflush()
> > > > as well, so with !rodata_full the linear map won't ever be changed.
> > > 
> > > Hmm, looks to me that __kernel_map_pages() will only skip it if
> > > both
> > > debug pagealloc and rodata_full are false.
> > > 
> > > But now I'm wondering if maybe we could simplify things by just
> > > moving
> > > the hibernate unmapped page logic off of the direct map. On x86,
> > > text_poke() used to use this reserved fixmap pte thing that it
> > > could
> > > rely on to remap memory with. If hibernate had some separate pte
> > > for
> > > remapping like that, then we could not have any direct map
> > > restrictions
> > > caused by it/kernel_map_pages(), and it wouldn't have to worry
> > > about
> > > relying on anything else.
> > 
> > Well, there is map_kernel_range() that can be used by hibernation as
> > there is no requirement for particular virtual address, but that
> > would
> > be quite costly if done for every page.
> > 
> > Maybe we can do something like
> > 
> > 	if (kernel_page_present(s_page)) {
> > 		do_copy_page(dst, page_address(s_page));
> > 	} else {
> > 		map_kernel_range_noflush((unsigned long)page_address(s_page),
> > 					 PAGE_SIZE, PAGE_KERNEL_RO, &s_page);
> > 		do_copy_page(dst, page_address(s_page));
> > 		unmap_kernel_range_noflush((unsigned long)page_address(s_page),
> > 					   PAGE_SIZE);
> > 	}
> > 
> > But it seems that a prerequisite for changing the way a page is
> > mapped
> > in safe_copy_page() would be to teach hibernation that a mapping here
> > may fail.
> > 
> Yea that is what I meant, the direct map could still be used for mapped
> pages.
> 
> But for the unmapped case it could have a pre-setup 4k pte for some non
> direct map address. Then just change the pte to point to any unmapped
> direct map page that was encountered. The point would be to give
> hibernate some 4k pte of its own to manipulate so that it can't fail.
> 
> Yet another option would be to have hibernate_map_page() just map large
> pages if it finds them.
> 
> So we could teach hibernate to handle mapping failures, OR we could
> change it so it doesn't rely on direct map page sizes in order to
> succeed. The latter seems better to me since there isn't a reason why
> it should have to fail and the resulting logic might be simpler. Both
> seem like improvements in robustness though.

That's correct, but as the purpose of this series is to prevent usage of
__kernel_map_pages() outside DEBUG_PAGEALLOC, for now I'm going to update this
patch's changelog and add a comment to hibernate_map_page().
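
The direction described here - a hibernate_map_page() helper that calls
set_direct_map_default_noflush() when ARCH_HAS_SET_DIRECT_MAP is available
and warns instead of failing silently, and otherwise only touches the direct
map for DEBUG_PAGEALLOC - could look roughly like the sketch below (an
approximation, not the actual patch; debug_pagealloc_map_pages() stands for
"__kernel_map_pages() guarded by DEBUG_PAGEALLOC"):

	static void hibernate_map_page(struct page *page)
	{
		if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
			unsigned long addr = (unsigned long)page_address(page);

			/*
			 * Remapping here only restores the default protection
			 * bits in an existing pte, so it is not expected to
			 * fail; warn if that assumption is ever violated.
			 */
			if (set_direct_map_default_noflush(page))
				pr_warn_once("Failed to remap page\n");

			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
		} else {
			debug_pagealloc_map_pages(page, 1);
		}
	}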

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-28  9:41               ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28  9:41 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: david, peterz, catalin.marinas, dave.hansen, linux-mm, paulus,
	pavel, hpa, sparclinux, cl, will, linux-riscv, linux-s390, x86,
	rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, linuxppc-dev, davem

On Tue, Oct 27, 2020 at 10:44:21PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2020-10-27 at 10:49 +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:57:32PM +0000, Edgecombe, Rick P wrote:
> > > On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P
> > > > wrote:
> > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > > > 
> > > > > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a
> > > > > > page
> > > > > > may
> > > > > > be
> > > > > > not present in the direct map and has to be explicitly mapped
> > > > > > before
> > > > > > it
> > > > > > could be copied.
> > > > > > 
> > > > > > On arm64 it is possible that a page would be removed from the
> > > > > > direct
> > > > > > map
> > > > > > using set_direct_map_invalid_noflush() but
> > > > > > __kernel_map_pages()
> > > > > > will
> > > > > > refuse
> > > > > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > > > > 
> > > > > It looks to me that arm64 __kernel_map_pages() will still
> > > > > attempt
> > > > > to
> > > > > map it if rodata_full is true, how does this happen?
> > > > 
> > > > Unless I misread the code, arm64 requires both rodata_full and
> > > > debug_pagealloc_enabled() to be true for __kernel_map_pages() to
> > > > do
> > > > anything.
> > > > But rodata_full condition applies to set_direct_map_*_noflush()
> > > > as
> > > > well,
> > > > so with !rodata_full the linear map won't be ever changed.
> > > 
> > > Hmm, looks to me that __kernel_map_pages() will only skip it if
> > > both
> > > debug pagealloc and rodata_full are false.
> > > 
> > > But now I'm wondering if maybe we could simplify things by just
> > > moving
> > > the hibernate unmapped page logic off of the direct map. On x86,
> > > text_poke() used to use this reserved fixmap pte thing that it
> > > could
> > > rely on to remap memory with. If hibernate had some separate pte
> > > for
> > > remapping like that, then we could not have any direct map
> > > restrictions
> > > caused by it/kernel_map_pages(), and it wouldn't have to worry
> > > about
> > > relying on anything else.
> > 
> > Well, there is map_kernel_range() that can be used by hibernation as
> > there is no requirement for particular virtual address, but that
> > would
> > be quite costly if done for every page.
> > 
> > Maybe we can do somthing like
> > 
> > 	if (kernel_page_present(s_page)) {
> > 		do_copy_page(dst, page_address(s_page));
> > 	} else {
> > 		map_kernel_range_noflush(page_address(page), PAGE_SIZE,
> > 					 PROT_READ, &page);
> > 		do_copy_page(dst, page_address(s_page));
> > 		unmap_kernel_range_noflush(page_address(page),
> > PAGE_SIZE);
> > 	}
> > 
> > But it seems that a prerequisite for changing the way a page is
> > mapped
> > in safe_copy_page() would be to teach hibernation that a mapping here
> > may fail.
> > 
> Yea that is what I meant, the direct map could still be used for mapped
> pages.
> 
> But for the unmapped case it could have a pre-setup 4k pte for some non
> direct map address. Then just change the pte to point to any unmapped
> direct map page that was encountered. The point would be to give
> hibernate some 4k pte of its own to manipulate so that it can't fail.
> 
> Yet another option would be have hibernate_map_page() just map large
> pages if it finds them.
> 
> So we could teach hibernate to handle mapping failures, OR we could
> change it so it doesn't rely on direct map page sizes in order to
> succeed. The latter seems better to me since there isn't a reason why
> it should have to fail and the resulting logic might be simpler. Both
> seem like improvements in robustness though.

That's correct, but as the purpose of this series is to prevent usage of
__kernel_map_pages() outside DEBUG_PAGALLOC, for now I'm going to update this
patch changelog and add a comment to hibernate_map_page().

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
@ 2020-10-28  9:41               ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28  9:41 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, will, linux-riscv,
	linux-s390, mpe, x86, rppt, borntraeger, mingo, rientjes, Brown,
	Len, aou, gor, linux-pm, hca, bp, luto, paul.walmsley, kirill,
	tglx, iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel,
	penberg, palmer, akpm, linuxppc-dev, davem

On Tue, Oct 27, 2020 at 10:44:21PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2020-10-27 at 10:49 +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:57:32PM +0000, Edgecombe, Rick P wrote:
> > > On Mon, 2020-10-26 at 11:15 +0200, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 12:38:32AM +0000, Edgecombe, Rick P
> > > > wrote:
> > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > From: Mike Rapoport <rppt@linux.ibm.com>
> > > > > > 
> > > > > > When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a
> > > > > > page
> > > > > > may
> > > > > > be
> > > > > > not present in the direct map and has to be explicitly mapped
> > > > > > before
> > > > > > it
> > > > > > could be copied.
> > > > > > 
> > > > > > On arm64 it is possible that a page would be removed from the
> > > > > > direct
> > > > > > map
> > > > > > using set_direct_map_invalid_noflush() but
> > > > > > __kernel_map_pages()
> > > > > > will
> > > > > > refuse
> > > > > > to map this page back if DEBUG_PAGEALLOC is disabled.
> > > > > 
> > > > > It looks to me that arm64 __kernel_map_pages() will still
> > > > > attempt
> > > > > to
> > > > > map it if rodata_full is true, how does this happen?
> > > > 
> > > > Unless I misread the code, arm64 requires both rodata_full and
> > > > debug_pagealloc_enabled() to be true for __kernel_map_pages() to
> > > > do
> > > > anything.
> > > > But rodata_full condition applies to set_direct_map_*_noflush()
> > > > as
> > > > well,
> > > > so with !rodata_full the linear map won't be ever changed.
> > > 
> > > Hmm, looks to me that __kernel_map_pages() will only skip it if
> > > both
> > > debug pagealloc and rodata_full are false.
> > > 
> > > But now I'm wondering if maybe we could simplify things by just
> > > moving
> > > the hibernate unmapped page logic off of the direct map. On x86,
> > > text_poke() used to use this reserved fixmap pte thing that it
> > > could
> > > rely on to remap memory with. If hibernate had some separate pte
> > > for
> > > remapping like that, then we could not have any direct map
> > > restrictions
> > > caused by it/kernel_map_pages(), and it wouldn't have to worry
> > > about
> > > relying on anything else.
> > 
> > Well, there is map_kernel_range() that can be used by hibernation as
> > there is no requirement for particular virtual address, but that
> > would
> > be quite costly if done for every page.
> > 
> > Maybe we can do somthing like
> > 
> > 	if (kernel_page_present(s_page)) {
> > 		do_copy_page(dst, page_address(s_page));
> > 	} else {
> > 		map_kernel_range_noflush(page_address(page), PAGE_SIZE,
> > 					 PROT_READ, &page);
> > 		do_copy_page(dst, page_address(s_page));
> > 		unmap_kernel_range_noflush(page_address(page),
> > PAGE_SIZE);
> > 	}
> > 
> > But it seems that a prerequisite for changing the way a page is
> > mapped
> > in safe_copy_page() would be to teach hibernation that a mapping here
> > may fail.
> > 
> Yea that is what I meant, the direct map could still be used for mapped
> pages.
> 
> But for the unmapped case it could have a pre-setup 4k pte for some non
> direct map address. Then just change the pte to point to any unmapped
> direct map page that was encountered. The point would be to give
> hibernate some 4k pte of its own to manipulate so that it can't fail.
> 
> Yet another option would be have hibernate_map_page() just map large
> pages if it finds them.
> 
> So we could teach hibernate to handle mapping failures, OR we could
> change it so it doesn't rely on direct map page sizes in order to
> succeed. The latter seems better to me since there isn't a reason why
> it should have to fail and the resulting logic might be simpler. Both
> seem like improvements in robustness though.

That's correct, but as the purpose of this series is to prevent usage of
__kernel_map_pages() outside DEBUG_PAGALLOC, for now I'm going to update this
patch changelog and add a comment to hibernate_map_page().

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-27  8:46           ` David Hildenbrand
@ 2020-10-28 11:09             ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 11:09 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> On 27.10.20 09:38, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > 
> > > Beyond whatever you are seeing, for the latter case of new things
> > > getting introduced to an interface with hidden dependencies... Another
> > > edge case could be that a new caller of set_memory_np() results in
> > > large NP pages. None of the callers today should cause this AFAICT, but
> > > it's not great to rely on the callers to know these details.

> > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > step in that direction :)
> > 
> 
> I am probably missing something important, but why are we saving/restoring
> the content of pages that were explicitly removed from the identity mapping
> such that nobody will access them?

Actually, we should not be saving/restoring free pages during
hibernation as there are several calls to mark_free_pages() that should
exclude the free pages from the snapshot. I've tried to find why the fix
that maps/unmaps a page to save it was required in the first place, but
I could not find bug reports.

The closest I've got is an email from Rafael that asked to update
"hibernate: handle DEBUG_PAGEALLOC" patch:

https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/

Could it be that safe_copy_page() tries to work around a non-existent
problem?
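
For reference, a sketch of the safe_copy_page() logic in question, roughly
as this series reworks it in kernel/power/snapshot.c (the
hibernate_map_page()/hibernate_unmap_page() helpers wrap the map/unmap
dance around the existing do_copy_page(); the exact code may differ):

	static void safe_copy_page(void *dst, struct page *s_page)
	{
		if (kernel_page_present(s_page)) {
			do_copy_page(dst, page_address(s_page));
		} else {
			/*
			 * The page is not in the direct map: map it
			 * temporarily, copy it, then restore the previous
			 * state.
			 */
			hibernate_map_page(s_page);
			do_copy_page(dst, page_address(s_page));
			hibernate_unmap_page(s_page);
		}
	}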

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 11:09             ` Mike Rapoport
@ 2020-10-28 11:17               ` David Hildenbrand
  0 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-28 11:17 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On 28.10.20 12:09, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
>> On 27.10.20 09:38, Mike Rapoport wrote:
>>> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
>>>
>>>> Beyond whatever you are seeing, for the latter case of new things
>>>> getting introduced to an interface with hidden dependencies... Another
>>>> edge case could be that a new caller of set_memory_np() results in
>>>> large NP pages. None of the callers today should cause this AFAICT, but
>>>> it's not great to rely on the callers to know these details.
> 
>>> A caller of set_memory_*() or set_direct_map_*() should expect a failure
>>> and be ready for that. So adding a WARN to safe_copy_page() is the first
>>> step in that direction :)
>>>
>>
>> I am probably missing something important, but why are we saving/restoring
>> the content of pages that were explicitly removed from the identity mapping
>> such that nobody will access them?
> 
> Actually, we should not be saving/restoring free pages during
> hibernation as there are several calls to mark_free_pages() that should
> exclude the free pages from the snapshot. I've tried to find why the fix
> that maps/unmaps a page to save it was required in the first place, but
> I could not find bug reports.
> 
> The closest I've got is an email from Rafael that asked to update
> "hibernate: handle DEBUG_PAGEALLOC" patch:
> 
> https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> 
> Could it be that safe_copy_page() tries to work around a non-existent
> problem?
> 

Could be! Also see

https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz

which restores free page content based on more kernel parameters, not 
based on the original content.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-27  8:38         ` Mike Rapoport
@ 2020-10-28 11:20           ` Will Deacon
  0 siblings, 0 replies; 219+ messages in thread
From: Will Deacon @ 2020-10-28 11:20 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Edgecombe, Rick P, david, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > > it is possible that __kernel_map_pages() would fail, but since this
> > > > > function is void, the failure will go unnoticed.
> > > > 
> > > > Could you elaborate on how this could happen? Do you mean during
> > > > runtime today or if something new was introduced?
> > > 
> > > A failure in __kernel_map_pages() may happen today. For instance, on
> > > x86 if the kernel is built with DEBUG_PAGEALLOC.
> > > 
> > >         __kernel_map_pages(page, 1, 0);
> > > 
> > > will need to split, say, a 2M page, and during the split an allocation
> > > of a page table could fail.
> > 
> > On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
> > on the direct map and even disables locking in cpa because it assumes
> > this. If this is happening somehow anyway then we should probably fix
> > that. Even if it's a debug feature, it will not be as useful if it is
> > causing its own crashes.
> > 
> > I'm still wondering if there is something I'm missing here. It seems
> > like you are saying there is a bug in some arches, so let's add a WARN
> > in cross-arch code to log it as it crashes. A warn and making things
> > clearer seem like good ideas, but if there is a bug we should fix it.
> > The code around the callers still functionally assumes re-mapping can't
> > fail.
> 
> Oh, I meant an x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
> that unmaps pages back in safe_copy_page will just reset a 4K page to
> NP because whatever made it NP in the first place already did the split.
> 
> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
> between the map/unmap dance in __vunmap() and safe_copy_page() that may
> cause access to unmapped memory:
> 
> __vunmap()
>     vm_remove_mappings()
>         set_direct_map_invalid()
> 					safe_copy_page()	
> 					    __kernel_map_pages()
> 					    	return
> 					    do_copy_page() -> fault
> 					   	
> This is a theoretical bug, but it is still not nice :)

Just to clarify: this patch series fixes this problem, right?

Will

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-28 11:20           ` Will Deacon
  0 siblings, 0 replies; 219+ messages in thread
From: Will Deacon @ 2020-10-28 11:20 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: benh, david, peterz, catalin.marinas, dave.hansen, linux-mm,
	paulus, pavel, hpa, sparclinux, cl, linux-riscv, linux-s390, mpe,
	x86, rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, Edgecombe, Rick P, linuxppc-dev, davem

On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > Indeed, for architectures that define
> > > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > > it is
> > > > > possible that __kernel_map_pages() would fail, but since this
> > > > > function is
> > > > > void, the failure will go unnoticed.
> > > > 
> > > > Could you elaborate on how this could happen? Do you mean during
> > > > runtime today or if something new was introduced?
> > > 
> > > A failure in__kernel_map_pages() may happen today. For instance, on
> > > x86
> > > if the kernel is built with DEBUG_PAGEALLOC.
> > > 
> > >         __kernel_map_pages(page, 1, 0);
> > > 
> > > will need to split, say, 2M page and during the split an allocation
> > > of
> > > page table could fail.
> > 
> > On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
> > on the direct map and even disables locking in cpa because it assumes
> > this. If this is happening somehow anyway then we should probably fix
> > that. Even if it's a debug feature, it will not be as useful if it is
> > causing its own crashes.
> > 
> > I'm still wondering if there is something I'm missing here. It seems
> > like you are saying there is a bug in some arch's, so let's add a WARN
> > in cross-arch code to log it as it crashes. A warn and making things
> > clearer seem like good ideas, but if there is a bug we should fix it.
> > The code around the callers still functionally assume re-mapping can't
> > fail.
> 
> Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
> that unmaps pages back in safe_copy_page will just reset a 4K page to
> NP because whatever made it NP at the first place already did the split.
> 
> Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
> between map/unmap dance in __vunmap() and safe_copy_page() that may
> cause access to unmapped memory:
> 
> __vunmap()
>     vm_remove_mappings()
>         set_direct_map_invalid()
> 					safe_copy_page()	
> 					    __kernel_map_pages()
> 					    	return
> 					    do_copy_page() -> fault
> 					   	
> This is a theoretical bug, but it is still not nice :) 							

Just to clarify: this patch series fixes this problem, right?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 11:20           ` Will Deacon
                               ` (3 preceding siblings ...)
  (?)
@ 2020-10-28 11:30             ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 11:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: Edgecombe, Rick P, david, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Wed, Oct 28, 2020 at 11:20:12AM +0000, Will Deacon wrote:
> On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P wrote:
> > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > Indeed, for architectures that define
> > > > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > > > it is
> > > > > > possible that __kernel_map_pages() would fail, but since this
> > > > > > function is
> > > > > > void, the failure will go unnoticed.
> > > > > 
> > > > > Could you elaborate on how this could happen? Do you mean during
> > > > > runtime today or if something new was introduced?
> > > > 
> > > > A failure in __kernel_map_pages() may happen today. For instance, on
> > > > x86, if the kernel is built with DEBUG_PAGEALLOC,
> > > > 
> > > >         __kernel_map_pages(page, 1, 0);
> > > > 
> > > > will need to split, say, a 2M page, and during the split an
> > > > allocation of a page table could fail.
> > > 
> > > On x86 at least, DEBUG_PAGEALLOC expects to never have to break a page
> > > on the direct map and even disables locking in cpa because it assumes
> > > this. If this is happening somehow anyway then we should probably fix
> > > that. Even if it's a debug feature, it will not be as useful if it is
> > > causing its own crashes.
> > > 
> > > I'm still wondering if there is something I'm missing here. It seems
> > > like you are saying there is a bug in some arches, so let's add a WARN
> > > in cross-arch code to log it as it crashes. A warn and making things
> > > clearer seem like good ideas, but if there is a bug we should fix it.
> > > The code around the callers still functionally assumes re-mapping can't
> > > fail.
> > 
> > Oh, I meant the x86 kernel *without* DEBUG_PAGEALLOC, and indeed the call
> > that unmaps pages back in safe_copy_page() will just reset a 4K page to
> > NP because whatever made it NP in the first place already did the split.
> > 
> > Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of a race
> > between map/unmap dance in __vunmap() and safe_copy_page() that may
> > cause access to unmapped memory:
> > 
> > __vunmap()
> >     vm_remove_mappings()
> >         set_direct_map_invalid()
> > 					safe_copy_page()	
> > 					    __kernel_map_pages()
> > 					    	return
> > 					    do_copy_page() -> fault
> > 					   	
> > This is a theoretical bug, but it is still not nice :) 							
> 
> Just to clarify: this patch series fixes this problem, right?

Yes.

> Will

-- 
Sincerely yours,
Mike.
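
The failure mode described in the quoted text (a large-page split needing a
fresh page table) can be illustrated with something like the sketch below.
The names are purely illustrative and do not correspond to the real x86 CPA
internals; the point is only that the allocation in the middle can fail while
a void __kernel_map_pages() has no way to report it:

	/*
	 * Illustrative sketch only (made-up names, not the actual CPA code).
	 * Changing the protection of one 4K page that currently lives inside
	 * a 2M direct-map mapping means splitting the PMD into 512 PTEs, and
	 * the page backing that new page table has to be allocated first.
	 */
	static int split_pmd_for_4k_change(pmd_t *pmd, unsigned long addr)
	{
		pte_t *ptes = (pte_t *)get_zeroed_page(GFP_KERNEL);

		if (!ptes)
			return -ENOMEM;	/* a void caller silently loses this */

		/*
		 * Fill in 512 PTEs mirroring the old 2M mapping, then replace
		 * the PMD entry with a pointer to the new page table.
		 */
		return 0;
	}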

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 11:17               ` David Hildenbrand
                                   ` (3 preceding siblings ...)
  (?)
@ 2020-10-28 12:22                 ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 12:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On Wed, Oct 28, 2020 at 12:17:35PM +0100, David Hildenbrand wrote:
> On 28.10.20 12:09, Mike Rapoport wrote:
> > On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > > On 27.10.20 09:38, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > 
> > > > > Beyond whatever you are seeing, for the latter case of new things
> > > > > getting introduced to an interface with hidden dependencies... Another
> > > > > edge case could be that a new caller of set_memory_np() results in
> > > > > large NP pages. None of the callers today should cause this AFAICT, but
> > > > > it's not great to rely on the callers to know these details.
> > 
> > > > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > > > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > > > step in that direction :)
> > > > 
> > > 
> > > I am probably missing something important, but why are we saving/restoring
> > > the content of pages that were explicitly removed from the identity mapping
> > > such that nobody will access them?
> > 
> > Actually, we should not be saving/restoring free pages during
> > hibernation as there are several calls to mark_free_pages() that should
> > exclude the free pages from the snapshot. I've tried to find why the fix
> > that maps/unmaps a page to save it was required in the first place, but
> > I could not find bug reports.
> > 
> > The closest I've got is an email from Rafael that asked to update
> > "hibernate: handle DEBUG_PAGEALLOC" patch:
> > 
> > https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> > 
> > Could it be that safe_copy_page() tries to work around a non-existent
> > problem?
> > 
> 
> Could be! Also see
> 
> https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz
> 
> which restores free page content based on more kernel parameters, not based
> on the original content.

Ah, after looking at it I've now run a kernel with DEBUG_PAGEALLOC=y and
CONFIG_INIT_ON_FREE_DEFAULT_ON=y, and the restore crashed nicely.

[   27.210093] PM: Image successfully loaded
[   27.226709] Disabling non-boot CPUs ...                                      
[   27.231208] smpboot: CPU 1 is now offline                                    
[   27.363926] kvm-clock: cpu 0, msr 5c889001, primary cpu clock, resume        
[   27.363995] BUG: unable to handle page fault for address: ffff9f7a40108000   
[   27.367996] #PF: supervisor write access in kernel mode                      
[   27.369558] #PF: error_code(0x0002) - not-present page                       
[   27.371098] PGD 5ca01067 P4D 5ca01067 PUD 5ca02067 PMD 5ca03067 PTE 800ffffff
fef7060                                                                         
[   27.373421] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI                          
[   27.374905] CPU: 0 PID: 1200 Comm: bash Not tainted 5.10.0-rc1 #5            
[   27.376700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14
.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014                                 
[   27.379879] RIP: 0010:clear_page_rep+0x7/0x10          
[   27.381218] Code: e8 be 88 75 00 44 89 e2 48 89 ee 48 89 df e8 60 ff ff ff c6
 03 00 5b 5d 41 5c c3 cc cc cc cc cc cc cc cc b9 00 02 00 00 31 c0 <f3> 48 ab c3
 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00                          
[   27.386457] RSP: 0018:ffffb6838046be08 EFLAGS: 00010046                      
[   27.388011] RAX: 0000000000000000 RBX: ffff9f7a487c0ec0 RCX: 0000000000000200
[   27.390082] RDX: ffff9f7a4c788000 RSI: 0000000000000000 RDI: ffff9f7a40108000
[   27.392138] RBP: ffffffff8629c860 R08: 0000000000000000 R09: 0000000000000007
[   27.394205] R10: 0000000000000004 R11: ffffb6838046bbf8 R12: 0000000000000000
[   27.396271] R13: ffff9f7a419a62a0 R14: 0000000000000005 R15: ffff9f7a484f4da0
[   27.398334] FS:  00007fe0c3f6a700(0000) GS:ffff9f7abf800000(0000) knlGS:0000000000000000                                                                     
[   27.400717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                
[   27.402432] CR2: ffff9f7a40108000 CR3: 000000000859a001 CR4: 0000000000060ef0
[   27.404485] Call Trace:                                                      
[   27.405326]  clear_free_pages+0xf5/0x150                                     
[   27.406568]  hibernation_snapshot+0x390/0x3d0                                
[   27.407908]  hibernate+0xdb/0x240                                            
[   27.408978]  state_store+0xd7/0xe0                                           
[   27.410078]  kernfs_fop_write+0x10e/0x1a0                                    
[   27.411333]  vfs_write+0xbb/0x210                                            
[   27.412423]  ksys_write+0x9c/0xd0                      
[   27.413488]  do_syscall_64+0x33/0x40                                         
[   27.414636]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                        
[   27.416150] RIP: 0033:0x7fe0c364e380                                         
 66 0f 1f 44 00 00 83 3d c9 23 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0
 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[   27.422500] RSP: 002b:00007ffeb64bd0c8 EFLAGS: 00000246 ORIG_RAX: 00000000000
00001
[   27.424724] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fe0c364e380
[   27.426761] RDX: 0000000000000005 RSI: 0000000001eb6408 RDI: 0000000000000001
[   27.428791] RBP: 0000000001eb6408 R08: 00007fe0c391d780 R09: 00007fe0c3f6a700
[   27.430863] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[   27.432920] R13: 0000000000000001 R14: 00007fe0c391c620 R15: 0000000000000000
[   27.434989] Modules linked in:
[   27.436004] CR2: ffff9f7a40108000
[   27.437075] ---[ end trace 424c466bcd2bfcad ]---



> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.
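
The trace above is consistent with the clear_free_pages() path: with
init_on_free enabled the snapshot code walks the free-pages bitmap and zeroes
every free page, but under DEBUG_PAGEALLOC those pages are not present in the
direct map, so the very first write faults. A simplified sketch, modelled on
kernel/power/snapshot.c (details may differ):

	static void clear_free_pages(void)
	{
		struct memory_bitmap *bm = free_pages_map;
		unsigned long pfn;

		if (!want_init_on_free())
			return;

		memory_bm_position_reset(bm);
		for (pfn = memory_bm_next_pfn(bm); pfn != BM_END_OF_MAP;
		     pfn = memory_bm_next_pfn(bm)) {
			if (pfn_valid(pfn))
				/* page is unmapped under DEBUG_PAGEALLOC: faults */
				clear_highpage(pfn_to_page(pfn));
		}
		memory_bm_position_reset(bm);
	}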

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-28 12:22                 ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 12:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Edgecombe, Rick P, cl, gor, hpa, peterz, catalin.marinas,
	dave.hansen, borntraeger, penberg, linux-mm, iamjoonsoo.kim,
	will, aou, kirill, rientjes, rppt, paulus, hca, bp, pavel,
	sparclinux, akpm, luto, davem, mpe, benh, linuxppc-dev, rjw,
	tglx, linux-riscv, x86, linux-pm, linux-arm-kernel, palmer,
	Brown, Len, mingo, linux-s390, linux-kernel, paul.walmsley

On Wed, Oct 28, 2020 at 12:17:35PM +0100, David Hildenbrand wrote:
> On 28.10.20 12:09, Mike Rapoport wrote:
> > On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > > On 27.10.20 09:38, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > 
> > > > > Beyond whatever you are seeing, for the latter case of new things
> > > > > getting introduced to an interface with hidden dependencies... Another
> > > > > edge case could be a new caller to set_memory_np() could result in
> > > > > large NP pages. None of the callers today should cause this AFAICT, but
> > > > > it's not great to rely on the callers to know these details.
> > 
> > > > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > > > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > > > step in that direction :)
> > > > 
> > > 
> > > I am probably missing something important, but why are we saving/restoring
> > > the content of pages that were explicitly removed from the identity mapping
> > > such that nobody will access them?
> > 
> > Actually, we should not be saving/restoring free pages during
> > hibernation as there are several calls to mark_free_pages() that should
> > exclude the free pages from the snapshot. I've tried to find why the fix
> > that maps/unmaps a page to save it was required at the first place, but
> > I could not find bug reports.
> > 
> > The closest I've got is an email from Rafael that asked to update
> > "hibernate: handle DEBUG_PAGEALLOC" patch:
> > 
> > https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> > 
> > Could it be that safe_copy_page() tries to workaround a non-existent
> > problem?
> > 
> 
> Clould be! Also see
> 
> https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz
> 
> which restores free page content based on more kernel parameters, not based
> on the original content.

Ah, after looking at it now I've run kernel with DEBUG_PAGEALLOC=y and
CONFIG_INIT_ON_FREE_DEFAULT_ON=y and restore crahsed nicely.

[   27.210093] PM: Image successfully loaded
[   27.226709] Disabling non-boot CPUs ...                                      
[   27.231208] smpboot: CPU 1 is now offline                                    
[   27.363926] kvm-clock: cpu 0, msr 5c889001, primary cpu clock, resume        
[   27.363995] BUG: unable to handle page fault for address: ffff9f7a40108000   
[   27.367996] #PF: supervisor write access in kernel mode                      
[   27.369558] #PF: error_code(0x0002) - not-present page                       
[   27.371098] PGD 5ca01067 P4D 5ca01067 PUD 5ca02067 PMD 5ca03067 PTE 800ffffff
fef7060                                                                         
[   27.373421] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI                          
[   27.374905] CPU: 0 PID: 1200 Comm: bash Not tainted 5.10.0-rc1 #5            
[   27.376700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14
.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014                                 
[   27.379879] RIP: 0010:clear_page_rep+0x7/0x10          
[   27.381218] Code: e8 be 88 75 00 44 89 e2 48 89 ee 48 89 df e8 60 ff ff ff c6
 03 00 5b 5d 41 5c c3 cc cc cc cc cc cc cc cc b9 00 02 00 00 31 c0 <f3> 48 ab c3
 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00                          
[   27.386457] RSP: 0018:ffffb6838046be08 EFLAGS: 00010046                      
[   27.388011] RAX: 0000000000000000 RBX: ffff9f7a487c0ec0 RCX: 0000000000000200
[   27.390082] RDX: ffff9f7a4c788000 RSI: 0000000000000000 RDI: ffff9f7a40108000
[   27.392138] RBP: ffffffff8629c860 R08: 0000000000000000 R09: 0000000000000007
[   27.394205] R10: 0000000000000004 R11: ffffb6838046bbf8 R12: 0000000000000000
[   27.396271] R13: ffff9f7a419a62a0 R14: 0000000000000005 R15: ffff9f7a484f4da0
[   27.398334] FS:  00007fe0c3f6a700(0000) GS:ffff9f7abf800000(0000) knlGS:0000000000000000                                                                     
[   27.400717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                
[   27.402432] CR2: ffff9f7a40108000 CR3: 000000000859a001 CR4: 0000000000060ef0
[   27.404485] Call Trace:                                                      
[   27.405326]  clear_free_pages+0xf5/0x150                                     
[   27.406568]  hibernation_snapshot+0x390/0x3d0                                
[   27.407908]  hibernate+0xdb/0x240                                            
[   27.408978]  state_store+0xd7/0xe0                                           
[   27.410078]  kernfs_fop_write+0x10e/0x1a0                                    
[   27.411333]  vfs_write+0xbb/0x210                                            
[   27.412423]  ksys_write+0x9c/0xd0                      
[   27.413488]  do_syscall_64+0x33/0x40                                         
[   27.414636]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                        
[   27.416150] RIP: 0033:0x7fe0c364e380                                         
 66 0f 1f 44 00 00 83 3d c9 23 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0
 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[   27.422500] RSP: 002b:00007ffeb64bd0c8 EFLAGS: 00000246 ORIG_RAX: 00000000000
00001
[   27.424724] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fe0c364e380
[   27.426761] RDX: 0000000000000005 RSI: 0000000001eb6408 RDI: 0000000000000001
[   27.428791] RBP: 0000000001eb6408 R08: 00007fe0c391d780 R09: 00007fe0c3f6a700
[   27.430863] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[   27.432920] R13: 0000000000000001 R14: 00007fe0c391c620 R15: 0000000000000000
[   27.434989] Modules linked in:
[   27.436004] CR2: ffff9f7a40108000
[   27.437075] ---[ end trace 424c466bcd2bfcad ]---



> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-28 12:22                 ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 12:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: benh, peterz, catalin.marinas, dave.hansen, linux-mm, paulus,
	pavel, hpa, sparclinux, cl, will, linux-riscv, linux-s390, mpe,
	x86, rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, Edgecombe, Rick P, linuxppc-dev, davem

On Wed, Oct 28, 2020 at 12:17:35PM +0100, David Hildenbrand wrote:
> On 28.10.20 12:09, Mike Rapoport wrote:
> > On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > > On 27.10.20 09:38, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > 
> > > > > Beyond whatever you are seeing, for the latter case of new things
> > > > > getting introduced to an interface with hidden dependencies... Another
> > > > > edge case could be a new caller to set_memory_np() could result in
> > > > > large NP pages. None of the callers today should cause this AFAICT, but
> > > > > it's not great to rely on the callers to know these details.
> > 
> > > > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > > > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > > > step in that direction :)
> > > > 
> > > 
> > > I am probably missing something important, but why are we saving/restoring
> > > the content of pages that were explicitly removed from the identity mapping
> > > such that nobody will access them?
> > 
> > Actually, we should not be saving/restoring free pages during
> > hibernation as there are several calls to mark_free_pages() that should
> > exclude the free pages from the snapshot. I've tried to find why the fix
> > that maps/unmaps a page to save it was required at the first place, but
> > I could not find bug reports.
> > 
> > The closest I've got is an email from Rafael that asked to update
> > "hibernate: handle DEBUG_PAGEALLOC" patch:
> > 
> > https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> > 
> > Could it be that safe_copy_page() tries to workaround a non-existent
> > problem?
> > 
> 
> Clould be! Also see
> 
> https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz
> 
> which restores free page content based on more kernel parameters, not based
> on the original content.

Ah, after looking at it now I've run kernel with DEBUG_PAGEALLOC=y and
CONFIG_INIT_ON_FREE_DEFAULT_ON=y and restore crahsed nicely.

[   27.210093] PM: Image successfully loaded
[   27.226709] Disabling non-boot CPUs ...                                      
[   27.231208] smpboot: CPU 1 is now offline                                    
[   27.363926] kvm-clock: cpu 0, msr 5c889001, primary cpu clock, resume        
[   27.363995] BUG: unable to handle page fault for address: ffff9f7a40108000   
[   27.367996] #PF: supervisor write access in kernel mode                      
[   27.369558] #PF: error_code(0x0002) - not-present page                       
[   27.371098] PGD 5ca01067 P4D 5ca01067 PUD 5ca02067 PMD 5ca03067 PTE 800ffffff
fef7060                                                                         
[   27.373421] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI                          
[   27.374905] CPU: 0 PID: 1200 Comm: bash Not tainted 5.10.0-rc1 #5            
[   27.376700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14
.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014                                 
[   27.379879] RIP: 0010:clear_page_rep+0x7/0x10          
[   27.381218] Code: e8 be 88 75 00 44 89 e2 48 89 ee 48 89 df e8 60 ff ff ff c6
 03 00 5b 5d 41 5c c3 cc cc cc cc cc cc cc cc b9 00 02 00 00 31 c0 <f3> 48 ab c3
 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00                          
[   27.386457] RSP: 0018:ffffb6838046be08 EFLAGS: 00010046                      
[   27.388011] RAX: 0000000000000000 RBX: ffff9f7a487c0ec0 RCX: 0000000000000200
[   27.390082] RDX: ffff9f7a4c788000 RSI: 0000000000000000 RDI: ffff9f7a40108000
[   27.392138] RBP: ffffffff8629c860 R08: 0000000000000000 R09: 0000000000000007
[   27.394205] R10: 0000000000000004 R11: ffffb6838046bbf8 R12: 0000000000000000
[   27.396271] R13: ffff9f7a419a62a0 R14: 0000000000000005 R15: ffff9f7a484f4da0
[   27.398334] FS:  00007fe0c3f6a700(0000) GS:ffff9f7abf800000(0000) knlGS:0000000000000000                                                                     
[   27.400717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                
[   27.402432] CR2: ffff9f7a40108000 CR3: 000000000859a001 CR4: 0000000000060ef0
[   27.404485] Call Trace:                                                      
[   27.405326]  clear_free_pages+0xf5/0x150                                     
[   27.406568]  hibernation_snapshot+0x390/0x3d0                                
[   27.407908]  hibernate+0xdb/0x240                                            
[   27.408978]  state_store+0xd7/0xe0                                           
[   27.410078]  kernfs_fop_write+0x10e/0x1a0                                    
[   27.411333]  vfs_write+0xbb/0x210                                            
[   27.412423]  ksys_write+0x9c/0xd0                      
[   27.413488]  do_syscall_64+0x33/0x40                                         
[   27.414636]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                        
[   27.416150] RIP: 0033:0x7fe0c364e380                                         
 66 0f 1f 44 00 00 83 3d c9 23 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0
 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[   27.422500] RSP: 002b:00007ffeb64bd0c8 EFLAGS: 00000246 ORIG_RAX: 00000000000
00001
[   27.424724] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fe0c364e380
[   27.426761] RDX: 0000000000000005 RSI: 0000000001eb6408 RDI: 0000000000000001
[   27.428791] RBP: 0000000001eb6408 R08: 00007fe0c391d780 R09: 00007fe0c3f6a700
[   27.430863] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[   27.432920] R13: 0000000000000001 R14: 00007fe0c391c620 R15: 0000000000000000
[   27.434989] Modules linked in:
[   27.436004] CR2: ffff9f7a40108000
[   27.437075] ---[ end trace 424c466bcd2bfcad ]---



> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-28 12:22                 ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 12:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: benh, peterz, catalin.marinas, dave.hansen, linux-mm, paulus,
	pavel, hpa, sparclinux, cl, will, linux-riscv, linux-s390, mpe,
	x86, rppt, borntraeger, mingo, rientjes, Brown, Len, aou, gor,
	linux-pm, hca, bp, luto, paul.walmsley, kirill, tglx,
	iamjoonsoo.kim, linux-arm-kernel, rjw, linux-kernel, penberg,
	palmer, akpm, Edgecombe, Rick P, linuxppc-dev, davem

On Wed, Oct 28, 2020 at 12:17:35PM +0100, David Hildenbrand wrote:
> On 28.10.20 12:09, Mike Rapoport wrote:
> > On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > > On 27.10.20 09:38, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > 
> > > > > Beyond whatever you are seeing, for the latter case of new things
> > > > > getting introduced to an interface with hidden dependencies... Another
> > > > > edge case could be a new caller to set_memory_np() could result in
> > > > > large NP pages. None of the callers today should cause this AFAICT, but
> > > > > it's not great to rely on the callers to know these details.
> > 
> > > > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > > > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > > > step in that direction :)
> > > > 
> > > 
> > > I am probably missing something important, but why are we saving/restoring
> > > the content of pages that were explicitly removed from the identity mapping
> > > such that nobody will access them?
> > 
> > Actually, we should not be saving/restoring free pages during
> > hibernation as there are several calls to mark_free_pages() that should
> > exclude the free pages from the snapshot. I've tried to find why the fix
> > that maps/unmaps a page to save it was required at the first place, but
> > I could not find bug reports.
> > 
> > The closest I've got is an email from Rafael that asked to update
> > "hibernate: handle DEBUG_PAGEALLOC" patch:
> > 
> > https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> > 
> > Could it be that safe_copy_page() tries to workaround a non-existent
> > problem?
> > 
> 
> Clould be! Also see
> 
> https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz
> 
> which restores free page content based on more kernel parameters, not based
> on the original content.

Ah, after looking at it now I've run kernel with DEBUG_PAGEALLOC=y and
CONFIG_INIT_ON_FREE_DEFAULT_ON=y and restore crahsed nicely.

[   27.210093] PM: Image successfully loaded
[   27.226709] Disabling non-boot CPUs ...                                      
[   27.231208] smpboot: CPU 1 is now offline                                    
[   27.363926] kvm-clock: cpu 0, msr 5c889001, primary cpu clock, resume        
[   27.363995] BUG: unable to handle page fault for address: ffff9f7a40108000   
[   27.367996] #PF: supervisor write access in kernel mode                      
[   27.369558] #PF: error_code(0x0002) - not-present page                       
[   27.371098] PGD 5ca01067 P4D 5ca01067 PUD 5ca02067 PMD 5ca03067 PTE 800ffffff
fef7060                                                                         
[   27.373421] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI                          
[   27.374905] CPU: 0 PID: 1200 Comm: bash Not tainted 5.10.0-rc1 #5            
[   27.376700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14
.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014                                 
[   27.379879] RIP: 0010:clear_page_rep+0x7/0x10          
[   27.381218] Code: e8 be 88 75 00 44 89 e2 48 89 ee 48 89 df e8 60 ff ff ff c6
 03 00 5b 5d 41 5c c3 cc cc cc cc cc cc cc cc b9 00 02 00 00 31 c0 <f3> 48 ab c3
 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00                          
[   27.386457] RSP: 0018:ffffb6838046be08 EFLAGS: 00010046                      
[   27.388011] RAX: 0000000000000000 RBX: ffff9f7a487c0ec0 RCX: 0000000000000200
[   27.390082] RDX: ffff9f7a4c788000 RSI: 0000000000000000 RDI: ffff9f7a40108000
[   27.392138] RBP: ffffffff8629c860 R08: 0000000000000000 R09: 0000000000000007
[   27.394205] R10: 0000000000000004 R11: ffffb6838046bbf8 R12: 0000000000000000
[   27.396271] R13: ffff9f7a419a62a0 R14: 0000000000000005 R15: ffff9f7a484f4da0
[   27.398334] FS:  00007fe0c3f6a700(0000) GS:ffff9f7abf800000(0000) knlGS:0000000000000000                                                                     
[   27.400717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                
[   27.402432] CR2: ffff9f7a40108000 CR3: 000000000859a001 CR4: 0000000000060ef0
[   27.404485] Call Trace:                                                      
[   27.405326]  clear_free_pages+0xf5/0x150                                     
[   27.406568]  hibernation_snapshot+0x390/0x3d0                                
[   27.407908]  hibernate+0xdb/0x240                                            
[   27.408978]  state_store+0xd7/0xe0                                           
[   27.410078]  kernfs_fop_write+0x10e/0x1a0                                    
[   27.411333]  vfs_write+0xbb/0x210                                            
[   27.412423]  ksys_write+0x9c/0xd0                      
[   27.413488]  do_syscall_64+0x33/0x40                                         
[   27.414636]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                        
[   27.416150] RIP: 0033:0x7fe0c364e380                                         
 66 0f 1f 44 00 00 83 3d c9 23 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0
 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[   27.422500] RSP: 002b:00007ffeb64bd0c8 EFLAGS: 00000246 ORIG_RAX: 00000000000
00001
[   27.424724] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fe0c364e380
[   27.426761] RDX: 0000000000000005 RSI: 0000000001eb6408 RDI: 0000000000000001
[   27.428791] RBP: 0000000001eb6408 R08: 00007fe0c391d780 R09: 00007fe0c3f6a700
[   27.430863] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[   27.432920] R13: 0000000000000001 R14: 00007fe0c391c620 R15: 0000000000000000
[   27.434989] Modules linked in:
[   27.436004] CR2: ffff9f7a40108000
[   27.437075] ---[ end trace 424c466bcd2bfcad ]---



> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
@ 2020-10-28 12:22                 ` Mike Rapoport
  0 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-28 12:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: peterz, catalin.marinas, dave.hansen, linux-mm, paulus, pavel,
	hpa, sparclinux, cl, will, linux-riscv, linux-s390, x86, rppt,
	borntraeger, mingo, rientjes, Brown, Len, aou, gor, linux-pm,
	hca, bp, luto, paul.walmsley, kirill, tglx, iamjoonsoo.kim,
	linux-arm-kernel, rjw, linux-kernel, penberg, palmer, akpm,
	Edgecombe, Rick P, linuxppc-dev, davem

On Wed, Oct 28, 2020 at 12:17:35PM +0100, David Hildenbrand wrote:
> On 28.10.20 12:09, Mike Rapoport wrote:
> > On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > > On 27.10.20 09:38, Mike Rapoport wrote:
> > > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P wrote:
> > > > 
> > > > > Beyond whatever you are seeing, for the latter case of new things
> > > > > getting introduced to an interface with hidden dependencies... Another
> > > > > edge case could be a new caller to set_memory_np() could result in
> > > > > large NP pages. None of the callers today should cause this AFAICT, but
> > > > > it's not great to rely on the callers to know these details.
> > 
> > > > A caller of set_memory_*() or set_direct_map_*() should expect a failure
> > > > and be ready for that. So adding a WARN to safe_copy_page() is the first
> > > > step in that direction :)
> > > > 
> > > 
> > > I am probably missing something important, but why are we saving/restoring
> > > the content of pages that were explicitly removed from the identity mapping
> > > such that nobody will access them?
> > 
> > Actually, we should not be saving/restoring free pages during
> > hibernation as there are several calls to mark_free_pages() that should
> > exclude the free pages from the snapshot. I've tried to find why the fix
> > that maps/unmaps a page to save it was required at the first place, but
> > I could not find bug reports.
> > 
> > The closest I've got is an email from Rafael that asked to update
> > "hibernate: handle DEBUG_PAGEALLOC" patch:
> > 
> > https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> > 
> > Could it be that safe_copy_page() tries to workaround a non-existent
> > problem?
> > 
> 
> Clould be! Also see
> 
> https://lkml.kernel.org/r/38de5bb0-5559-d069-0ce0-daec66ef2746@suse.cz
> 
> which restores free page content based on more kernel parameters, not based
> on the original content.

Ah, after looking at it now I've run kernel with DEBUG_PAGEALLOC=y and
CONFIG_INIT_ON_FREE_DEFAULT_ON=y and restore crahsed nicely.

[   27.210093] PM: Image successfully loaded
[   27.226709] Disabling non-boot CPUs ...                                      
[   27.231208] smpboot: CPU 1 is now offline                                    
[   27.363926] kvm-clock: cpu 0, msr 5c889001, primary cpu clock, resume        
[   27.363995] BUG: unable to handle page fault for address: ffff9f7a40108000   
[   27.367996] #PF: supervisor write access in kernel mode                      
[   27.369558] #PF: error_code(0x0002) - not-present page                       
[   27.371098] PGD 5ca01067 P4D 5ca01067 PUD 5ca02067 PMD 5ca03067 PTE 800ffffff
fef7060                                                                         
[   27.373421] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC PTI                          
[   27.374905] CPU: 0 PID: 1200 Comm: bash Not tainted 5.10.0-rc1 #5            
[   27.376700] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14
.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014                                 
[   27.379879] RIP: 0010:clear_page_rep+0x7/0x10          
[   27.381218] Code: e8 be 88 75 00 44 89 e2 48 89 ee 48 89 df e8 60 ff ff ff c6
 03 00 5b 5d 41 5c c3 cc cc cc cc cc cc cc cc b9 00 02 00 00 31 c0 <f3> 48 ab c3
 0f 1f 44 00 00 31 c0 b9 40 00 00 00 66 0f 1f 84 00 00                          
[   27.386457] RSP: 0018:ffffb6838046be08 EFLAGS: 00010046                      
[   27.388011] RAX: 0000000000000000 RBX: ffff9f7a487c0ec0 RCX: 0000000000000200
[   27.390082] RDX: ffff9f7a4c788000 RSI: 0000000000000000 RDI: ffff9f7a40108000
[   27.392138] RBP: ffffffff8629c860 R08: 0000000000000000 R09: 0000000000000007
[   27.394205] R10: 0000000000000004 R11: ffffb6838046bbf8 R12: 0000000000000000
[   27.396271] R13: ffff9f7a419a62a0 R14: 0000000000000005 R15: ffff9f7a484f4da0
[   27.398334] FS:  00007fe0c3f6a700(0000) GS:ffff9f7abf800000(0000) knlGS:0000000000000000                                                                     
[   27.400717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033                
[   27.402432] CR2: ffff9f7a40108000 CR3: 000000000859a001 CR4: 0000000000060ef0
[   27.404485] Call Trace:                                                      
[   27.405326]  clear_free_pages+0xf5/0x150                                     
[   27.406568]  hibernation_snapshot+0x390/0x3d0                                
[   27.407908]  hibernate+0xdb/0x240                                            
[   27.408978]  state_store+0xd7/0xe0                                           
[   27.410078]  kernfs_fop_write+0x10e/0x1a0                                    
[   27.411333]  vfs_write+0xbb/0x210                                            
[   27.412423]  ksys_write+0x9c/0xd0                      
[   27.413488]  do_syscall_64+0x33/0x40                                         
[   27.414636]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                        
[   27.416150] RIP: 0033:0x7fe0c364e380                                         
 66 0f 1f 44 00 00 83 3d c9 23 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[   27.422500] RSP: 002b:00007ffeb64bd0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   27.424724] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fe0c364e380
[   27.426761] RDX: 0000000000000005 RSI: 0000000001eb6408 RDI: 0000000000000001
[   27.428791] RBP: 0000000001eb6408 R08: 00007fe0c391d780 R09: 00007fe0c3f6a700
[   27.430863] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000005
[   27.432920] R13: 0000000000000001 R14: 00007fe0c391c620 R15: 0000000000000000
[   27.434989] Modules linked in:
[   27.436004] CR2: ffff9f7a40108000
[   27.437075] ---[ end trace 424c466bcd2bfcad ]---
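
For reference, a rough sketch of the clear_free_pages() path that trips here
(simplified and from memory, so the guard conditions and helper names are
approximate rather than a verbatim copy of kernel/power/snapshot.c):

/*
 * With init_on_free (or zero poisoning) enabled, the snapshot code walks
 * free_pages_map and zeroes every free page.  Under DEBUG_PAGEALLOC those
 * pages may already be unmapped from the direct map, so clear_highpage()
 * writes through a not-present PTE, which is the fault in clear_page_rep
 * above.
 */
static void clear_free_pages(void)
{
	struct memory_bitmap *bm = free_pages_map;
	unsigned long pfn;

	if (WARN_ON(!bm))
		return;

	if (!want_init_on_free() && !page_poisoning_enabled())
		return;

	memory_bm_position_reset(bm);
	pfn = memory_bm_next_pfn(bm);
	while (pfn != BM_END_OF_MAP) {
		if (pfn_valid(pfn))
			clear_highpage(pfn_to_page(pfn));	/* faults if unmapped */
		pfn = memory_bm_next_pfn(bm);
	}
	memory_bm_position_reset(bm);
}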



> -- 
> Thanks,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 11:09             ` Mike Rapoport
                                 ` (3 preceding siblings ...)
  (?)
@ 2020-10-28 18:31               ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-28 18:31 UTC (permalink / raw)
  To: david, rppt
  Cc: rientjes, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rppt, paulus, hca, bp, pavel, sparclinux, akpm, luto,
	davem, mpe, benh, linuxppc-dev, linux-riscv, tglx, rjw, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Wed, 2020-10-28 at 13:09 +0200, Mike Rapoport wrote:
> On Tue, Oct 27, 2020 at 09:46:35AM +0100, David Hildenbrand wrote:
> > On 27.10.20 09:38, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P
> > > wrote:
> > > 
> > > > Beyond whatever you are seeing, for the latter case of new
> > > > things
> > > > getting introduced to an interface with hidden dependencies...
> > > > Another
> > > > edge case could be a new caller to set_memory_np() could result
> > > > in
> > > > large NP pages. None of the callers today should cause this
> > > > AFAICT, but
> > > > it's not great to rely on the callers to know these details.
> > > A caller of set_memory_*() or set_direct_map_*() should expect a
> > > failure
> > > and be ready for that. So adding a WARN to safe_copy_page() is
> > > the first
> > > step in that direction :)
> > > 
> > 
> > I am probably missing something important, but why are we
> > saving/restoring
> > the content of pages that were explicitly removed from the identity
> > mapping
> > such that nobody will access them?
> 
> Actually, we should not be saving/restoring free pages during
> hibernation as there are several calls to mark_free_pages() that
> should
> exclude the free pages from the snapshot. I've tried to find why the
> fix
> that maps/unmaps a page to save it was required in the first place,
> but
> I could not find bug reports.
> 
> The closest I've got is an email from Rafael that asked to update
> "hibernate: handle DEBUG_PAGEALLOC" patch:
> 
> https://lore.kernel.org/linux-pm/200802200133.44098.rjw@sisk.pl/
> 
> Could it be that safe_copy_page() tries to work around a non-existent
> problem?

It looks like inside page_alloc.c it unmaps the page before it actually
frees it, so to hibernate it could look like the page is still
allocated even though it's unmapped? Maybe that small window is what it
cared about initially.

There is also now the vmalloc case, which I am actually working on
expanding. So I think the re-mapping logic is needed.
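
To make that window concrete, a minimal sketch of the ordering being
described (helper names are approximate and the real free path in
mm/page_alloc.c has many more steps):

static void free_page_path_sketch(struct page *page, unsigned int order,
				  int migratetype)
{
	/*
	 * 1. With DEBUG_PAGEALLOC the page is first unmapped from the
	 *    direct map in the free-prepare step ...
	 */
	__kernel_map_pages(page, 1 << order, 0);

	/*
	 * 2. ... and only afterwards does it land on the zone free lists,
	 *    which is what mark_free_pages() scans when excluding free
	 *    pages from the hibernation image.  Between the two steps the
	 *    page still looks "allocated" to the snapshot code but is
	 *    already unmapped.
	 */
	free_one_page(page_zone(page), page, page_to_pfn(page), order,
		      migratetype);
}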

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 11:30             ` Mike Rapoport
                                 ` (3 preceding siblings ...)
  (?)
@ 2020-10-28 21:03               ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-28 21:03 UTC (permalink / raw)
  To: will, rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm, luto,
	davem, mpe, tglx, linuxppc-dev, linux-riscv, x86, rjw, linux-pm,
	benh, linux-arm-kernel, palmer, Brown, Len, mingo, linux-s390,
	linux-kernel, paul.walmsley

On Wed, 2020-10-28 at 13:30 +0200, Mike Rapoport wrote:
> On Wed, Oct 28, 2020 at 11:20:12AM +0000, Will Deacon wrote:
> > On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> > > On Mon, Oct 26, 2020 at 06:05:30PM +0000, Edgecombe, Rick P
> > > wrote:
> > > > On Mon, 2020-10-26 at 11:05 +0200, Mike Rapoport wrote:
> > > > > On Mon, Oct 26, 2020 at 01:13:52AM +0000, Edgecombe, Rick P
> > > > > wrote:
> > > > > > On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > > > > > > Indeed, for architectures that define
> > > > > > > CONFIG_ARCH_HAS_SET_DIRECT_MAP
> > > > > > > it is
> > > > > > > possible that __kernel_map_pages() would fail, but since
> > > > > > > this
> > > > > > > function is
> > > > > > > void, the failure will go unnoticed.
> > > > > > 
> > > > > > Could you elaborate on how this could happen? Do you mean
> > > > > > during
> > > > > > runtime today or if something new was introduced?
> > > > > 
> > > > > A failure in __kernel_map_pages() may happen today. For
> > > > > instance, on
> > > > > x86
> > > > > if the kernel is built with DEBUG_PAGEALLOC.
> > > > > 
> > > > >         __kernel_map_pages(page, 1, 0);
> > > > > 
> > > > > will need to split, say, 2M page and during the split an
> > > > > allocation
> > > > > of
> > > > > page table could fail.
> > > > 
> > > > On x86 at least, DEBUG_PAGEALLOC expects to never have to break
> > > > a page
> > > > on the direct map and even disables locking in cpa because it
> > > > assumes
> > > > this. If this is happening somehow anyway then we should
> > > > probably fix
> > > > that. Even if it's a debug feature, it will not be as useful if
> > > > it is
> > > > causing its own crashes.
> > > > 
> > > > I'm still wondering if there is something I'm missing here. It
> > > > seems
> > > > like you are saying there is a bug in some arch's, so let's add
> > > > a WARN
> > > > in cross-arch code to log it as it crashes. A warn and making
> > > > things
> > > > clearer seem like good ideas, but if there is a bug we should
> > > > fix it.
> > > > The code around the callers still functionally assume re-
> > > > mapping can't
> > > > fail.
> > > 
> > > Oh, I've meant x86 kernel *without* DEBUG_PAGEALLOC, and indeed
> > > the call
> > > that unmaps pages back in safe_copy_page will just reset a 4K
> > > page to
> > > NP because whatever made it NP in the first place already did the
> > > split.
> > > 
> > > Still, on arm64 with DEBUG_PAGEALLOC=n there is a possibility of
> > > a race
> > > between map/unmap dance in __vunmap() and safe_copy_page() that
> > > may
> > > cause access to unmapped memory:
> > > 
> > > __vunmap()
> > >     vm_remove_mappings()
> > >         set_direct_map_invalid()
> > > 					safe_copy_page()	
> > > 					    __kernel_map_pages()
> > > 					    	return
> > > 					    do_copy_page() -> fault
> > > 					   	
> > > This is a theoretical bug, but it is still not nice :) 		
> > > 					
> > 
> > Just to clarify: this patch series fixes this problem, right?
> 
> Yes.
> 

Well, now I'm confused again.

As David pointed out, __vunmap() should not be executing simultaneously
with the hibernate operation because hibernate can't snapshot while
data it needs to save is still updating. If a thread was paused when a
page was in an "invalid" state, it should be remapped by hibernate
before the copy.

To level set, before reading this mail, my takeaways from the
discussions on potential hibernate/debug page alloc problems were:

Potential RISC-V issue:
Doesn't have hibernate support

Potential ARM issue:
The logic around when its cpa determines pages might be unmapped looks
correct for current callers.

Potential x86 page break issue:
Seems to be ok for now, but a new set_memory_np() caller could violate
assumptions in hibernate.

Non-obvious thorny logic: 
General agreement it would be good to separate dependencies.

Behavior of V1 of this patchset:
No functional change other than addition of a warn in hibernate.


So "does this fix the problem", "yes" leaves me a bit confused... Not
saying there couldn't be any problems, especially due to the thorniness
and cross arch stride, but what is it exactly and how does this series
fix it?
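
For context, the copy path being discussed looks roughly like this today
(simplified from kernel/power/snapshot.c; the exact helpers differ between
kernel versions):

static void safe_copy_page(void *dst, struct page *s_page)
{
	if (kernel_page_present(s_page)) {
		do_copy_page(dst, page_address(s_page));
	} else {
		/*
		 * If the page was removed from the direct map, map it back
		 * for the copy and unmap it again afterwards.  A silent
		 * failure of the mapping call is what would turn the quoted
		 * race into a fault in do_copy_page().
		 */
		kernel_map_pages(s_page, 1, 1);
		do_copy_page(dst, page_address(s_page));
		kernel_map_pages(s_page, 1, 0);
	}
}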


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-25 10:15   ` Mike Rapoport
                       ` (3 preceding siblings ...)
  (?)
@ 2020-10-28 21:15     ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-28 21:15 UTC (permalink / raw)
  To: rppt, akpm
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, iamjoonsoo.kim, linux-mm, will, aou,
	kirill, rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe,
	luto, davem, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> +       if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> +               unsigned long addr = (unsigned long)page_address(page);
> +               int ret;
> +
> +               if (enable)
> +                       ret = set_direct_map_default_noflush(page);
> +               else
> +                       ret = set_direct_map_invalid_noflush(page);
> +
> +               if (WARN_ON(ret))
> +                       return;
> +
> +               flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +       } else {
> +               debug_pagealloc_map_pages(page, 1, enable);
> +       }

Looking at the arm side again, I think this might actually introduce a
regression for the arm/hibernate/DEBUG_PAGEALLOC combo.

Unlike __kernel_map_pages(), it looks like arm's cpa will always bail
in the set_direct_map_() functions if rodata_full is false. So if
rodata_full was disabled but debug page alloc is on, then this would
now skip remapping the pages. I guess the significance depends on
whether hibernate could actually try to save any DEBUG_PAGEALLOC
unmapped pages. Looks like it to me though.
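
For reference, the arm64 set_direct_map_() side looks roughly like this
(a sketch from memory of arch/arm64/mm/pageattr.c around v5.9, not a
verbatim quote; the exact prot masks and helper names may differ):

int set_direct_map_invalid_noflush(struct page *page)
{
	struct page_change_data data = {
		.set_mask = __pgprot(0),
		.clear_mask = __pgprot(PTE_VALID),
	};

	/*
	 * Bails whenever rodata_full is false, regardless of whether
	 * debug_pagealloc is enabled.
	 */
	if (!rodata_full)
		return 0;

	return apply_to_page_range(&init_mm,
				   (unsigned long)page_address(page),
				   PAGE_SIZE, change_page_range, &data);
}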


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-28 21:15     ` Edgecombe, Rick P
                         ` (3 preceding siblings ...)
  (?)
@ 2020-10-29  7:54       ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-29  7:54 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: akpm, david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, iamjoonsoo.kim, linux-mm, will, aou,
	kirill, rientjes, rppt, paulus, hca, pavel, bp, sparclinux, mpe,
	luto, davem, tglx, rjw, linux-riscv, benh, linuxppc-dev, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Wed, Oct 28, 2020 at 09:15:38PM +0000, Edgecombe, Rick P wrote:
> On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> > +       if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> > +               unsigned long addr = (unsigned long)page_address(page);
> > +               int ret;
> > +
> > +               if (enable)
> > +                       ret = set_direct_map_default_noflush(page);
> > +               else
> > +                       ret = set_direct_map_invalid_noflush(page);
> > +
> > +               if (WARN_ON(ret))
> > +                       return;
> > +
> > +               flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> > +       } else {
> > +               debug_pagealloc_map_pages(page, 1, enable);
> > +       }
> 
> Looking at the arm side again, I think this might actually introduce a
> regression for the arm/hibernate/DEBUG_PAGEALLOC combo.
> 
> Unlike __kernel_map_pages(), it looks like arm's cpa will always bail
> in the set_direct_map_() functions if rodata_full is false.
>
> So if rodata_full was disabled but debug page alloc is on, then this
> would now skip remapping the pages. I guess the significance depends
> on whether hibernate could actually try to save any DEBUG_PAGEALLOC
> unmapped pages. Looks like it to me though.
 
__kernel_map_pages() on arm64 will also bail out if rodata_full is
false:

void __kernel_map_pages(struct page *page, int numpages, int enable)
{
	if (!debug_pagealloc_enabled() && !rodata_full)
		return;

	set_memory_valid((unsigned long)page_address(page), numpages, enable);
}

So using set_direct_map() to map back pages removed from the direct map
with __kernel_map_pages() seems safe to me.

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-28 21:03               ` Edgecombe, Rick P
                                   ` (3 preceding siblings ...)
  (?)
@ 2020-10-29  8:12                 ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-10-29  8:12 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: will, david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm, luto,
	davem, mpe, tglx, linuxppc-dev, linux-riscv, x86, rjw, linux-pm,
	benh, linux-arm-kernel, palmer, Brown, Len, mingo, linux-s390,
	linux-kernel, paul.walmsley

On Wed, Oct 28, 2020 at 09:03:31PM +0000, Edgecombe, Rick P wrote:

> > On Wed, Oct 28, 2020 at 11:20:12AM +0000, Will Deacon wrote:
> > > On Tue, Oct 27, 2020 at 10:38:16AM +0200, Mike Rapoport wrote:
> > > > 					   	
> > > > This is a theoretical bug, but it is still not nice :) 		
> > > > 					
> > > 
> > > Just to clarify: this patch series fixes this problem, right?
> > 
> > Yes.
> > 
> 
> Well, now I'm confused again.
> 
> As David pointed, __vunmap() should not be executing simultaneously
> with the hibernate operation because hibernate can't snapshot while
> data it needs to save is still updating. If a thread was paused when a
> page was in an "invalid" state, it should be remapped by hibernate
> before the copy.
> 
> To level set, before reading this mail, my takeaways from the
> discussions on potential hibernate/debug page alloc problems were:
> 
> Potential RISC-V issue:
> Doesn't have hibernate support
> 
> Potential ARM issue:
> The logic around when its cpa code determines pages might be unmapped
> looks correct for current callers.
> 
> Potential x86 page break issue:
> Seems to be ok for now, but a new set_memory_np() caller could violate
> assumptions in hibernate.
> 
> Non-obvious thorny logic: 
> General agreement it would be good to separate dependencies.
> 
> Behavior of V1 of this patchset:
> No functional change other than addition of a warn in hibernate.

There is a change that adds explicit use of set_direct_map() to
hibernate. Currently, on arm64 with DEBUG_PAGEALLOC=n, if a thread was
paused while a page was in an "invalid" state, hibernate will access
unmapped data because __kernel_map_pages() will bail out. After the
change, set_direct_map_default_noflush() will be used and the page will
get mapped back before the copy.

> So "does this fix the problem", "yes" leaves me a bit confused... Not
> saying there couldn't be any problems, especially due to the thorniness
> and cross arch stride, but what is it exactly and how does this series
> fix it?

The primary goal of this series was to separate the dependencies and
make it clearer what DEBUG_PAGEALLOC and SET_DIRECT_MAP each are. As it
turned out, there is also some lack of consistency between architectures
that implement either of these, so I tried to improve that as well.

Honestly, I don't know whether a thread can be paused at the time
__vunmap() has left invalid pages, but if it can, there is an issue on
arm64 with DEBUG_PAGEALLOC=n and this set fixes it.

__vunmap()
    vm_remove_mappings()
        set_direct_map_invalid()
	/* thread is frozen */
 					safe_copy_page()	
 					    __kernel_map_pages()
						if (!debug_pagealloc())
 					    	    return
 					    do_copy_page() -> fault
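
For completeness, the pre-series hibernation path that diagram refers to
is roughly the following (paraphrased from memory of
kernel/power/snapshot.c; helper names are from the existing code, details
may differ slightly):

static void safe_copy_page(void *dst, struct page *s_page)
{
	if (kernel_page_present(s_page)) {
		do_copy_page(dst, page_address(s_page));
	} else {
		/*
		 * If the page was made invalid via set_direct_map_invalid_*()
		 * rather than by debug_pagealloc, the check above may report
		 * it as present and the remap below may be a no-op, so the
		 * copy can hit an unmapped address.
		 */
		kernel_map_pages(s_page, 1, 1);
		do_copy_page(dst, page_address(s_page));
		kernel_map_pages(s_page, 1, 0);
	}
}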

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-25 10:15 ` Mike Rapoport
                     ` (2 preceding siblings ...)
  (?)
@ 2020-10-29  8:15   ` David Hildenbrand
  -1 siblings, 0 replies; 219+ messages in thread
From: David Hildenbrand @ 2020-10-29  8:15 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Albert Ou, Andy Lutomirski, Benjamin Herrenschmidt,
	Borislav Petkov, Catalin Marinas, Christian Borntraeger,
	Christoph Lameter, David S. Miller, Dave Hansen, David Rientjes,
	Edgecombe, Rick P, H. Peter Anvin, Heiko Carstens, Ingo Molnar,
	Joonsoo Kim, Kirill A. Shutemov, Len Brown, Michael Ellerman,
	Mike Rapoport, Palmer Dabbelt, Paul Mackerras, Paul Walmsley,
	Pavel Machek, Pekka Enberg, Peter Zijlstra, Rafael J. Wysocki,
	Thomas Gleixner, Vasily Gorbik, Will Deacon, linux-arm-kernel,
	linux-kernel, linux-mm, linux-pm, linux-riscv, linux-s390,
	linuxppc-dev, sparclinux, x86

On 25.10.20 11:15, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
> 
> Hi,
> 
> During recent discussion about KVM protected memory, David raised a concern
> about usage of __kernel_map_pages() outside of DEBUG_PAGEALLOC scope [1].
> 
> Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP it is
> possible that __kernel_map_pages() would fail, but since this function is
> void, the failure will go unnoticed.
> 
> Moreover, there's lack of consistency of __kernel_map_pages() semantics
> across architectures as some guard this function with
> #ifdef DEBUG_PAGEALLOC, some refuse to update the direct map if page
> allocation debugging is disabled at run time and some allow modifying the
> direct map regardless of DEBUG_PAGEALLOC settings.
> 
> This set straightens this out by restoring dependency of
> __kernel_map_pages() on DEBUG_PAGEALLOC and updating the call sites
> accordingly.
> 

So, I was primarily wondering if we really have to touch direct mappings
in hibernation code, or if we can avoid doing that. I was wondering
whether we could simply do something like kmap() when trying to access a
!mapped page, similar to reading old-OS memory after kexec when in
kdump. Just a thought.
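
Purely to illustrate that thought: a temporary mapping instead of
editing the direct map might look something like the sketch below. This
is not part of the posted series, and whether a vmap()-style mapping is
actually usable at that point in the hibernation path is an open
question.

static void copy_unmapped_page(void *dst, struct page *page)
{
	/*
	 * Map the page at a temporary kernel address instead of touching
	 * its direct map entry.
	 */
	void *vaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);

	if (!vaddr)
		return;

	copy_page(dst, vaddr);
	vunmap(vaddr);
}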

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-29  7:54       ` Mike Rapoport
                           ` (3 preceding siblings ...)
  (?)
@ 2020-10-29 23:19         ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-29 23:19 UTC (permalink / raw)
  To: rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Thu, 2020-10-29 at 09:54 +0200, Mike Rapoport wrote:
> __kernel_map_pages() on arm64 will also bail out if rodata_full is
> false:
> void __kernel_map_pages(struct page *page, int numpages, int enable)
> {
>         if (!debug_pagealloc_enabled() && !rodata_full)
>                 return;
> 
>         set_memory_valid((unsigned long)page_address(page), numpages,
> enable);
> }
> 
> So using set_direct_map() to map back pages removed from the direct
> map
> with __kernel_map_pages() seems safe to me.

Heh, one of us must have some simple boolean error in our head. I hope
it's not me! :) I'll try one more time.

__kernel_map_pages() will bail out if rodata_full is false **AND**
debug page alloc is off. So it will only bail under conditions where
there could be nothing unmapped on the direct map.

Equivalent logic would be:
	if (!(debug_pagealloc_enabled() || rodata_full))
		return;

Or:
	if (debug_pagealloc_enabled() || rodata_full)
		set_memory_valid(blah)

So if either is on, the existing code will try to re-map. But the
set_direct_map_()'s will only work if rodata_full is on. So switching
hibernate to set_direct_map() will cause the remap to be missed for the
debug page alloc case, with !rodata_full.

It also breaks normal debug page alloc usage with !rodata_full for
similar reasons after patch 3. The pages would never get unmapped.
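
For reference, the arm64 set_direct_map_*() helpers are indeed gated on
rodata_full only; roughly (from memory of arch/arm64/mm/pageattr.c around
this time, so details may differ):

int set_direct_map_invalid_noflush(struct page *page)
{
	struct page_change_data data = {
		.set_mask = __pgprot(0),
		.clear_mask = __pgprot(PTE_VALID),
	};

	/* Silently reports success without touching the direct map. */
	if (!rodata_full)
		return 0;

	return apply_to_page_range(&init_mm,
				   (unsigned long)page_address(page),
				   PAGE_SIZE, change_page_range, &data);
}

So with DEBUG_PAGEALLOC=y and rodata_full off, __kernel_map_pages() would
still remap the page, but a set_direct_map_*()-based path would do nothing.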



^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
  2020-10-29  8:12                 ` Mike Rapoport
                                     ` (3 preceding siblings ...)
  (?)
@ 2020-10-29 23:19                   ` Edgecombe, Rick P
  -1 siblings, 0 replies; 219+ messages in thread
From: Edgecombe, Rick P @ 2020-10-29 23:19 UTC (permalink / raw)
  To: will, rppt
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, aou, kirill,
	rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm, luto,
	davem, mpe, benh, linuxppc-dev, linux-riscv, tglx, rjw, x86,
	linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Thu, 2020-10-29 at 10:12 +0200, Mike Rapoport wrote:
> This series' goal was primarily to separate dependencies and make it
> clearer what DEBUG_PAGEALLOC and SET_DIRECT_MAP are. As it turned
> out, there is also some lack of consistency between architectures
> that implement either of these, so I tried to improve this as well.
> 
> Honestly, I don't know if a thread can be paused at the time __vunmap()
> left invalid pages, but it could be; there is an issue on arm64 with
> DEBUG_PAGEALLOC=n and this set fixes it.

Ah, ok. So from this and the other thread, this is about the logic in
arm's cpa (change_page_attr) code for deciding when it will try the
un/map operations. I think the logic actually works currently. And this
series introduces a problem on ARM similar to the one you are saying
preexists. I put the details in the other thread.


^ permalink raw reply	[flat|nested] 219+ messages in thread

* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
  2020-10-29 23:19         ` Edgecombe, Rick P
                             ` (3 preceding siblings ...)
  (?)
@ 2020-11-01 17:02           ` Mike Rapoport
  -1 siblings, 0 replies; 219+ messages in thread
From: Mike Rapoport @ 2020-11-01 17:02 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: david, cl, gor, hpa, peterz, catalin.marinas, dave.hansen,
	borntraeger, penberg, linux-mm, iamjoonsoo.kim, will, aou,
	kirill, rientjes, rppt, paulus, hca, bp, pavel, sparclinux, akpm,
	luto, davem, mpe, benh, linuxppc-dev, rjw, tglx, linux-riscv,
	x86, linux-pm, linux-arm-kernel, palmer, Brown, Len, mingo,
	linux-s390, linux-kernel, paul.walmsley

On Thu, Oct 29, 2020 at 11:19:18PM +0000, Edgecombe, Rick P wrote:
> On Thu, 2020-10-29 at 09:54 +0200, Mike Rapoport wrote:
> > __kernel_map_pages() on arm64 will also bail out if rodata_full is
> > false:
> > void __kernel_map_pages(struct page *page, int numpages, int enable)
> > {
> >         if (!debug_pagealloc_enabled() && !rodata_full)
> >                 return;
> > 
> >         set_memory_valid((unsigned long)page_address(page), numpages,
> > enable);
> > }
> > 
> > So using set_direct_map() to map back pages removed from the direct
> > map
> > with __kernel_map_pages() seems safe to me.
> 
> Heh, one of us must have some simple boolean error in our head. I hope
> it's not me! :) I'll try one more time.

Well, then it's me :)
You are right, I misread this and I could not understand why
!rodata_full bothers you.

> __kernel_map_pages() will bail out if rodata_full is false **AND**
> debug page alloc is off. So it will only bail under conditions where
> there could be nothing unmapped on the direct map.
> 
> Equivalent logic would be:
> 	if (!(debug_pagealloc_enabled() || rodata_full))
> 		return;
> 
> Or:
> 	if (debug_pagealloc_enabled() || rodata_full)
> 		set_memory_valid(blah)
> 
> So if either is on, the existing code will try to re-map. But the
> set_direct_map_()'s will only work if rodata_full is on. So switching
> hibernate to set_direct_map() will cause the remap to be missed for the
> debug page alloc case, with !rodata_full.
> 
> It also breaks normal debug page alloc usage with !rodata_full for
> similar reasons after patch 3. The pages would never get unmapped.

I've updated the patches; there should be no regression now.
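
Purely to illustrate the direction (this is not necessarily what the updated
patches do; hibernate_map_page() is an assumed helper name): keeping the
debug_pagealloc path whenever it is enabled avoids the missed remap you
describe for DEBUG_PAGEALLOC=y with !rodata_full.

static void hibernate_map_page(struct page *page, int enable)
{
	if (debug_pagealloc_enabled()) {
		/* Covers the DEBUG_PAGEALLOC=y, !rodata_full case on arm64. */
		__kernel_map_pages(page, 1, enable);
	} else if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
		int ret = enable ? set_direct_map_default_noflush(page) :
				   set_direct_map_invalid_noflush(page);

		if (ret)
			pr_warn_once("Failed to remap page in the direct map\n");
	}
	/* With neither enabled, nothing can be unmapped in the direct map. */
}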

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 219+ messages in thread

end of thread, other threads:[~2020-11-01 17:04 UTC | newest]

Thread overview: 219+ messages
2020-10-25 10:15 [PATCH 0/4] arch, mm: improve robustness of direct map manipulation Mike Rapoport
2020-10-25 10:15 ` [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper Mike Rapoport
2020-10-26 11:05   ` David Hildenbrand
2020-10-26 11:54     ` Mike Rapoport
2020-10-26 11:55       ` David Hildenbrand
2020-10-25 10:15 ` [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map Mike Rapoport
2020-10-26  0:38   ` Edgecombe, Rick P
2020-10-26  9:15     ` Mike Rapoport
2020-10-26 18:57       ` Edgecombe, Rick P
2020-10-27  8:49         ` Mike Rapoport
2020-10-27 22:44           ` Edgecombe, Rick P
2020-10-28  9:41             ` Mike Rapoport
2020-10-27  1:10       ` Edgecombe, Rick P
2020-10-28 21:15   ` Edgecombe, Rick P
2020-10-29  7:54     ` Mike Rapoport
2020-10-29 23:19       ` Edgecombe, Rick P
2020-11-01 17:02         ` Mike Rapoport
2020-10-25 10:15 ` [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC Mike Rapoport
2020-10-25 10:15 ` [PATCH 4/4] arch, mm: make kernel_page_present() always available Mike Rapoport
2020-10-26  0:54   ` Edgecombe, Rick P
2020-10-26  9:31     ` Mike Rapoport
2020-10-26  1:13 ` [PATCH 0/4] arch, mm: improve robustness of direct map manipulation Edgecombe, Rick P
2020-10-26  9:05   ` Mike Rapoport
2020-10-26 18:05     ` Edgecombe, Rick P
2020-10-27  8:38       ` Mike Rapoport
2020-10-27  8:46         ` David Hildenbrand
2020-10-27  9:47           ` Mike Rapoport
2020-10-27 10:34             ` David Hildenbrand
2020-10-28 11:09           ` Mike Rapoport
2020-10-28 11:17             ` David Hildenbrand
2020-10-28 12:22               ` Mike Rapoport
2020-10-28 18:31             ` Edgecombe, Rick P
2020-10-28 11:20         ` Will Deacon
2020-10-28 11:30           ` Mike Rapoport
2020-10-28 21:03             ` Edgecombe, Rick P
2020-10-29  8:12               ` Mike Rapoport
2020-10-29 23:19                 ` Edgecombe, Rick P
2020-10-29  8:15 ` David Hildenbrand
