* [PATCH v3 0/3] arm64: improve efficiency of setting tags for user pages
@ 2021-05-12 20:09 ` Peter Collingbourne
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-12 20:09 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Peter Collingbourne, Evgenii Stepanov, linux-mm, linux-arm-kernel

Currently we can end up touching PROT_MTE user pages twice on fault
and once on unmap. On fault, with KASAN disabled, we first clear the
data and then set the tags to 0; with KASAN enabled, we simultaneously
clear the data and set the tags to the KASAN random tag, and then set
the tags again to 0. On unmap, we poison the page by setting tags, but
this is less likely to find a bug than poisoning kernel pages.

This patch series fixes these inefficiencies by touching the pages
only once on fault, using the DC GZVA instruction to clear both data
and tags, and by providing an option to avoid poisoning user pages on
free.
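
As a rough sketch of where this ends up (condensed from patch 2 below;
the surrounding fault-handling context is elided), the arm64
anonymous-fault allocation helper becomes:

struct page *__alloc_zeroed_user_highpage(gfp_t movableflags,
					  struct vm_area_struct *vma,
					  unsigned long vaddr)
{
	/*
	 * Only PROT_MTE (VM_MTE) mappings request tag zeroing, so untagged
	 * pages are not touched unnecessarily; a tagged page is then cleared
	 * once with DC GZVA instead of being zeroed and re-tagged later by
	 * set_pte_at().
	 */
	if (vma->vm_flags & VM_MTE)
		movableflags |= __GFP_ZEROTAGS;

	return alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags,
			      vma, vaddr);
}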

Peter Collingbourne (3):
  kasan: use separate (un)poison implementation for integrated init
  arm64: mte: handle tags zeroing at page allocation time
  kasan: allow freed user page poisoning to be disabled with HW tags

 arch/arm64/include/asm/mte.h   |  4 ++
 arch/arm64/include/asm/page.h  |  9 ++++-
 arch/arm64/lib/mte.S           | 20 ++++++++++
 arch/arm64/mm/fault.c          | 25 +++++++++++++
 arch/arm64/mm/proc.S           | 10 +++--
 include/linux/gfp.h            | 18 +++++++--
 include/linux/highmem.h        |  8 ++++
 include/linux/kasan.h          | 66 ++++++++++++++++++++-------------
 include/linux/page-flags.h     |  9 +++++
 include/trace/events/mmflags.h |  9 ++++-
 mm/kasan/common.c              |  4 +-
 mm/kasan/hw_tags.c             | 31 ++++++++++++++++
 mm/mempool.c                   |  6 ++-
 mm/page_alloc.c                | 67 ++++++++++++++++++++--------------
 14 files changed, 221 insertions(+), 65 deletions(-)

-- 
2.31.1.607.g51e8a6a459-goog



* [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-12 20:09 ` Peter Collingbourne
@ 2021-05-12 20:09   ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-12 20:09 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Peter Collingbourne, Evgenii Stepanov, linux-mm, linux-arm-kernel

Currently, with integrated init, page_alloc.c needs to know whether
kasan_alloc_pages() will zero-initialize memory, but this will become
more complicated once we start adding tag initialization support for
user pages. To avoid page_alloc.c needing to know more details of what
integrated init will do, move the unpoisoning logic for integrated init
into the HW tags implementation. Currently the logic is identical, but
it will diverge in subsequent patches.

For symmetry, do the same for poisoning, although that logic will be
unaffected by subsequent patches.
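
Concretely, the allocation-side hook in page_alloc.c ends up split as
below (a condensed sketch of the post_alloc_hook() change in this
patch; unrelated setup is omitted):

	if (kasan_has_integrated_init()) {
		/* HW tags mode: the KASAN hook decides how to initialize. */
		kasan_alloc_pages(page, order, gfp_flags);
	} else {
		/* Other modes: keep the explicit unpoison + zeroing here. */
		bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags);

		kasan_unpoison_pages(page, order, init);
		if (init)
			kernel_init_free_pages(page, 1 << order);
	}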

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I2c550234c6c4a893c48c18ff0c6ce658c7c67056
---
v3:
- use BUILD_BUG()

v2:
- fix build with KASAN disabled

 include/linux/kasan.h | 66 +++++++++++++++++++++++++++----------------
 mm/kasan/common.c     |  4 +--
 mm/kasan/hw_tags.c    | 14 +++++++++
 mm/mempool.c          |  6 ++--
 mm/page_alloc.c       | 56 +++++++++++++++++++-----------------
 5 files changed, 91 insertions(+), 55 deletions(-)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index b1678a61e6a7..38061673e6ac 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KASAN_H
 #define _LINUX_KASAN_H
 
+#include <linux/bug.h>
 #include <linux/static_key.h>
 #include <linux/types.h>
 
@@ -79,14 +80,6 @@ static inline void kasan_disable_current(void) {}
 
 #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
 
-#ifdef CONFIG_KASAN
-
-struct kasan_cache {
-	int alloc_meta_offset;
-	int free_meta_offset;
-	bool is_kmalloc;
-};
-
 #ifdef CONFIG_KASAN_HW_TAGS
 
 DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled);
@@ -101,11 +94,18 @@ static inline bool kasan_has_integrated_init(void)
 	return kasan_enabled();
 }
 
+void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
+void kasan_free_pages(struct page *page, unsigned int order);
+
 #else /* CONFIG_KASAN_HW_TAGS */
 
 static inline bool kasan_enabled(void)
 {
+#ifdef CONFIG_KASAN
 	return true;
+#else
+	return false;
+#endif
 }
 
 static inline bool kasan_has_integrated_init(void)
@@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
 	return false;
 }
 
+static __always_inline void kasan_alloc_pages(struct page *page,
+					      unsigned int order, gfp_t flags)
+{
+	/* Only available for integrated init. */
+	BUILD_BUG();
+}
+
+static __always_inline void kasan_free_pages(struct page *page,
+					     unsigned int order)
+{
+	/* Only available for integrated init. */
+	BUILD_BUG();
+}
+
 #endif /* CONFIG_KASAN_HW_TAGS */
 
+#ifdef CONFIG_KASAN
+
+struct kasan_cache {
+	int alloc_meta_offset;
+	int free_meta_offset;
+	bool is_kmalloc;
+};
+
 slab_flags_t __kasan_never_merge(void);
 static __always_inline slab_flags_t kasan_never_merge(void)
 {
@@ -130,20 +152,20 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
 		__kasan_unpoison_range(addr, size);
 }
 
-void __kasan_alloc_pages(struct page *page, unsigned int order, bool init);
-static __always_inline void kasan_alloc_pages(struct page *page,
+void __kasan_poison_pages(struct page *page, unsigned int order, bool init);
+static __always_inline void kasan_poison_pages(struct page *page,
 						unsigned int order, bool init)
 {
 	if (kasan_enabled())
-		__kasan_alloc_pages(page, order, init);
+		__kasan_poison_pages(page, order, init);
 }
 
-void __kasan_free_pages(struct page *page, unsigned int order, bool init);
-static __always_inline void kasan_free_pages(struct page *page,
-						unsigned int order, bool init)
+void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init);
+static __always_inline void kasan_unpoison_pages(struct page *page,
+						 unsigned int order, bool init)
 {
 	if (kasan_enabled())
-		__kasan_free_pages(page, order, init);
+		__kasan_unpoison_pages(page, order, init);
 }
 
 void __kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
@@ -285,21 +307,15 @@ void kasan_restore_multi_shot(bool enabled);
 
 #else /* CONFIG_KASAN */
 
-static inline bool kasan_enabled(void)
-{
-	return false;
-}
-static inline bool kasan_has_integrated_init(void)
-{
-	return false;
-}
 static inline slab_flags_t kasan_never_merge(void)
 {
 	return 0;
 }
 static inline void kasan_unpoison_range(const void *address, size_t size) {}
-static inline void kasan_alloc_pages(struct page *page, unsigned int order, bool init) {}
-static inline void kasan_free_pages(struct page *page, unsigned int order, bool init) {}
+static inline void kasan_poison_pages(struct page *page, unsigned int order,
+				      bool init) {}
+static inline void kasan_unpoison_pages(struct page *page, unsigned int order,
+					bool init) {}
 static inline void kasan_cache_create(struct kmem_cache *cache,
 				      unsigned int *size,
 				      slab_flags_t *flags) {}
diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 6bb87f2acd4e..0ecd293af344 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -97,7 +97,7 @@ slab_flags_t __kasan_never_merge(void)
 	return 0;
 }
 
-void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
+void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init)
 {
 	u8 tag;
 	unsigned long i;
@@ -111,7 +111,7 @@ void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
 	kasan_unpoison(page_address(page), PAGE_SIZE << order, init);
 }
 
-void __kasan_free_pages(struct page *page, unsigned int order, bool init)
+void __kasan_poison_pages(struct page *page, unsigned int order, bool init)
 {
 	if (likely(!PageHighMem(page)))
 		kasan_poison(page_address(page), PAGE_SIZE << order,
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 4004388b4e4b..45e552cb9172 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -238,6 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
 	return &alloc_meta->free_track[0];
 }
 
+void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
+{
+	bool init = !want_init_on_free() && want_init_on_alloc(flags);
+
+	kasan_unpoison_pages(page, order, init);
+}
+
+void kasan_free_pages(struct page *page, unsigned int order)
+{
+	bool init = want_init_on_free();
+
+	kasan_poison_pages(page, order, init);
+}
+
 #if IS_ENABLED(CONFIG_KASAN_KUNIT_TEST)
 
 void kasan_set_tagging_report_once(bool state)
diff --git a/mm/mempool.c b/mm/mempool.c
index a258cf4de575..0b8afbec3e35 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -106,7 +106,8 @@ static __always_inline void kasan_poison_element(mempool_t *pool, void *element)
 	if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
 		kasan_slab_free_mempool(element);
 	else if (pool->alloc == mempool_alloc_pages)
-		kasan_free_pages(element, (unsigned long)pool->pool_data, false);
+		kasan_poison_pages(element, (unsigned long)pool->pool_data,
+				   false);
 }
 
 static void kasan_unpoison_element(mempool_t *pool, void *element)
@@ -114,7 +115,8 @@ static void kasan_unpoison_element(mempool_t *pool, void *element)
 	if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
 		kasan_unpoison_range(element, __ksize(element));
 	else if (pool->alloc == mempool_alloc_pages)
-		kasan_alloc_pages(element, (unsigned long)pool->pool_data, false);
+		kasan_unpoison_pages(element, (unsigned long)pool->pool_data,
+				     false);
 }
 
 static __always_inline void add_element(mempool_t *pool, void *element)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aaa1655cf682..6e82a7f6fd6f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -382,7 +382,7 @@ int page_group_by_mobility_disabled __read_mostly;
 static DEFINE_STATIC_KEY_TRUE(deferred_pages);
 
 /*
- * Calling kasan_free_pages() only after deferred memory initialization
+ * Calling kasan_poison_pages() only after deferred memory initialization
  * has completed. Poisoning pages during deferred memory init will greatly
  * lengthen the process and cause problem in large memory systems as the
  * deferred pages initialization is done with interrupt disabled.
@@ -394,15 +394,11 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
  * on-demand allocation and then freed again before the deferred pages
  * initialization is done, but this is not likely to happen.
  */
-static inline void kasan_free_nondeferred_pages(struct page *page, int order,
-						bool init, fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
 {
-	if (static_branch_unlikely(&deferred_pages))
-		return;
-	if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-			(fpi_flags & FPI_SKIP_KASAN_POISON))
-		return;
-	kasan_free_pages(page, order, init);
+	return static_branch_unlikely(&deferred_pages) ||
+	       (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
+		(fpi_flags & FPI_SKIP_KASAN_POISON));
 }
 
 /* Returns true if the struct page for the pfn is uninitialised */
@@ -453,13 +449,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 	return false;
 }
 #else
-static inline void kasan_free_nondeferred_pages(struct page *page, int order,
-						bool init, fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
 {
-	if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-			(fpi_flags & FPI_SKIP_KASAN_POISON))
-		return;
-	kasan_free_pages(page, order, init);
+	return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
+		(fpi_flags & FPI_SKIP_KASAN_POISON));
 }
 
 static inline bool early_page_uninitialised(unsigned long pfn)
@@ -1245,7 +1238,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 			unsigned int order, bool check_free, fpi_t fpi_flags)
 {
 	int bad = 0;
-	bool init;
+	bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
@@ -1314,10 +1307,17 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	 * With hardware tag-based KASAN, memory tags must be set before the
 	 * page becomes unavailable via debug_pagealloc or arch_free_page.
 	 */
-	init = want_init_on_free();
-	if (init && !kasan_has_integrated_init())
-		kernel_init_free_pages(page, 1 << order);
-	kasan_free_nondeferred_pages(page, order, init, fpi_flags);
+	if (kasan_has_integrated_init()) {
+		if (!skip_kasan_poison)
+			kasan_free_pages(page, order);
+	} else {
+		bool init = want_init_on_free();
+
+		if (init)
+			kernel_init_free_pages(page, 1 << order);
+		if (!skip_kasan_poison)
+			kasan_poison_pages(page, order, init);
+	}
 
 	/*
 	 * arch_free_page() can make the page's contents inaccessible.  s390
@@ -2324,8 +2324,6 @@ static bool check_new_pages(struct page *page, unsigned int order)
 inline void post_alloc_hook(struct page *page, unsigned int order,
 				gfp_t gfp_flags)
 {
-	bool init;
-
 	set_page_private(page, 0);
 	set_page_refcounted(page);
 
@@ -2344,10 +2342,16 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	 * kasan_alloc_pages and kernel_init_free_pages must be
 	 * kept together to avoid discrepancies in behavior.
 	 */
-	init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
-	kasan_alloc_pages(page, order, init);
-	if (init && !kasan_has_integrated_init())
-		kernel_init_free_pages(page, 1 << order);
+	if (kasan_has_integrated_init()) {
+		kasan_alloc_pages(page, order, gfp_flags);
+	} else {
+		bool init =
+			!want_init_on_free() && want_init_on_alloc(gfp_flags);
+
+		kasan_unpoison_pages(page, order, init);
+		if (init)
+			kernel_init_free_pages(page, 1 << order);
+	}
 
 	set_page_owner(page, order, gfp_flags);
 }
-- 
2.31.1.607.g51e8a6a459-goog



* [PATCH v3 2/3] arm64: mte: handle tags zeroing at page allocation time
  2021-05-12 20:09 ` Peter Collingbourne
@ 2021-05-12 20:09   ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-12 20:09 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Peter Collingbourne, Evgenii Stepanov, linux-mm, linux-arm-kernel

Currently, on an anonymous page fault, the kernel allocates a zeroed
page and maps it in user space. If the mapping is tagged (PROT_MTE),
set_pte_at() additionally clears the tags. It is, however, more
efficient to clear the tags at the same time as zeroing the data on
allocation. To avoid clearing the tags on any page (which may not be
mapped as tagged), only do this if the vma flags contain VM_MTE. This
requires introducing a new GFP flag that is used to determine whether
to clear the tags.
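
With HW tags KASAN enabled, the new flag is consumed by the
integrated-init hook roughly as follows (condensed from the
mm/kasan/hw_tags.c hunk in this patch):

void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
{
	bool init = !want_init_on_free() && want_init_on_alloc(flags);

	if (flags & __GFP_ZEROTAGS) {
		int i;

		/* Zero the data and tags together, one page at a time. */
		for (i = 0; i != 1 << order; ++i)
			tag_clear_highpage(page + i);
	} else {
		kasan_unpoison_pages(page, order, init);
	}
}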

The DC GZVA instruction with a 0 top byte (and 0 tag) requires
top-byte-ignore. Set the TCR_EL1.{TBI1,TBID1} bits irrespective of
whether KASAN_HW_TAGS is enabled.
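
For reference, the new zeroing routine behaves roughly like the
following C rendering (illustrative only: read_dczid_el0() and
dc_gzva() are hypothetical stand-ins for the MRS and DC GZVA
instructions, and the real implementation is the assembly added to
arch/arm64/lib/mte.S below):

static void mte_zero_clear_page_tags_sketch(void *addr)
{
	/*
	 * DCZID_EL0.BS is log2 of the DC {G}ZVA block size in 4-byte words,
	 * so the block size in bytes is 4 << BS.
	 */
	unsigned long block = 4UL << (read_dczid_el0() & 0xf);
	/* Use a 0 address tag so DC GZVA sets the allocation tags to 0. */
	unsigned long p = (unsigned long)addr & ((1UL << MTE_TAG_SHIFT) - 1);

	do {
		dc_gzva(p);		/* zero one block of data and its tags */
		p += block;
	} while (p & (PAGE_SIZE - 1));	/* stop at the next page boundary */
}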

Signed-off-by: Peter Collingbourne <pcc@google.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://linux-review.googlesource.com/id/Id46dc94e30fe11474f7e54f5d65e7658dbdddb26
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
v2:
- remove want_zero_tags_on_free()

 arch/arm64/include/asm/mte.h  |  4 ++++
 arch/arm64/include/asm/page.h |  9 +++++++--
 arch/arm64/lib/mte.S          | 20 ++++++++++++++++++++
 arch/arm64/mm/fault.c         | 25 +++++++++++++++++++++++++
 arch/arm64/mm/proc.S          | 10 +++++++---
 include/linux/gfp.h           |  9 +++++++--
 include/linux/highmem.h       |  8 ++++++++
 mm/kasan/hw_tags.c            |  9 ++++++++-
 mm/page_alloc.c               | 13 ++++++++++---
 9 files changed, 96 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index bc88a1ced0d7..67bf259ae768 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -37,6 +37,7 @@ void mte_free_tag_storage(char *storage);
 /* track which pages have valid allocation tags */
 #define PG_mte_tagged	PG_arch_2
 
+void mte_zero_clear_page_tags(void *addr);
 void mte_sync_tags(pte_t *ptep, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
 void mte_thread_init_user(void);
@@ -53,6 +54,9 @@ int mte_ptrace_copy_tags(struct task_struct *child, long request,
 /* unused if !CONFIG_ARM64_MTE, silence the compiler */
 #define PG_mte_tagged	0
 
+static inline void mte_zero_clear_page_tags(void *addr)
+{
+}
 static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
 {
 }
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..448e14071d13 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -13,6 +13,7 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/personality.h> /* for READ_IMPLIES_EXEC */
+#include <linux/types.h> /* for gfp_t */
 #include <asm/pgtable-types.h>
 
 struct page;
@@ -28,10 +29,14 @@ void copy_user_highpage(struct page *to, struct page *from,
 void copy_highpage(struct page *to, struct page *from);
 #define __HAVE_ARCH_COPY_HIGHPAGE
 
-#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
-	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
+struct page *__alloc_zeroed_user_highpage(gfp_t movableflags,
+					  struct vm_area_struct *vma,
+					  unsigned long vaddr);
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
+void tag_clear_highpage(struct page *to);
+#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
+
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
index 351537c12f36..e83643b3995f 100644
--- a/arch/arm64/lib/mte.S
+++ b/arch/arm64/lib/mte.S
@@ -36,6 +36,26 @@ SYM_FUNC_START(mte_clear_page_tags)
 	ret
 SYM_FUNC_END(mte_clear_page_tags)
 
+/*
+ * Zero the page and tags at the same time
+ *
+ * Parameters:
+ *	x0 - address to the beginning of the page
+ */
+SYM_FUNC_START(mte_zero_clear_page_tags)
+	mrs	x1, dczid_el0
+	and	w1, w1, #0xf
+	mov	x2, #4
+	lsl	x1, x2, x1
+	and	x0, x0, #(1 << MTE_TAG_SHIFT) - 1	// clear the tag
+
+1:	dc	gzva, x0
+	add	x0, x0, x1
+	tst	x0, #(PAGE_SIZE - 1)
+	b.ne	1b
+	ret
+SYM_FUNC_END(mte_zero_clear_page_tags)
+
 /*
  * Copy the tags from the source page to the destination one
  *   x0 - address of the destination page
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 871c82ab0a30..8127e0c0b8fb 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -921,3 +921,28 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
 	debug_exception_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug_exception);
+
+/*
+ * Used during anonymous page fault handling.
+ */
+struct page *__alloc_zeroed_user_highpage(gfp_t flags,
+					  struct vm_area_struct *vma,
+					  unsigned long vaddr)
+{
+	/*
+	 * If the page is mapped with PROT_MTE, initialise the tags at the
+	 * point of allocation and page zeroing as this is usually faster than
+	 * separate DC ZVA and STGM.
+	 */
+	if (vma->vm_flags & VM_MTE)
+		flags |= __GFP_ZEROTAGS;
+
+	return alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | flags, vma, vaddr);
+}
+
+void tag_clear_highpage(struct page *page)
+{
+	mte_zero_clear_page_tags(page_address(page));
+	page_kasan_tag_reset(page);
+	set_bit(PG_mte_tagged, &page->flags);
+}
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 0a48191534ff..a27c77dbe91c 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -46,9 +46,13 @@
 #endif
 
 #ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_KASAN_HW_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
+#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
 #else
-#define TCR_KASAN_HW_FLAGS 0
+/*
+ * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
+ * TBI being enabled at EL1.
+ */
+#define TCR_MTE_FLAGS TCR_TBI1 | TCR_TBID1
 #endif
 
 /*
@@ -452,7 +456,7 @@ SYM_FUNC_START(__cpu_setup)
 	msr_s	SYS_TFSRE0_EL1, xzr
 
 	/* set the TCR_EL1 bits */
-	mov_q	x10, TCR_KASAN_HW_FLAGS
+	mov_q	x10, TCR_MTE_FLAGS
 	orr	tcr, tcr, x10
 1:
 #endif
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 11da8af06704..68ba237365dc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -53,8 +53,9 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x100000u
 #define ___GFP_THISNODE		0x200000u
 #define ___GFP_ACCOUNT		0x400000u
+#define ___GFP_ZEROTAGS		0x800000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP	0x800000u
+#define ___GFP_NOLOCKDEP	0x1000000u
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
@@ -229,16 +230,20 @@ struct vm_area_struct;
  * %__GFP_COMP address compound page metadata.
  *
  * %__GFP_ZERO returns a zeroed page on success.
+ *
+ * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
+ * __GFP_ZERO is set.
  */
 #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
+#define __GFP_ZEROTAGS	((__force gfp_t)___GFP_ZEROTAGS)
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 832b49b50c7b..caaa62e1dd24 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -204,6 +204,14 @@ static inline void clear_highpage(struct page *page)
 	kunmap_atomic(kaddr);
 }
 
+#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
+
+static inline void tag_clear_highpage(struct page *page)
+{
+}
+
+#endif
+
 /*
  * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
  * If we pass in a head page, we can zero up to the size of the compound page.
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 45e552cb9172..34362c8d0955 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -242,7 +242,14 @@ void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(flags);
 
-	kasan_unpoison_pages(page, order, init);
+	if (flags & __GFP_ZEROTAGS) {
+		int i;
+
+		for (i = 0; i != 1 << order; ++i)
+			tag_clear_highpage(page + i);
+	} else {
+		kasan_unpoison_pages(page, order, init);
+	}
 }
 
 void kasan_free_pages(struct page *page, unsigned int order)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e82a7f6fd6f..24e6f668ef73 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1219,10 +1219,16 @@ static int free_tail_pages_check(struct page *head_page, struct page *page)
 	return ret;
 }
 
-static void kernel_init_free_pages(struct page *page, int numpages)
+static void kernel_init_free_pages(struct page *page, int numpages, bool zero_tags)
 {
 	int i;
 
+	if (zero_tags) {
+		for (i = 0; i < numpages; i++)
+			tag_clear_highpage(page + i);
+		return;
+	}
+
 	/* s390's use of memset() could override KASAN redzones. */
 	kasan_disable_current();
 	for (i = 0; i < numpages; i++) {
@@ -1314,7 +1320,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 		bool init = want_init_on_free();
 
 		if (init)
-			kernel_init_free_pages(page, 1 << order);
+			kernel_init_free_pages(page, 1 << order, false);
 		if (!skip_kasan_poison)
 			kasan_poison_pages(page, order, init);
 	}
@@ -2350,7 +2356,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 
 		kasan_unpoison_pages(page, order, init);
 		if (init)
-			kernel_init_free_pages(page, 1 << order);
+			kernel_init_free_pages(page, 1 << order,
+					       gfp_flags & __GFP_ZEROTAGS);
 	}
 
 	set_page_owner(page, order, gfp_flags);
-- 
2.31.1.607.g51e8a6a459-goog



* [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags
  2021-05-12 20:09 ` Peter Collingbourne
@ 2021-05-12 20:09   ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-12 20:09 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Peter Collingbourne, Evgenii Stepanov, linux-mm, linux-arm-kernel

Poisoning freed pages protects against kernel use-after-free. The
likelihood of such a bug involving kernel pages is significantly higher
than that for user pages. At the same time, poisoning freed pages can
impose a significant performance cost, which cannot always be justified
for user pages given the lower probability of finding a bug. Therefore,
make it possible to configure the kernel to disable freed user page
poisoning when using HW tags, via the new kasan.skip_user_poison_on_free
command line option.
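
For example, assuming the option lands as posted here (the parameter is
parsed with kstrtobool, so the usual boolean spellings are accepted),
freed user page poisoning would be skipped by booting a
CONFIG_KASAN_HW_TAGS kernel with:

  kasan.skip_user_poison_on_free=1

GFP_USER allocations then carry the new __GFP_SKIP_KASAN_POISON flag,
which kasan_alloc_pages() records on the page via PG_skip_kasan_poison
so that the free path can skip poisoning it.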

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I716846e2de8ef179f44e835770df7e6307be96c9
---
 include/linux/gfp.h            | 13 ++++++++++---
 include/linux/page-flags.h     |  9 +++++++++
 include/trace/events/mmflags.h |  9 ++++++++-
 mm/kasan/hw_tags.c             | 10 ++++++++++
 mm/page_alloc.c                | 12 +++++++-----
 5 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 68ba237365dc..9a77e5660b07 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -54,8 +54,9 @@ struct vm_area_struct;
 #define ___GFP_THISNODE		0x200000u
 #define ___GFP_ACCOUNT		0x400000u
 #define ___GFP_ZEROTAGS		0x800000u
+#define ___GFP_SKIP_KASAN_POISON	0x1000000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP	0x1000000u
+#define ___GFP_NOLOCKDEP	0x2000000u
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
@@ -233,17 +234,22 @@ struct vm_area_struct;
  *
  * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
  * __GFP_ZERO is set.
+ *
+ * %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
+ * on deallocation. Typically used for userspace pages. Currently only has an
+ * effect in HW tags mode, and only if a command line option is set.
  */
 #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
 #define __GFP_ZEROTAGS	((__force gfp_t)___GFP_ZEROTAGS)
+#define __GFP_SKIP_KASAN_POISON	((__force gfp_t)___GFP_SKIP_KASAN_POISON)
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
@@ -320,7 +326,8 @@ struct vm_area_struct;
 #define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)
 #define GFP_NOIO	(__GFP_RECLAIM)
 #define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)
-#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
+#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | \
+			 __GFP_HARDWALL | __GFP_SKIP_KASAN_POISON)
 #define GFP_DMA		__GFP_DMA
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..40e2c5000585 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -137,6 +137,9 @@ enum pageflags {
 #endif
 #ifdef CONFIG_64BIT
 	PG_arch_2,
+#endif
+#ifdef CONFIG_KASAN_HW_TAGS
+	PG_skip_kasan_poison,
 #endif
 	__NR_PAGEFLAGS,
 
@@ -443,6 +446,12 @@ TESTCLEARFLAG(Young, young, PF_ANY)
 PAGEFLAG(Idle, idle, PF_ANY)
 #endif
 
+#ifdef CONFIG_KASAN_HW_TAGS
+PAGEFLAG(SkipKASanPoison, skip_kasan_poison, PF_HEAD)
+#else
+PAGEFLAG_FALSE(SkipKASanPoison)
+#endif
+
 /*
  * PageReported() is used to track reported free pages within the Buddy
  * allocator. We can use the non-atomic version of the test and set
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 629c7a0eaff2..390270e00a1d 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -85,6 +85,12 @@
 #define IF_HAVE_PG_ARCH_2(flag,string)
 #endif
 
+#ifdef CONFIG_KASAN_HW_TAGS
+#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string) ,{1UL << flag, string}
+#else
+#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string)
+#endif
+
 #define __def_pageflag_names						\
 	{1UL << PG_locked,		"locked"	},		\
 	{1UL << PG_waiters,		"waiters"	},		\
@@ -112,7 +118,8 @@ IF_HAVE_PG_UNCACHED(PG_uncached,	"uncached"	)		\
 IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	)		\
 IF_HAVE_PG_IDLE(PG_young,		"young"		)		\
 IF_HAVE_PG_IDLE(PG_idle,		"idle"		)		\
-IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)
+IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)		\
+IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
 
 #define show_page_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 34362c8d0955..954d5c2f7683 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -238,10 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
 	return &alloc_meta->free_track[0];
 }
 
+static bool skip_user_poison_on_free;
+static int __init skip_user_poison_on_free_param(char *buf)
+{
+	return kstrtobool(buf, &skip_user_poison_on_free);
+}
+early_param("kasan.skip_user_poison_on_free", skip_user_poison_on_free_param);
+
 void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(flags);
 
+	if (skip_user_poison_on_free && (flags & __GFP_SKIP_KASAN_POISON))
+		SetPageSkipKASanPoison(page);
+
 	if (flags & __GFP_ZEROTAGS) {
 		int i;
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 24e6f668ef73..2c3ac15ddd54 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -394,11 +394,12 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
  * on-demand allocation and then freed again before the deferred pages
  * initialization is done, but this is not likely to happen.
  */
-static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
 {
 	return static_branch_unlikely(&deferred_pages) ||
 	       (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-		(fpi_flags & FPI_SKIP_KASAN_POISON));
+		(fpi_flags & FPI_SKIP_KASAN_POISON)) ||
+	       PageSkipKASanPoison(page);
 }
 
 /* Returns true if the struct page for the pfn is uninitialised */
@@ -449,10 +450,11 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 	return false;
 }
 #else
-static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
 {
 	return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-		(fpi_flags & FPI_SKIP_KASAN_POISON));
+		(fpi_flags & FPI_SKIP_KASAN_POISON)) ||
+	       PageSkipKASanPoison(page);
 }
 
 static inline bool early_page_uninitialised(unsigned long pfn)
@@ -1244,7 +1246,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 			unsigned int order, bool check_free, fpi_t fpi_flags)
 {
 	int bad = 0;
-	bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
+	bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
-- 
2.31.1.607.g51e8a6a459-goog



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags
@ 2021-05-12 20:09   ` Peter Collingbourne
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-12 20:09 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Peter Collingbourne, Evgenii Stepanov, linux-mm, linux-arm-kernel

Poisoning freed pages protects against kernel use-after-free. The
likelihood of such a bug involving kernel pages is significantly higher
than that for user pages. At the same time, poisoning freed pages can
impose a significant performance cost, which cannot always be justified
for user pages given the lower probability of finding a bug. Therefore,
when using HW tags, make it possible to configure the kernel to disable
freed user page poisoning via the new kasan.skip_user_poison_on_free
command line option.
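
(Illustration only, not part of the original patch: with this option, freed
user page poisoning would be disabled by booting with

	kasan.skip_user_poison_on_free=1

on the kernel command line; the value is parsed with kstrtobool(), so "y"
and "on" work as well.)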

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I716846e2de8ef179f44e835770df7e6307be96c9
---
 include/linux/gfp.h            | 13 ++++++++++---
 include/linux/page-flags.h     |  9 +++++++++
 include/trace/events/mmflags.h |  9 ++++++++-
 mm/kasan/hw_tags.c             | 10 ++++++++++
 mm/page_alloc.c                | 12 +++++++-----
 5 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 68ba237365dc..9a77e5660b07 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -54,8 +54,9 @@ struct vm_area_struct;
 #define ___GFP_THISNODE		0x200000u
 #define ___GFP_ACCOUNT		0x400000u
 #define ___GFP_ZEROTAGS		0x800000u
+#define ___GFP_SKIP_KASAN_POISON	0x1000000u
 #ifdef CONFIG_LOCKDEP
-#define ___GFP_NOLOCKDEP	0x1000000u
+#define ___GFP_NOLOCKDEP	0x2000000u
 #else
 #define ___GFP_NOLOCKDEP	0
 #endif
@@ -233,17 +234,22 @@ struct vm_area_struct;
  *
  * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
  * __GFP_ZERO is set.
+ *
+ * %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
+ * on deallocation. Typically used for userspace pages. Currently only has an
+ * effect in HW tags mode, and only if a command line option is set.
  */
 #define __GFP_NOWARN	((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP	((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO	((__force gfp_t)___GFP_ZERO)
 #define __GFP_ZEROTAGS	((__force gfp_t)___GFP_ZEROTAGS)
+#define __GFP_SKIP_KASAN_POISON	((__force gfp_t)___GFP_SKIP_KASAN_POISON)
 
 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
 
 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
 
 /**
@@ -320,7 +326,8 @@ struct vm_area_struct;
 #define GFP_NOWAIT	(__GFP_KSWAPD_RECLAIM)
 #define GFP_NOIO	(__GFP_RECLAIM)
 #define GFP_NOFS	(__GFP_RECLAIM | __GFP_IO)
-#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
+#define GFP_USER	(__GFP_RECLAIM | __GFP_IO | __GFP_FS | \
+			 __GFP_HARDWALL | __GFP_SKIP_KASAN_POISON)
 #define GFP_DMA		__GFP_DMA
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..40e2c5000585 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -137,6 +137,9 @@ enum pageflags {
 #endif
 #ifdef CONFIG_64BIT
 	PG_arch_2,
+#endif
+#ifdef CONFIG_KASAN_HW_TAGS
+	PG_skip_kasan_poison,
 #endif
 	__NR_PAGEFLAGS,
 
@@ -443,6 +446,12 @@ TESTCLEARFLAG(Young, young, PF_ANY)
 PAGEFLAG(Idle, idle, PF_ANY)
 #endif
 
+#ifdef CONFIG_KASAN_HW_TAGS
+PAGEFLAG(SkipKASanPoison, skip_kasan_poison, PF_HEAD)
+#else
+PAGEFLAG_FALSE(SkipKASanPoison)
+#endif
+
 /*
  * PageReported() is used to track reported free pages within the Buddy
  * allocator. We can use the non-atomic version of the test and set
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 629c7a0eaff2..390270e00a1d 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -85,6 +85,12 @@
 #define IF_HAVE_PG_ARCH_2(flag,string)
 #endif
 
+#ifdef CONFIG_KASAN_HW_TAGS
+#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string) ,{1UL << flag, string}
+#else
+#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string)
+#endif
+
 #define __def_pageflag_names						\
 	{1UL << PG_locked,		"locked"	},		\
 	{1UL << PG_waiters,		"waiters"	},		\
@@ -112,7 +118,8 @@ IF_HAVE_PG_UNCACHED(PG_uncached,	"uncached"	)		\
 IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	)		\
 IF_HAVE_PG_IDLE(PG_young,		"young"		)		\
 IF_HAVE_PG_IDLE(PG_idle,		"idle"		)		\
-IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)
+IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)		\
+IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
 
 #define show_page_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 34362c8d0955..954d5c2f7683 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -238,10 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
 	return &alloc_meta->free_track[0];
 }
 
+static bool skip_user_poison_on_free;
+static int __init skip_user_poison_on_free_param(char *buf)
+{
+	return kstrtobool(buf, &skip_user_poison_on_free);
+}
+early_param("kasan.skip_user_poison_on_free", skip_user_poison_on_free_param);
+
 void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
 {
 	bool init = !want_init_on_free() && want_init_on_alloc(flags);
 
+	if (skip_user_poison_on_free && (flags & __GFP_SKIP_KASAN_POISON))
+		SetPageSkipKASanPoison(page);
+
 	if (flags & __GFP_ZEROTAGS) {
 		int i;
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 24e6f668ef73..2c3ac15ddd54 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -394,11 +394,12 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
  * on-demand allocation and then freed again before the deferred pages
  * initialization is done, but this is not likely to happen.
  */
-static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
 {
 	return static_branch_unlikely(&deferred_pages) ||
 	       (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-		(fpi_flags & FPI_SKIP_KASAN_POISON));
+		(fpi_flags & FPI_SKIP_KASAN_POISON)) ||
+	       PageSkipKASanPoison(page);
 }
 
 /* Returns true if the struct page for the pfn is uninitialised */
@@ -449,10 +450,11 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 	return false;
 }
 #else
-static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
+static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
 {
 	return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
-		(fpi_flags & FPI_SKIP_KASAN_POISON));
+		(fpi_flags & FPI_SKIP_KASAN_POISON)) ||
+	       PageSkipKASanPoison(page);
 }
 
 static inline bool early_page_uninitialised(unsigned long pfn)
@@ -1244,7 +1246,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
 			unsigned int order, bool check_free, fpi_t fpi_flags)
 {
 	int bad = 0;
-	bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
+	bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
-- 
2.31.1.607.g51e8a6a459-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/3] arm64: improve efficiency of setting tags for user pages
  2021-05-12 20:09 ` Peter Collingbourne
@ 2021-05-25 19:03   ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-25 19:03 UTC (permalink / raw)
  To: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton
  Cc: Evgenii Stepanov, Linux Memory Management List, Linux ARM, Marco Elver

On Wed, May 12, 2021 at 1:09 PM Peter Collingbourne <pcc@google.com> wrote:
>
> Currently we can end up touching PROT_MTE user pages twice on fault
> and once on unmap. On fault, with KASAN disabled we first clear data
> and then set tags to 0, and with KASAN enabled we simultaneously
> clear data and set tags to the KASAN random tag, and then set tags
> again to 0. On unmap, we poison the page by setting tags, but this
> is less likely to find a bug than poisoning kernel pages.
>
> This patch series fixes these inefficiencies by only touching the pages
> once on fault using the DC GZVA instruction to clear both data and
> tags, and providing the option to avoid poisoning user pages on free.
>
> Peter Collingbourne (3):
>   kasan: use separate (un)poison implementation for integrated init
>   arm64: mte: handle tags zeroing at page allocation time
>   kasan: allow freed user page poisoning to be disabled with HW tags

Thanks, Catalin, for reviewing patch 2. Could someone on the KASAN side
please take a look at patches 1 and 3?

I imagine that we'll want this series to go in via the mm tree once it's ready.

Peter


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-12 20:09   ` Peter Collingbourne
@ 2021-05-25 22:00     ` Andrey Konovalov
  -1 siblings, 0 replies; 28+ messages in thread
From: Andrey Konovalov @ 2021-05-25 22:00 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	linux-arm-kernel

On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
>
> Currently with integrated init page_alloc.c needs to know whether
> kasan_alloc_pages() will zero initialize memory, but this will start
> becoming more complicated once we start adding tag initialization
> support for user pages. To avoid page_alloc.c needing to know more
> details of what integrated init will do, move the unpoisoning logic
> for integrated init into the HW tags implementation. Currently the
> logic is identical but it will diverge in subsequent patches.
>
> For symmetry do the same for poisoning although this logic will
> be unaffected by subsequent patches.
>
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/I2c550234c6c4a893c48c18ff0c6ce658c7c67056
> ---
> v3:
> - use BUILD_BUG()
>
> v2:
> - fix build with KASAN disabled
>
>  include/linux/kasan.h | 66 +++++++++++++++++++++++++++----------------
>  mm/kasan/common.c     |  4 +--
>  mm/kasan/hw_tags.c    | 14 +++++++++
>  mm/mempool.c          |  6 ++--
>  mm/page_alloc.c       | 56 +++++++++++++++++++-----------------
>  5 files changed, 91 insertions(+), 55 deletions(-)
>
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index b1678a61e6a7..38061673e6ac 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -2,6 +2,7 @@
>  #ifndef _LINUX_KASAN_H
>  #define _LINUX_KASAN_H
>
> +#include <linux/bug.h>
>  #include <linux/static_key.h>
>  #include <linux/types.h>
>
> @@ -79,14 +80,6 @@ static inline void kasan_disable_current(void) {}
>
>  #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
>
> -#ifdef CONFIG_KASAN
> -
> -struct kasan_cache {
> -       int alloc_meta_offset;
> -       int free_meta_offset;
> -       bool is_kmalloc;
> -};
> -
>  #ifdef CONFIG_KASAN_HW_TAGS
>
>  DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled);
> @@ -101,11 +94,18 @@ static inline bool kasan_has_integrated_init(void)
>         return kasan_enabled();
>  }
>
> +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
> +void kasan_free_pages(struct page *page, unsigned int order);
> +
>  #else /* CONFIG_KASAN_HW_TAGS */
>
>  static inline bool kasan_enabled(void)
>  {
> +#ifdef CONFIG_KASAN
>         return true;
> +#else
> +       return false;
> +#endif
>  }
>
>  static inline bool kasan_has_integrated_init(void)
> @@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
>         return false;
>  }
>
> +static __always_inline void kasan_alloc_pages(struct page *page,
> +                                             unsigned int order, gfp_t flags)
> +{
> +       /* Only available for integrated init. */
> +       BUILD_BUG();
> +}
> +
> +static __always_inline void kasan_free_pages(struct page *page,
> +                                            unsigned int order)
> +{
> +       /* Only available for integrated init. */
> +       BUILD_BUG();
> +}
> +
>  #endif /* CONFIG_KASAN_HW_TAGS */
>
> +#ifdef CONFIG_KASAN
> +
> +struct kasan_cache {
> +       int alloc_meta_offset;
> +       int free_meta_offset;
> +       bool is_kmalloc;
> +};
> +
>  slab_flags_t __kasan_never_merge(void);
>  static __always_inline slab_flags_t kasan_never_merge(void)
>  {
> @@ -130,20 +152,20 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
>                 __kasan_unpoison_range(addr, size);
>  }
>
> -void __kasan_alloc_pages(struct page *page, unsigned int order, bool init);
> -static __always_inline void kasan_alloc_pages(struct page *page,
> +void __kasan_poison_pages(struct page *page, unsigned int order, bool init);
> +static __always_inline void kasan_poison_pages(struct page *page,
>                                                 unsigned int order, bool init)
>  {
>         if (kasan_enabled())
> -               __kasan_alloc_pages(page, order, init);
> +               __kasan_poison_pages(page, order, init);
>  }
>
> -void __kasan_free_pages(struct page *page, unsigned int order, bool init);
> -static __always_inline void kasan_free_pages(struct page *page,
> -                                               unsigned int order, bool init)
> +void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init);
> +static __always_inline void kasan_unpoison_pages(struct page *page,
> +                                                unsigned int order, bool init)
>  {
>         if (kasan_enabled())
> -               __kasan_free_pages(page, order, init);
> +               __kasan_unpoison_pages(page, order, init);
>  }
>
>  void __kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
> @@ -285,21 +307,15 @@ void kasan_restore_multi_shot(bool enabled);
>
>  #else /* CONFIG_KASAN */
>
> -static inline bool kasan_enabled(void)
> -{
> -       return false;
> -}
> -static inline bool kasan_has_integrated_init(void)
> -{
> -       return false;
> -}
>  static inline slab_flags_t kasan_never_merge(void)
>  {
>         return 0;
>  }
>  static inline void kasan_unpoison_range(const void *address, size_t size) {}
> -static inline void kasan_alloc_pages(struct page *page, unsigned int order, bool init) {}
> -static inline void kasan_free_pages(struct page *page, unsigned int order, bool init) {}
> +static inline void kasan_poison_pages(struct page *page, unsigned int order,
> +                                     bool init) {}
> +static inline void kasan_unpoison_pages(struct page *page, unsigned int order,
> +                                       bool init) {}
>  static inline void kasan_cache_create(struct kmem_cache *cache,
>                                       unsigned int *size,
>                                       slab_flags_t *flags) {}
> diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> index 6bb87f2acd4e..0ecd293af344 100644
> --- a/mm/kasan/common.c
> +++ b/mm/kasan/common.c
> @@ -97,7 +97,7 @@ slab_flags_t __kasan_never_merge(void)
>         return 0;
>  }
>
> -void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
> +void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init)
>  {
>         u8 tag;
>         unsigned long i;
> @@ -111,7 +111,7 @@ void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
>         kasan_unpoison(page_address(page), PAGE_SIZE << order, init);
>  }
>
> -void __kasan_free_pages(struct page *page, unsigned int order, bool init)
> +void __kasan_poison_pages(struct page *page, unsigned int order, bool init)
>  {
>         if (likely(!PageHighMem(page)))
>                 kasan_poison(page_address(page), PAGE_SIZE << order,
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index 4004388b4e4b..45e552cb9172 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -238,6 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
>         return &alloc_meta->free_track[0];
>  }
>
> +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
> +{
> +       bool init = !want_init_on_free() && want_init_on_alloc(flags);

This check is now duplicated: one copy here and the same one in
page_alloc.c. Please either add a helper that gets used in both places, or
at least a comment noting that the two checks must be kept in sync.
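
One possible shape for such a helper, as a rough sketch (the name and
placement are mine, not something proposed in this series):

	/*
	 * Illustrative only: a single predicate that both kasan_alloc_pages()
	 * and post_alloc_hook() could call, so the two copies of this
	 * condition cannot drift apart.
	 */
	static inline bool want_init_on_page_alloc(gfp_t flags)
	{
		return !want_init_on_free() && want_init_on_alloc(flags);
	}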

> +
> +       kasan_unpoison_pages(page, order, init);
> +}
> +
> +void kasan_free_pages(struct page *page, unsigned int order)
> +{
> +       bool init = want_init_on_free();

Same here.

> +
> +       kasan_poison_pages(page, order, init);
> +}
> +
>  #if IS_ENABLED(CONFIG_KASAN_KUNIT_TEST)
>
>  void kasan_set_tagging_report_once(bool state)
> diff --git a/mm/mempool.c b/mm/mempool.c
> index a258cf4de575..0b8afbec3e35 100644
> --- a/mm/mempool.c
> +++ b/mm/mempool.c
> @@ -106,7 +106,8 @@ static __always_inline void kasan_poison_element(mempool_t *pool, void *element)
>         if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
>                 kasan_slab_free_mempool(element);
>         else if (pool->alloc == mempool_alloc_pages)
> -               kasan_free_pages(element, (unsigned long)pool->pool_data, false);
> +               kasan_poison_pages(element, (unsigned long)pool->pool_data,
> +                                  false);
>  }
>
>  static void kasan_unpoison_element(mempool_t *pool, void *element)
> @@ -114,7 +115,8 @@ static void kasan_unpoison_element(mempool_t *pool, void *element)
>         if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
>                 kasan_unpoison_range(element, __ksize(element));
>         else if (pool->alloc == mempool_alloc_pages)
> -               kasan_alloc_pages(element, (unsigned long)pool->pool_data, false);
> +               kasan_unpoison_pages(element, (unsigned long)pool->pool_data,
> +                                    false);
>  }
>
>  static __always_inline void add_element(mempool_t *pool, void *element)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aaa1655cf682..6e82a7f6fd6f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -382,7 +382,7 @@ int page_group_by_mobility_disabled __read_mostly;
>  static DEFINE_STATIC_KEY_TRUE(deferred_pages);
>
>  /*
> - * Calling kasan_free_pages() only after deferred memory initialization
> + * Calling kasan_poison_pages() only after deferred memory initialization
>   * has completed. Poisoning pages during deferred memory init will greatly
>   * lengthen the process and cause problem in large memory systems as the
>   * deferred pages initialization is done with interrupt disabled.
> @@ -394,15 +394,11 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
>   * on-demand allocation and then freed again before the deferred pages
>   * initialization is done, but this is not likely to happen.
>   */
> -static inline void kasan_free_nondeferred_pages(struct page *page, int order,
> -                                               bool init, fpi_t fpi_flags)
> +static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
>  {
> -       if (static_branch_unlikely(&deferred_pages))
> -               return;
> -       if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> -                       (fpi_flags & FPI_SKIP_KASAN_POISON))
> -               return;
> -       kasan_free_pages(page, order, init);
> +       return static_branch_unlikely(&deferred_pages) ||
> +              (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> +               (fpi_flags & FPI_SKIP_KASAN_POISON));
>  }
>
>  /* Returns true if the struct page for the pfn is uninitialised */
> @@ -453,13 +449,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
>         return false;
>  }
>  #else
> -static inline void kasan_free_nondeferred_pages(struct page *page, int order,
> -                                               bool init, fpi_t fpi_flags)
> +static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
>  {
> -       if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> -                       (fpi_flags & FPI_SKIP_KASAN_POISON))
> -               return;
> -       kasan_free_pages(page, order, init);
> +       return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> +               (fpi_flags & FPI_SKIP_KASAN_POISON));
>  }
>
>  static inline bool early_page_uninitialised(unsigned long pfn)
> @@ -1245,7 +1238,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
>                         unsigned int order, bool check_free, fpi_t fpi_flags)
>  {
>         int bad = 0;
> -       bool init;
> +       bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
>
>         VM_BUG_ON_PAGE(PageTail(page), page);
>
> @@ -1314,10 +1307,17 @@ static __always_inline bool free_pages_prepare(struct page *page,
>          * With hardware tag-based KASAN, memory tags must be set before the
>          * page becomes unavailable via debug_pagealloc or arch_free_page.
>          */
> -       init = want_init_on_free();
> -       if (init && !kasan_has_integrated_init())
> -               kernel_init_free_pages(page, 1 << order);
> -       kasan_free_nondeferred_pages(page, order, init, fpi_flags);
> +       if (kasan_has_integrated_init()) {

Is it guaranteed that this branch will be eliminated when
kasan_has_integrated_init() is a static inline that returns false? I know
this works with macros, but I don't remember seeing it done with static
inline functions. I assume it behaves the same, but I'm mentioning it just
in case, because the BUILD_BUG() stood out.
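
For reference, the pattern in question reduces to something like the sketch
below (illustrative only, not code from this series). After inlining, the
compiler should constant-fold the condition the same way it folds a macro,
and the error attribute behind BUILD_BUG() only fires if the call survives
that dead-code elimination:

	#include <linux/build_bug.h>
	#include <linux/types.h>

	static inline bool feature_enabled(void)
	{
		return false;		/* constant after inlining */
	}

	void caller(void)
	{
		if (feature_enabled())
			BUILD_BUG();	/* dead branch, eliminated at -O2 */
	}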

> +               if (!skip_kasan_poison)
> +                       kasan_free_pages(page, order);
> +       } else {
> +               bool init = want_init_on_free();
> +
> +               if (init)
> +                       kernel_init_free_pages(page, 1 << order);
> +               if (!skip_kasan_poison)
> +                       kasan_poison_pages(page, order, init);
> +       }
>
>         /*
>          * arch_free_page() can make the page's contents inaccessible.  s390
> @@ -2324,8 +2324,6 @@ static bool check_new_pages(struct page *page, unsigned int order)
>  inline void post_alloc_hook(struct page *page, unsigned int order,
>                                 gfp_t gfp_flags)
>  {
> -       bool init;
> -
>         set_page_private(page, 0);
>         set_page_refcounted(page);
>
> @@ -2344,10 +2342,16 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>          * kasan_alloc_pages and kernel_init_free_pages must be
>          * kept together to avoid discrepancies in behavior.
>          */
> -       init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
> -       kasan_alloc_pages(page, order, init);
> -       if (init && !kasan_has_integrated_init())
> -               kernel_init_free_pages(page, 1 << order);
> +       if (kasan_has_integrated_init()) {
> +               kasan_alloc_pages(page, order, gfp_flags);
> +       } else {
> +               bool init =
> +                       !want_init_on_free() && want_init_on_alloc(gfp_flags);
> +
> +               kasan_unpoison_pages(page, order, init);
> +               if (init)
> +                       kernel_init_free_pages(page, 1 << order);
> +       }
>
>         set_page_owner(page, order, gfp_flags);
>  }
> --
> 2.31.1.607.g51e8a6a459-goog
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/3] arm64: mte: handle tags zeroing at page allocation time
  2021-05-12 20:09   ` Peter Collingbourne
@ 2021-05-25 22:00     ` Andrey Konovalov
  -1 siblings, 0 replies; 28+ messages in thread
From: Andrey Konovalov @ 2021-05-25 22:00 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	linux-arm-kernel

On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
>
> Currently, on an anonymous page fault, the kernel allocates a zeroed
> page and maps it in user space. If the mapping is tagged (PROT_MTE),
> set_pte_at() additionally clears the tags. It is, however, more
> efficient to clear the tags at the same time as zeroing the data on
> allocation. To avoid clearing the tags on any page (which may not be
> mapped as tagged), only do this if the vma flags contain VM_MTE. This
> requires introducing a new GFP flag that is used to determine whether
> to clear the tags.
>
> The DC GZVA instruction with a 0 top byte (and 0 tag) requires
> top-byte-ignore. Set the TCR_EL1.{TBI1,TBID1} bits irrespective of
> whether KASAN_HW is enabled.
>
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Link: https://linux-review.googlesource.com/id/Id46dc94e30fe11474f7e54f5d65e7658dbdddb26
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> v2:
> - remove want_zero_tags_on_free()
>
>  arch/arm64/include/asm/mte.h  |  4 ++++
>  arch/arm64/include/asm/page.h |  9 +++++++--
>  arch/arm64/lib/mte.S          | 20 ++++++++++++++++++++
>  arch/arm64/mm/fault.c         | 25 +++++++++++++++++++++++++
>  arch/arm64/mm/proc.S          | 10 +++++++---
>  include/linux/gfp.h           |  9 +++++++--
>  include/linux/highmem.h       |  8 ++++++++
>  mm/kasan/hw_tags.c            |  9 ++++++++-
>  mm/page_alloc.c               | 13 ++++++++++---
>  9 files changed, 96 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index bc88a1ced0d7..67bf259ae768 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -37,6 +37,7 @@ void mte_free_tag_storage(char *storage);
>  /* track which pages have valid allocation tags */
>  #define PG_mte_tagged  PG_arch_2
>
> +void mte_zero_clear_page_tags(void *addr);
>  void mte_sync_tags(pte_t *ptep, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
>  void mte_thread_init_user(void);
> @@ -53,6 +54,9 @@ int mte_ptrace_copy_tags(struct task_struct *child, long request,
>  /* unused if !CONFIG_ARM64_MTE, silence the compiler */
>  #define PG_mte_tagged  0
>
> +static inline void mte_zero_clear_page_tags(void *addr)
> +{
> +}
>  static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
>  {
>  }
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..448e14071d13 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -13,6 +13,7 @@
>  #ifndef __ASSEMBLY__
>
>  #include <linux/personality.h> /* for READ_IMPLIES_EXEC */
> +#include <linux/types.h> /* for gfp_t */
>  #include <asm/pgtable-types.h>
>
>  struct page;
> @@ -28,10 +29,14 @@ void copy_user_highpage(struct page *to, struct page *from,
>  void copy_highpage(struct page *to, struct page *from);
>  #define __HAVE_ARCH_COPY_HIGHPAGE
>
> -#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
> -       alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
> +struct page *__alloc_zeroed_user_highpage(gfp_t movableflags,
> +                                         struct vm_area_struct *vma,
> +                                         unsigned long vaddr);
>  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
>
> +void tag_clear_highpage(struct page *to);
> +#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> +
>  #define clear_user_page(page, vaddr, pg)       clear_page(page)
>  #define copy_user_page(to, from, vaddr, pg)    copy_page(to, from)
>
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 351537c12f36..e83643b3995f 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -36,6 +36,26 @@ SYM_FUNC_START(mte_clear_page_tags)
>         ret
>  SYM_FUNC_END(mte_clear_page_tags)
>
> +/*
> + * Zero the page and tags at the same time
> + *
> + * Parameters:
> + *     x0 - address to the beginning of the page
> + */
> +SYM_FUNC_START(mte_zero_clear_page_tags)
> +       mrs     x1, dczid_el0
> +       and     w1, w1, #0xf
> +       mov     x2, #4
> +       lsl     x1, x2, x1
> +       and     x0, x0, #(1 << MTE_TAG_SHIFT) - 1       // clear the tag
> +
> +1:     dc      gzva, x0
> +       add     x0, x0, x1
> +       tst     x0, #(PAGE_SIZE - 1)
> +       b.ne    1b
> +       ret
> +SYM_FUNC_END(mte_zero_clear_page_tags)
> +
>  /*
>   * Copy the tags from the source page to the destination one
>   *   x0 - address of the destination page
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 871c82ab0a30..8127e0c0b8fb 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -921,3 +921,28 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
>         debug_exception_exit(regs);
>  }
>  NOKPROBE_SYMBOL(do_debug_exception);
> +
> +/*
> + * Used during anonymous page fault handling.
> + */
> +struct page *__alloc_zeroed_user_highpage(gfp_t flags,
> +                                         struct vm_area_struct *vma,
> +                                         unsigned long vaddr)
> +{
> +       /*
> +        * If the page is mapped with PROT_MTE, initialise the tags at the
> +        * point of allocation and page zeroing as this is usually faster than
> +        * separate DC ZVA and STGM.
> +        */
> +       if (vma->vm_flags & VM_MTE)
> +               flags |= __GFP_ZEROTAGS;
> +
> +       return alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | flags, vma, vaddr);
> +}
> +
> +void tag_clear_highpage(struct page *page)
> +{
> +       mte_zero_clear_page_tags(page_address(page));
> +       page_kasan_tag_reset(page);
> +       set_bit(PG_mte_tagged, &page->flags);
> +}
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 0a48191534ff..a27c77dbe91c 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -46,9 +46,13 @@
>  #endif
>
>  #ifdef CONFIG_KASAN_HW_TAGS
> -#define TCR_KASAN_HW_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
> +#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
>  #else
> -#define TCR_KASAN_HW_FLAGS 0
> +/*
> + * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
> + * TBI being enabled at EL1.
> + */
> +#define TCR_MTE_FLAGS TCR_TBI1 | TCR_TBID1
>  #endif
>
>  /*
> @@ -452,7 +456,7 @@ SYM_FUNC_START(__cpu_setup)
>         msr_s   SYS_TFSRE0_EL1, xzr
>
>         /* set the TCR_EL1 bits */
> -       mov_q   x10, TCR_KASAN_HW_FLAGS
> +       mov_q   x10, TCR_MTE_FLAGS
>         orr     tcr, tcr, x10
>  1:
>  #endif
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 11da8af06704..68ba237365dc 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -53,8 +53,9 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL                0x100000u
>  #define ___GFP_THISNODE                0x200000u
>  #define ___GFP_ACCOUNT         0x400000u
> +#define ___GFP_ZEROTAGS                0x800000u
>  #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP       0x800000u
> +#define ___GFP_NOLOCKDEP       0x1000000u
>  #else
>  #define ___GFP_NOLOCKDEP       0
>  #endif
> @@ -229,16 +230,20 @@ struct vm_area_struct;
>   * %__GFP_COMP address compound page metadata.
>   *
>   * %__GFP_ZERO returns a zeroed page on success.
> + *
> + * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
> + * __GFP_ZERO is set.
>   */
>  #define __GFP_NOWARN   ((__force gfp_t)___GFP_NOWARN)
>  #define __GFP_COMP     ((__force gfp_t)___GFP_COMP)
>  #define __GFP_ZERO     ((__force gfp_t)___GFP_ZERO)
> +#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
>
>  /* Disable lockdep for GFP context tracking */
>  #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>
>  /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>
>  /**
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 832b49b50c7b..caaa62e1dd24 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -204,6 +204,14 @@ static inline void clear_highpage(struct page *page)
>         kunmap_atomic(kaddr);
>  }
>
> +#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> +
> +static inline void tag_clear_highpage(struct page *page)
> +{
> +}
> +
> +#endif
> +
>  /*
>   * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
>   * If we pass in a head page, we can zero up to the size of the compound page.
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index 45e552cb9172..34362c8d0955 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -242,7 +242,14 @@ void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
>  {
>         bool init = !want_init_on_free() && want_init_on_alloc(flags);
>
> -       kasan_unpoison_pages(page, order, init);
> +       if (flags & __GFP_ZEROTAGS) {
> +               int i;
> +
> +               for (i = 0; i != 1 << order; ++i)
> +                       tag_clear_highpage(page + i);
> +       } else {
> +               kasan_unpoison_pages(page, order, init);
> +       }
>  }
>
>  void kasan_free_pages(struct page *page, unsigned int order)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6e82a7f6fd6f..24e6f668ef73 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1219,10 +1219,16 @@ static int free_tail_pages_check(struct page *head_page, struct page *page)
>         return ret;
>  }
>
> -static void kernel_init_free_pages(struct page *page, int numpages)
> +static void kernel_init_free_pages(struct page *page, int numpages, bool zero_tags)
>  {
>         int i;
>
> +       if (zero_tags) {
> +               for (i = 0; i < numpages; i++)
> +                       tag_clear_highpage(page + i);
> +               return;
> +       }
> +
>         /* s390's use of memset() could override KASAN redzones. */
>         kasan_disable_current();
>         for (i = 0; i < numpages; i++) {
> @@ -1314,7 +1320,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
>                 bool init = want_init_on_free();
>
>                 if (init)
> -                       kernel_init_free_pages(page, 1 << order);
> +                       kernel_init_free_pages(page, 1 << order, false);
>                 if (!skip_kasan_poison)
>                         kasan_poison_pages(page, order, init);
>         }
> @@ -2350,7 +2356,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>
>                 kasan_unpoison_pages(page, order, init);
>                 if (init)
> -                       kernel_init_free_pages(page, 1 << order);
> +                       kernel_init_free_pages(page, 1 << order,
> +                                              gfp_flags & __GFP_ZEROTAGS);
>         }
>
>         set_page_owner(page, order, gfp_flags);
> --
> 2.31.1.607.g51e8a6a459-goog
>

For KASAN parts:

Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/3] arm64: mte: handle tags zeroing at page allocation time
@ 2021-05-25 22:00     ` Andrey Konovalov
  0 siblings, 0 replies; 28+ messages in thread
From: Andrey Konovalov @ 2021-05-25 22:00 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	linux-arm-kernel

On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
>
> Currently, on an anonymous page fault, the kernel allocates a zeroed
> page and maps it in user space. If the mapping is tagged (PROT_MTE),
> set_pte_at() additionally clears the tags. It is, however, more
> efficient to clear the tags at the same time as zeroing the data on
> allocation. To avoid clearing the tags on any page (which may not be
> mapped as tagged), only do this if the vma flags contain VM_MTE. This
> requires introducing a new GFP flag that is used to determine whether
> to clear the tags.
>
> The DC GZVA instruction with a 0 top byte (and 0 tag) requires
> top-byte-ignore. Set the TCR_EL1.{TBI1,TBID1} bits irrespective of
> whether KASAN_HW is enabled.
>
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Link: https://linux-review.googlesource.com/id/Id46dc94e30fe11474f7e54f5d65e7658dbdddb26
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> v2:
> - remove want_zero_tags_on_free()
>
>  arch/arm64/include/asm/mte.h  |  4 ++++
>  arch/arm64/include/asm/page.h |  9 +++++++--
>  arch/arm64/lib/mte.S          | 20 ++++++++++++++++++++
>  arch/arm64/mm/fault.c         | 25 +++++++++++++++++++++++++
>  arch/arm64/mm/proc.S          | 10 +++++++---
>  include/linux/gfp.h           |  9 +++++++--
>  include/linux/highmem.h       |  8 ++++++++
>  mm/kasan/hw_tags.c            |  9 ++++++++-
>  mm/page_alloc.c               | 13 ++++++++++---
>  9 files changed, 96 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index bc88a1ced0d7..67bf259ae768 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -37,6 +37,7 @@ void mte_free_tag_storage(char *storage);
>  /* track which pages have valid allocation tags */
>  #define PG_mte_tagged  PG_arch_2
>
> +void mte_zero_clear_page_tags(void *addr);
>  void mte_sync_tags(pte_t *ptep, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
>  void mte_thread_init_user(void);
> @@ -53,6 +54,9 @@ int mte_ptrace_copy_tags(struct task_struct *child, long request,
>  /* unused if !CONFIG_ARM64_MTE, silence the compiler */
>  #define PG_mte_tagged  0
>
> +static inline void mte_zero_clear_page_tags(void *addr)
> +{
> +}
>  static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
>  {
>  }
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 012cffc574e8..448e14071d13 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -13,6 +13,7 @@
>  #ifndef __ASSEMBLY__
>
>  #include <linux/personality.h> /* for READ_IMPLIES_EXEC */
> +#include <linux/types.h> /* for gfp_t */
>  #include <asm/pgtable-types.h>
>
>  struct page;
> @@ -28,10 +29,14 @@ void copy_user_highpage(struct page *to, struct page *from,
>  void copy_highpage(struct page *to, struct page *from);
>  #define __HAVE_ARCH_COPY_HIGHPAGE
>
> -#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
> -       alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
> +struct page *__alloc_zeroed_user_highpage(gfp_t movableflags,
> +                                         struct vm_area_struct *vma,
> +                                         unsigned long vaddr);
>  #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
>
> +void tag_clear_highpage(struct page *to);
> +#define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> +
>  #define clear_user_page(page, vaddr, pg)       clear_page(page)
>  #define copy_user_page(to, from, vaddr, pg)    copy_page(to, from)
>
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 351537c12f36..e83643b3995f 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -36,6 +36,26 @@ SYM_FUNC_START(mte_clear_page_tags)
>         ret
>  SYM_FUNC_END(mte_clear_page_tags)
>
> +/*
> + * Zero the page and tags at the same time
> + *
> + * Parameters:
> + *     x0 - address to the beginning of the page
> + */
> +SYM_FUNC_START(mte_zero_clear_page_tags)
> +       mrs     x1, dczid_el0
> +       and     w1, w1, #0xf
> +       mov     x2, #4
> +       lsl     x1, x2, x1
> +       and     x0, x0, #(1 << MTE_TAG_SHIFT) - 1       // clear the tag
> +
> +1:     dc      gzva, x0
> +       add     x0, x0, x1
> +       tst     x0, #(PAGE_SIZE - 1)
> +       b.ne    1b
> +       ret
> +SYM_FUNC_END(mte_zero_clear_page_tags)
> +
>  /*
>   * Copy the tags from the source page to the destination one
>   *   x0 - address of the destination page
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 871c82ab0a30..8127e0c0b8fb 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -921,3 +921,28 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
>         debug_exception_exit(regs);
>  }
>  NOKPROBE_SYMBOL(do_debug_exception);
> +
> +/*
> + * Used during anonymous page fault handling.
> + */
> +struct page *__alloc_zeroed_user_highpage(gfp_t flags,
> +                                         struct vm_area_struct *vma,
> +                                         unsigned long vaddr)
> +{
> +       /*
> +        * If the page is mapped with PROT_MTE, initialise the tags at the
> +        * point of allocation and page zeroing as this is usually faster than
> +        * separate DC ZVA and STGM.
> +        */
> +       if (vma->vm_flags & VM_MTE)
> +               flags |= __GFP_ZEROTAGS;
> +
> +       return alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | flags, vma, vaddr);
> +}
> +
> +void tag_clear_highpage(struct page *page)
> +{
> +       mte_zero_clear_page_tags(page_address(page));
> +       page_kasan_tag_reset(page);
> +       set_bit(PG_mte_tagged, &page->flags);
> +}
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 0a48191534ff..a27c77dbe91c 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -46,9 +46,13 @@
>  #endif
>
>  #ifdef CONFIG_KASAN_HW_TAGS
> -#define TCR_KASAN_HW_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
> +#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
>  #else
> -#define TCR_KASAN_HW_FLAGS 0
> +/*
> + * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
> + * TBI being enabled at EL1.
> + */
> +#define TCR_MTE_FLAGS TCR_TBI1 | TCR_TBID1
>  #endif
>
>  /*
> @@ -452,7 +456,7 @@ SYM_FUNC_START(__cpu_setup)
>         msr_s   SYS_TFSRE0_EL1, xzr
>
>         /* set the TCR_EL1 bits */
> -       mov_q   x10, TCR_KASAN_HW_FLAGS
> +       mov_q   x10, TCR_MTE_FLAGS
>         orr     tcr, tcr, x10
>  1:
>  #endif
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 11da8af06704..68ba237365dc 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -53,8 +53,9 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL                0x100000u
>  #define ___GFP_THISNODE                0x200000u
>  #define ___GFP_ACCOUNT         0x400000u
> +#define ___GFP_ZEROTAGS                0x800000u
>  #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP       0x800000u
> +#define ___GFP_NOLOCKDEP       0x1000000u
>  #else
>  #define ___GFP_NOLOCKDEP       0
>  #endif
> @@ -229,16 +230,20 @@ struct vm_area_struct;
>   * %__GFP_COMP address compound page metadata.
>   *
>   * %__GFP_ZERO returns a zeroed page on success.
> + *
> + * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
> + * __GFP_ZERO is set.
>   */
>  #define __GFP_NOWARN   ((__force gfp_t)___GFP_NOWARN)
>  #define __GFP_COMP     ((__force gfp_t)___GFP_COMP)
>  #define __GFP_ZERO     ((__force gfp_t)___GFP_ZERO)
> +#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
>
>  /* Disable lockdep for GFP context tracking */
>  #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>
>  /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>
>  /**
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 832b49b50c7b..caaa62e1dd24 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -204,6 +204,14 @@ static inline void clear_highpage(struct page *page)
>         kunmap_atomic(kaddr);
>  }
>
> +#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
> +
> +static inline void tag_clear_highpage(struct page *page)
> +{
> +}
> +
> +#endif
> +
>  /*
>   * If we pass in a base or tail page, we can zero up to PAGE_SIZE.
>   * If we pass in a head page, we can zero up to the size of the compound page.
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index 45e552cb9172..34362c8d0955 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -242,7 +242,14 @@ void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
>  {
>         bool init = !want_init_on_free() && want_init_on_alloc(flags);
>
> -       kasan_unpoison_pages(page, order, init);
> +       if (flags & __GFP_ZEROTAGS) {
> +               int i;
> +
> +               for (i = 0; i != 1 << order; ++i)
> +                       tag_clear_highpage(page + i);
> +       } else {
> +               kasan_unpoison_pages(page, order, init);
> +       }
>  }
>
>  void kasan_free_pages(struct page *page, unsigned int order)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6e82a7f6fd6f..24e6f668ef73 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1219,10 +1219,16 @@ static int free_tail_pages_check(struct page *head_page, struct page *page)
>         return ret;
>  }
>
> -static void kernel_init_free_pages(struct page *page, int numpages)
> +static void kernel_init_free_pages(struct page *page, int numpages, bool zero_tags)
>  {
>         int i;
>
> +       if (zero_tags) {
> +               for (i = 0; i < numpages; i++)
> +                       tag_clear_highpage(page + i);
> +               return;
> +       }
> +
>         /* s390's use of memset() could override KASAN redzones. */
>         kasan_disable_current();
>         for (i = 0; i < numpages; i++) {
> @@ -1314,7 +1320,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
>                 bool init = want_init_on_free();
>
>                 if (init)
> -                       kernel_init_free_pages(page, 1 << order);
> +                       kernel_init_free_pages(page, 1 << order, false);
>                 if (!skip_kasan_poison)
>                         kasan_poison_pages(page, order, init);
>         }
> @@ -2350,7 +2356,8 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>
>                 kasan_unpoison_pages(page, order, init);
>                 if (init)
> -                       kernel_init_free_pages(page, 1 << order);
> +                       kernel_init_free_pages(page, 1 << order,
> +                                              gfp_flags & __GFP_ZEROTAGS);
>         }
>
>         set_page_owner(page, order, gfp_flags);
> --
> 2.31.1.607.g51e8a6a459-goog
>

For KASAN parts:

Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags
  2021-05-12 20:09   ` Peter Collingbourne
@ 2021-05-25 22:06     ` Andrey Konovalov
  -1 siblings, 0 replies; 28+ messages in thread
From: Andrey Konovalov @ 2021-05-25 22:06 UTC (permalink / raw)
  To: Peter Collingbourne, Jann Horn
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	linux-arm-kernel

+Jann

On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
>
> Poisoning freed pages protects against kernel use-after-free. The
> likelihood of such a bug involving kernel pages is significantly higher
> than that for user pages. At the same time, poisoning freed pages can
> impose a significant performance cost, which cannot always be justified
> for user pages given the lower probability of finding a bug. Therefore,
> make it possible to configure the kernel to disable freed user page
> poisoning when using HW tags via the new kasan.skip_user_poison_on_free
> command line option.

So the potential scenario that would be undetectable with
kasan.skip_user_poison_on_free enabled is: 1) kernel allocates a user
page and maps it for userspace, 2) the page gets freed in the kernel,
3) kernel accesses the page leading to a use-after-free. Is this
correct?
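
To make it concrete, roughly this (pseudo-code, the call sites are only
illustrative):

	/* 1) allocate a page and map it for userspace */
	page = alloc_page(GFP_HIGHUSER | __GFP_SKIP_KASAN_POISON);
	vm_insert_page(vma, uaddr, page);

	/* 2) the page gets freed in the kernel, tags are not re-poisoned */
	__free_page(page);

	/* 3) a later kernel access through a stale pointer goes undetected */
	memset(page_address(page), 0, PAGE_SIZE);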

If bugs involving use-after-free accesses on user pages are extremely
rare, perhaps we could just change the default and avoid adding a
command line switch.

Jann, maybe you have an idea of how common something like this is or
have other inputs?

Peter, is the plan to have this poisoning disabled in production? Is
there an estimate of how slow this is?



> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/I716846e2de8ef179f44e835770df7e6307be96c9
> ---
>  include/linux/gfp.h            | 13 ++++++++++---
>  include/linux/page-flags.h     |  9 +++++++++
>  include/trace/events/mmflags.h |  9 ++++++++-
>  mm/kasan/hw_tags.c             | 10 ++++++++++
>  mm/page_alloc.c                | 12 +++++++-----
>  5 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 68ba237365dc..9a77e5660b07 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -54,8 +54,9 @@ struct vm_area_struct;
>  #define ___GFP_THISNODE                0x200000u
>  #define ___GFP_ACCOUNT         0x400000u
>  #define ___GFP_ZEROTAGS                0x800000u
> +#define ___GFP_SKIP_KASAN_POISON       0x1000000u
>  #ifdef CONFIG_LOCKDEP
> -#define ___GFP_NOLOCKDEP       0x1000000u
> +#define ___GFP_NOLOCKDEP       0x2000000u
>  #else
>  #define ___GFP_NOLOCKDEP       0
>  #endif
> @@ -233,17 +234,22 @@ struct vm_area_struct;
>   *
>   * %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
>   * __GFP_ZERO is set.
> + *
> + * %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
> + * on deallocation. Typically used for userspace pages. Currently only has an
> + * effect in HW tags mode, and only if a command line option is set.
>   */
>  #define __GFP_NOWARN   ((__force gfp_t)___GFP_NOWARN)
>  #define __GFP_COMP     ((__force gfp_t)___GFP_COMP)
>  #define __GFP_ZERO     ((__force gfp_t)___GFP_ZERO)
>  #define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
> +#define __GFP_SKIP_KASAN_POISON        ((__force gfp_t)___GFP_SKIP_KASAN_POISON)
>
>  /* Disable lockdep for GFP context tracking */
>  #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)
>
>  /* Room for N __GFP_FOO bits */
> -#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))
> +#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
>
>  /**
> @@ -320,7 +326,8 @@ struct vm_area_struct;
>  #define GFP_NOWAIT     (__GFP_KSWAPD_RECLAIM)
>  #define GFP_NOIO       (__GFP_RECLAIM)
>  #define GFP_NOFS       (__GFP_RECLAIM | __GFP_IO)
> -#define GFP_USER       (__GFP_RECLAIM | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
> +#define GFP_USER       (__GFP_RECLAIM | __GFP_IO | __GFP_FS | \
> +                        __GFP_HARDWALL | __GFP_SKIP_KASAN_POISON)
>  #define GFP_DMA                __GFP_DMA
>  #define GFP_DMA32      __GFP_DMA32
>  #define GFP_HIGHUSER   (GFP_USER | __GFP_HIGHMEM)
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 04a34c08e0a6..40e2c5000585 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -137,6 +137,9 @@ enum pageflags {
>  #endif
>  #ifdef CONFIG_64BIT
>         PG_arch_2,
> +#endif
> +#ifdef CONFIG_KASAN_HW_TAGS
> +       PG_skip_kasan_poison,
>  #endif
>         __NR_PAGEFLAGS,
>
> @@ -443,6 +446,12 @@ TESTCLEARFLAG(Young, young, PF_ANY)
>  PAGEFLAG(Idle, idle, PF_ANY)
>  #endif
>
> +#ifdef CONFIG_KASAN_HW_TAGS
> +PAGEFLAG(SkipKASanPoison, skip_kasan_poison, PF_HEAD)
> +#else
> +PAGEFLAG_FALSE(SkipKASanPoison)
> +#endif
> +
>  /*
>   * PageReported() is used to track reported free pages within the Buddy
>   * allocator. We can use the non-atomic version of the test and set
> diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
> index 629c7a0eaff2..390270e00a1d 100644
> --- a/include/trace/events/mmflags.h
> +++ b/include/trace/events/mmflags.h
> @@ -85,6 +85,12 @@
>  #define IF_HAVE_PG_ARCH_2(flag,string)
>  #endif
>
> +#ifdef CONFIG_KASAN_HW_TAGS
> +#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string) ,{1UL << flag, string}
> +#else
> +#define IF_HAVE_PG_SKIP_KASAN_POISON(flag,string)
> +#endif
> +
>  #define __def_pageflag_names                                           \
>         {1UL << PG_locked,              "locked"        },              \
>         {1UL << PG_waiters,             "waiters"       },              \
> @@ -112,7 +118,8 @@ IF_HAVE_PG_UNCACHED(PG_uncached,    "uncached"      )               \
>  IF_HAVE_PG_HWPOISON(PG_hwpoison,       "hwpoison"      )               \
>  IF_HAVE_PG_IDLE(PG_young,              "young"         )               \
>  IF_HAVE_PG_IDLE(PG_idle,               "idle"          )               \
> -IF_HAVE_PG_ARCH_2(PG_arch_2,           "arch_2"        )
> +IF_HAVE_PG_ARCH_2(PG_arch_2,           "arch_2"        )               \
> +IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison")
>
>  #define show_page_flags(flags)                                         \
>         (flags) ? __print_flags(flags, "|",                             \
> diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> index 34362c8d0955..954d5c2f7683 100644
> --- a/mm/kasan/hw_tags.c
> +++ b/mm/kasan/hw_tags.c
> @@ -238,10 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
>         return &alloc_meta->free_track[0];
>  }
>
> +static bool skip_user_poison_on_free;
> +static int __init skip_user_poison_on_free_param(char *buf)
> +{
> +       return kstrtobool(buf, &skip_user_poison_on_free);
> +}
> +early_param("kasan.skip_user_poison_on_free", skip_user_poison_on_free_param);
> +
>  void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
>  {
>         bool init = !want_init_on_free() && want_init_on_alloc(flags);
>
> +       if (skip_user_poison_on_free && (flags & __GFP_SKIP_KASAN_POISON))
> +               SetPageSkipKASanPoison(page);
> +
>         if (flags & __GFP_ZEROTAGS) {
>                 int i;
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 24e6f668ef73..2c3ac15ddd54 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -394,11 +394,12 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
>   * on-demand allocation and then freed again before the deferred pages
>   * initialization is done, but this is not likely to happen.
>   */
> -static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
> +static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
>  {
>         return static_branch_unlikely(&deferred_pages) ||
>                (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> -               (fpi_flags & FPI_SKIP_KASAN_POISON));
> +               (fpi_flags & FPI_SKIP_KASAN_POISON)) ||
> +              PageSkipKASanPoison(page);
>  }
>
>  /* Returns true if the struct page for the pfn is uninitialised */
> @@ -449,10 +450,11 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
>         return false;
>  }
>  #else
> -static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
> +static inline bool should_skip_kasan_poison(struct page *page, fpi_t fpi_flags)
>  {
>         return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> -               (fpi_flags & FPI_SKIP_KASAN_POISON));
> +               (fpi_flags & FPI_SKIP_KASAN_POISON)) ||
> +              PageSkipKASanPoison(page);
>  }
>
>  static inline bool early_page_uninitialised(unsigned long pfn)
> @@ -1244,7 +1246,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
>                         unsigned int order, bool check_free, fpi_t fpi_flags)
>  {
>         int bad = 0;
> -       bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
> +       bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
>
>         VM_BUG_ON_PAGE(PageTail(page), page);
>
> --
> 2.31.1.607.g51e8a6a459-goog
>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-12 20:09   ` Peter Collingbourne
@ 2021-05-26 10:12     ` Marco Elver
  -1 siblings, 0 replies; 28+ messages in thread
From: Marco Elver @ 2021-05-26 10:12 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton, Evgenii Stepanov, linux-mm,
	linux-arm-kernel, kasan-dev

On Wed, May 12, 2021 at 01:09PM -0700, Peter Collingbourne wrote:
[...] 
> +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
> +void kasan_free_pages(struct page *page, unsigned int order);
> +
>  #else /* CONFIG_KASAN_HW_TAGS */
>  
>  static inline bool kasan_enabled(void)
>  {
> +#ifdef CONFIG_KASAN
>  	return true;
> +#else
> +	return false;
> +#endif
>  }

Just

	return IS_ENABLED(CONFIG_KASAN);

>  static inline bool kasan_has_integrated_init(void)
> @@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
>  	return false;
>  }
>  
> +static __always_inline void kasan_alloc_pages(struct page *page,
> +					      unsigned int order, gfp_t flags)
> +{
> +	/* Only available for integrated init. */
> +	BUILD_BUG();
> +}
> +
> +static __always_inline void kasan_free_pages(struct page *page,
> +					     unsigned int order)
> +{
> +	/* Only available for integrated init. */
> +	BUILD_BUG();
> +}

This *should* always work, as long as the compiler optimizes everything
like we expect.

But: In this case, I think this is a sign that the interface design can be
improved. Can we just make kasan_{alloc,free}_pages() return a 'bool
__must_check' to indicate if kasan takes care of init?

The variants here would simply return kasan_has_integrated_init().

That way, there'd be no need for the BUILD_BUG()s and the interface
becomes harder to misuse by design.

Also, given that kasan_{alloc,free}_pages() initializes memory, this is
an opportunity to just give them a better name. Perhaps

	/* Returns true if KASAN took care of initialization, false otherwise. */
	bool __must_check kasan_alloc_pages_try_init(struct page *page, unsigned int order, gfp_t flags);
	bool __must_check kasan_free_pages_try_init(struct page *page, unsigned int order);

[...]
> -	init = want_init_on_free();
> -	if (init && !kasan_has_integrated_init())
> -		kernel_init_free_pages(page, 1 << order);
> -	kasan_free_nondeferred_pages(page, order, init, fpi_flags);
> +	if (kasan_has_integrated_init()) {
> +		if (!skip_kasan_poison)
> +			kasan_free_pages(page, order);

I think kasan_free_pages() could return a bool, and this would become

	if (skip_kasan_poison || !kasan_free_pages(...)) {
		...

> +	} else {
> +		bool init = want_init_on_free();
> +
> +		if (init)
> +			kernel_init_free_pages(page, 1 << order);
> +		if (!skip_kasan_poison)
> +			kasan_poison_pages(page, order, init);
> +	}
>  
>  	/*
>  	 * arch_free_page() can make the page's contents inaccessible.  s390
> @@ -2324,8 +2324,6 @@ static bool check_new_pages(struct page *page, unsigned int order)
>  inline void post_alloc_hook(struct page *page, unsigned int order,
>  				gfp_t gfp_flags)
>  {
> -	bool init;
> -
>  	set_page_private(page, 0);
>  	set_page_refcounted(page);
>  
> @@ -2344,10 +2342,16 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
>  	 * kasan_alloc_pages and kernel_init_free_pages must be
>  	 * kept together to avoid discrepancies in behavior.
>  	 */
> -	init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
> -	kasan_alloc_pages(page, order, init);
> -	if (init && !kasan_has_integrated_init())
> -		kernel_init_free_pages(page, 1 << order);
> +	if (kasan_has_integrated_init()) {
> +		kasan_alloc_pages(page, order, gfp_flags);

It looks to me that kasan_alloc_pages() could return a bool, and this
would become

	if (!kasan_alloc_pages(...)) {
		...

> +	} else {
> +		bool init =
> +			!want_init_on_free() && want_init_on_alloc(gfp_flags);
> +

[ No need for line-break (for cases like this the kernel is fine with up
to 100 cols if it improves readability). ]

> +		kasan_unpoison_pages(page, order, init);
> +		if (init)
> +			kernel_init_free_pages(page, 1 << order);
> +	}

Thoughts?

Thanks,
-- Marco


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags
  2021-05-25 22:06     ` Andrey Konovalov
@ 2021-05-26 10:45       ` Jann Horn
  -1 siblings, 0 replies; 28+ messages in thread
From: Jann Horn @ 2021-05-26 10:45 UTC (permalink / raw)
  To: Andrey Konovalov, Peter Collingbourne
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	Linux ARM

On Wed, May 26, 2021 at 12:07 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
> On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
> > Poisoning freed pages protects against kernel use-after-free. The
> > likelihood of such a bug involving kernel pages is significantly higher
> > than that for user pages. At the same time, poisoning freed pages can
> > impose a significant performance cost, which cannot always be justified
> > for user pages given the lower probability of finding a bug. Therefore,
> > make it possible to configure the kernel to disable freed user page
> > poisoning when using HW tags via the new kasan.skip_user_poison_on_free
> > command line option.
>
> So the potential scenario that would be undetectable with
> kasan.skip_user_poison_on_free enabled is: 1) kernel allocates a user
> page and maps it for userspace, 2) the page gets freed in the kernel,
> 3) kernel accesses the page leading to a use-after-free. Is this
> correct?
>
> If bugs involving use-after-free accesses on user pages are extremely
> rare, perhaps we could just change the default and avoid adding a
> command line switch.
>
> Jann, maybe you have an idea of how common something like this is or
> have other inputs?

GFP_USER is kind of a squishy concept, and if you grep around for it
in the kernel tree, you can see it being used for all kinds of things
- including SKBs in some weird ISDN driver, various types of BPF
allocations, and so on. It's probably the wrong flag to hook if you
want something that means "these pages will mostly be accessed from
userspace".

My guess is that what pcc@ is actually interested in are probably
mainly anonymous pages, and to a lesser degree also page cache pages?
Those use the more specific GFP_HIGHUSER_MOVABLE (which indicates that
the kernel will usually not be holding any direct pointers to the page
outside of rmap/pagecache logic, and that any kernel access to the
pages will be using the kmap API).
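
If that's the case, a rough (untested) sketch would be to move the new
flag from GFP_USER to GFP_HIGHUSER_MOVABLE instead, i.e. something like:

	#define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE | \
					 __GFP_SKIP_KASAN_POISON)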

It's probably safe to assume that the majority of kernel bugs won't
directly involve GFP_HIGHUSER_MOVABLE memory - that's probably mostly
only going to happen if there are bugs in code that grabs pages with
get_user_pages* and then kmap()s them, or if there's something broken
in the pipe logic, or maybe an OOB issue in filesystem parsing code
(?), or something like that.
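
E.g. the sort of (simplified, hypothetical) buggy pattern I have in
mind:

	struct page *page;
	void *kaddr;

	if (get_user_pages_fast(uaddr, 1, FOLL_WRITE, &page) != 1)
		return -EFAULT;
	kaddr = kmap(page);
	put_page(page);			/* bug: reference dropped too early */
	memcpy(kaddr, data, len);	/* page may already be freed and reused */
	kunmap(page);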


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-26 10:12     ` Marco Elver
@ 2021-05-26 19:27       ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-26 19:27 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton, Evgenii Stepanov,
	Linux Memory Management List, Linux ARM, kasan-dev

On Wed, May 26, 2021 at 3:12 AM Marco Elver <elver@google.com> wrote:
>
> On Wed, May 12, 2021 at 01:09PM -0700, Peter Collingbourne wrote:
> [...]
> > +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
> > +void kasan_free_pages(struct page *page, unsigned int order);
> > +
> >  #else /* CONFIG_KASAN_HW_TAGS */
> >
> >  static inline bool kasan_enabled(void)
> >  {
> > +#ifdef CONFIG_KASAN
> >       return true;
> > +#else
> > +     return false;
> > +#endif
> >  }
>
> Just
>
>         return IS_ENABLED(CONFIG_KASAN);

Will do.

> >  static inline bool kasan_has_integrated_init(void)
> > @@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
> >       return false;
> >  }
> >
> > +static __always_inline void kasan_alloc_pages(struct page *page,
> > +                                           unsigned int order, gfp_t flags)
> > +{
> > +     /* Only available for integrated init. */
> > +     BUILD_BUG();
> > +}
> > +
> > +static __always_inline void kasan_free_pages(struct page *page,
> > +                                          unsigned int order)
> > +{
> > +     /* Only available for integrated init. */
> > +     BUILD_BUG();
> > +}
>
> This *should* always work, as long as the compiler optimizes everything
> like we expect.

Yeah, as I mentioned to Catalin on an earlier revision I'm not a fan
of relying on the compiler optimizing this away, but it looks like
we're already relying on this elsewhere in the kernel.

> But: In this case, I think this is a sign that the interface design can be
> improved. Can we just make kasan_{alloc,free}_pages() return a 'bool
> __must_check' to indicate if kasan takes care of init?

I considered a number of different approaches including something like
that before settling on the one in this patch. One consideration was
that we should avoid involving KASAN in normal execution as much as
possible, in order to make the normal code path as comprehensible as
possible. With an approach where alloc/free return a bool the reader
needs to understand what the KASAN alloc/free functions do in the
normal case. Whereas with an approach where an "accessor" function on
the KASAN side returns a bool, it's more obvious that the code has a
"normal path" and a "KASAN path", and readers who only care about the
normal path can ignore the KASAN path.

Does that make sense? I don't feel too strongly so I can change
alloc/free to return a bool if you don't agree.

> The variants here would simply return kasan_has_integrated_init().
>
> That way, there'd be no need for the BUILD_BUG()s and the interface
> becomes harder to misuse by design.
>
> Also, given that kasan_{alloc,free}_pages() initializes memory, this is
> an opportunity to just give them a better name. Perhaps
>
>         /* Returns true if KASAN took care of initialization, false otherwise. */
>         bool __must_check kasan_alloc_pages_try_init(struct page *page, unsigned int order, gfp_t flags);
>         bool __must_check kasan_free_pages_try_init(struct page *page, unsigned int order);

I considered changing the name but concluded that we probably
shouldn't try to pack too much information into the name. With a code
flow like:

if (kasan_has_integrated_init()) {
  kasan_alloc_pages();
} else {
  kernel_init_free_pages();
}

I think it's probably clear enough that kasan_alloc_pages() is doing
the stuff in kernel_init_free_pages() as well.

> [...]
> > -     init = want_init_on_free();
> > -     if (init && !kasan_has_integrated_init())
> > -             kernel_init_free_pages(page, 1 << order);
> > -     kasan_free_nondeferred_pages(page, order, init, fpi_flags);
> > +     if (kasan_has_integrated_init()) {
> > +             if (!skip_kasan_poison)
> > +                     kasan_free_pages(page, order);
>
> I think kasan_free_pages() could return a bool, and this would become
>
>         if (skip_kasan_poison || !kasan_free_pages(...)) {
>                 ...
>
> > +     } else {
> > +             bool init = want_init_on_free();
> > +
> > +             if (init)
> > +                     kernel_init_free_pages(page, 1 << order);
> > +             if (!skip_kasan_poison)
> > +                     kasan_poison_pages(page, order, init);
> > +     }
> >
> >       /*
> >        * arch_free_page() can make the page's contents inaccessible.  s390
> > @@ -2324,8 +2324,6 @@ static bool check_new_pages(struct page *page, unsigned int order)
> >  inline void post_alloc_hook(struct page *page, unsigned int order,
> >                               gfp_t gfp_flags)
> >  {
> > -     bool init;
> > -
> >       set_page_private(page, 0);
> >       set_page_refcounted(page);
> >
> > @@ -2344,10 +2342,16 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
> >        * kasan_alloc_pages and kernel_init_free_pages must be
> >        * kept together to avoid discrepancies in behavior.
> >        */
> > -     init = !want_init_on_free() && want_init_on_alloc(gfp_flags);
> > -     kasan_alloc_pages(page, order, init);
> > -     if (init && !kasan_has_integrated_init())
> > -             kernel_init_free_pages(page, 1 << order);
> > +     if (kasan_has_integrated_init()) {
> > +             kasan_alloc_pages(page, order, gfp_flags);
>
> It looks to me that kasan_alloc_pages() could return a bool, and this
> would become
>
>         if (!kasan_alloc_pages(...)) {
>                 ...
>
> > +     } else {
> > +             bool init =
> > +                     !want_init_on_free() && want_init_on_alloc(gfp_flags);
> > +
>
> [ No need for line-break (for cases like this the kernel is fine with up
> to 100 cols if it improves readability). ]

Okay, I'll make that change.

Peter


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-26 19:27       ` Peter Collingbourne
@ 2021-05-26 19:54         ` Marco Elver
  -1 siblings, 0 replies; 28+ messages in thread
From: Marco Elver @ 2021-05-26 19:54 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton, Evgenii Stepanov,
	Linux Memory Management List, Linux ARM, kasan-dev

On Wed, 26 May 2021 at 21:28, Peter Collingbourne <pcc@google.com> wrote:
[...]
> > >  static inline bool kasan_has_integrated_init(void)
> > > @@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
> > >       return false;
> > >  }
> > >
> > > +static __always_inline void kasan_alloc_pages(struct page *page,
> > > +                                           unsigned int order, gfp_t flags)
> > > +{
> > > +     /* Only available for integrated init. */
> > > +     BUILD_BUG();
> > > +}
> > > +
> > > +static __always_inline void kasan_free_pages(struct page *page,
> > > +                                          unsigned int order)
> > > +{
> > > +     /* Only available for integrated init. */
> > > +     BUILD_BUG();
> > > +}
> >
> > This *should* always work, as long as the compiler optimizes everything
> > like we expect.
>
> Yeah, as I mentioned to Catalin on an earlier revision I'm not a fan
> of relying on the compiler optimizing this away, but it looks like
> we're already relying on this elsewhere in the kernel.

That's true, and it's also how BUILD_BUG() works underneath (it calls
an __attribute__((error(msg)))-annotated function, normally guarded by
a condition, or in this case without one... new code should usually use
static_assert(), but that's obviously not possible here). In fact, if
the kernel is built without optimizations, BUILD_BUG() turns into a
no-op.
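
Roughly, the mechanism boils down to something like this (heavily
simplified, not the kernel's exact macros):

	extern void __my_build_bug(void) __attribute__((error("BUILD_BUG failed")));
	#define MY_BUILD_BUG()	__my_build_bug()

	/*
	 * If the call is proven dead and eliminated by the optimizer, the
	 * build succeeds; if it survives, the error attribute fails the
	 * compile.
	 */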

And just in case, I do not mind the BUILD_BUG(), because it should always work.

> > But: In this case, I think this is a sign that the interface design can be
> > improved. Can we just make kasan_{alloc,free}_pages() return a 'bool
> > __must_check' to indicate if kasan takes care of init?
>
> I considered a number of different approaches including something like
> that before settling on the one in this patch. One consideration was
> that we should avoid involving KASAN in normal execution as much as
> possible, in order to make the normal code path as comprehensible as
> possible. With an approach where alloc/free return a bool the reader
> needs to understand what the KASAN alloc/free functions do in the
> normal case. Whereas with an approach where an "accessor" function on
> the KASAN side returns a bool, it's more obvious that the code has a
> "normal path" and a "KASAN path", and readers who only care about the
> normal path can ignore the KASAN path.
>
> Does that make sense? I don't feel too strongly so I can change
> alloc/free to return a bool if you don't agree.

If this had been considered, then that's fair. I just wanted to point
it out in case it hadn't.

Let's leave as-is.

I also just noticed that we pass 'init' to kasan_poison_pages(...,
init) in the !kasan_has_integrated_init() case, which might be
confusing.

Thanks,
-- Marco


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init
  2021-05-25 22:00     ` Andrey Konovalov
@ 2021-05-28  1:04       ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-28  1:04 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Alexander Potapenko, Catalin Marinas, Vincenzo Frascino,
	Andrew Morton, Evgenii Stepanov, Linux Memory Management List,
	Linux ARM

On Tue, May 25, 2021 at 3:00 PM Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
> >
> > Currently with integrated init page_alloc.c needs to know whether
> > kasan_alloc_pages() will zero initialize memory, but this will start
> > becoming more complicated once we start adding tag initialization
> > support for user pages. To avoid page_alloc.c needing to know more
> > details of what integrated init will do, move the unpoisoning logic
> > for integrated init into the HW tags implementation. Currently the
> > logic is identical but it will diverge in subsequent patches.
> >
> > For symmetry do the same for poisoning although this logic will
> > be unaffected by subsequent patches.
> >
> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/I2c550234c6c4a893c48c18ff0c6ce658c7c67056
> > ---
> > v3:
> > - use BUILD_BUG()
> >
> > v2:
> > - fix build with KASAN disabled
> >
> >  include/linux/kasan.h | 66 +++++++++++++++++++++++++++----------------
> >  mm/kasan/common.c     |  4 +--
> >  mm/kasan/hw_tags.c    | 14 +++++++++
> >  mm/mempool.c          |  6 ++--
> >  mm/page_alloc.c       | 56 +++++++++++++++++++-----------------
> >  5 files changed, 91 insertions(+), 55 deletions(-)
> >
> > diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> > index b1678a61e6a7..38061673e6ac 100644
> > --- a/include/linux/kasan.h
> > +++ b/include/linux/kasan.h
> > @@ -2,6 +2,7 @@
> >  #ifndef _LINUX_KASAN_H
> >  #define _LINUX_KASAN_H
> >
> > +#include <linux/bug.h>
> >  #include <linux/static_key.h>
> >  #include <linux/types.h>
> >
> > @@ -79,14 +80,6 @@ static inline void kasan_disable_current(void) {}
> >
> >  #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
> >
> > -#ifdef CONFIG_KASAN
> > -
> > -struct kasan_cache {
> > -       int alloc_meta_offset;
> > -       int free_meta_offset;
> > -       bool is_kmalloc;
> > -};
> > -
> >  #ifdef CONFIG_KASAN_HW_TAGS
> >
> >  DECLARE_STATIC_KEY_FALSE(kasan_flag_enabled);
> > @@ -101,11 +94,18 @@ static inline bool kasan_has_integrated_init(void)
> >         return kasan_enabled();
> >  }
> >
> > +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags);
> > +void kasan_free_pages(struct page *page, unsigned int order);
> > +
> >  #else /* CONFIG_KASAN_HW_TAGS */
> >
> >  static inline bool kasan_enabled(void)
> >  {
> > +#ifdef CONFIG_KASAN
> >         return true;
> > +#else
> > +       return false;
> > +#endif
> >  }
> >
> >  static inline bool kasan_has_integrated_init(void)
> > @@ -113,8 +113,30 @@ static inline bool kasan_has_integrated_init(void)
> >         return false;
> >  }
> >
> > +static __always_inline void kasan_alloc_pages(struct page *page,
> > +                                             unsigned int order, gfp_t flags)
> > +{
> > +       /* Only available for integrated init. */
> > +       BUILD_BUG();
> > +}
> > +
> > +static __always_inline void kasan_free_pages(struct page *page,
> > +                                            unsigned int order)
> > +{
> > +       /* Only available for integrated init. */
> > +       BUILD_BUG();
> > +}
> > +
> >  #endif /* CONFIG_KASAN_HW_TAGS */
> >
> > +#ifdef CONFIG_KASAN
> > +
> > +struct kasan_cache {
> > +       int alloc_meta_offset;
> > +       int free_meta_offset;
> > +       bool is_kmalloc;
> > +};
> > +
> >  slab_flags_t __kasan_never_merge(void);
> >  static __always_inline slab_flags_t kasan_never_merge(void)
> >  {
> > @@ -130,20 +152,20 @@ static __always_inline void kasan_unpoison_range(const void *addr, size_t size)
> >                 __kasan_unpoison_range(addr, size);
> >  }
> >
> > -void __kasan_alloc_pages(struct page *page, unsigned int order, bool init);
> > -static __always_inline void kasan_alloc_pages(struct page *page,
> > +void __kasan_poison_pages(struct page *page, unsigned int order, bool init);
> > +static __always_inline void kasan_poison_pages(struct page *page,
> >                                                 unsigned int order, bool init)
> >  {
> >         if (kasan_enabled())
> > -               __kasan_alloc_pages(page, order, init);
> > +               __kasan_poison_pages(page, order, init);
> >  }
> >
> > -void __kasan_free_pages(struct page *page, unsigned int order, bool init);
> > -static __always_inline void kasan_free_pages(struct page *page,
> > -                                               unsigned int order, bool init)
> > +void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init);
> > +static __always_inline void kasan_unpoison_pages(struct page *page,
> > +                                                unsigned int order, bool init)
> >  {
> >         if (kasan_enabled())
> > -               __kasan_free_pages(page, order, init);
> > +               __kasan_unpoison_pages(page, order, init);
> >  }
> >
> >  void __kasan_cache_create(struct kmem_cache *cache, unsigned int *size,
> > @@ -285,21 +307,15 @@ void kasan_restore_multi_shot(bool enabled);
> >
> >  #else /* CONFIG_KASAN */
> >
> > -static inline bool kasan_enabled(void)
> > -{
> > -       return false;
> > -}
> > -static inline bool kasan_has_integrated_init(void)
> > -{
> > -       return false;
> > -}
> >  static inline slab_flags_t kasan_never_merge(void)
> >  {
> >         return 0;
> >  }
> >  static inline void kasan_unpoison_range(const void *address, size_t size) {}
> > -static inline void kasan_alloc_pages(struct page *page, unsigned int order, bool init) {}
> > -static inline void kasan_free_pages(struct page *page, unsigned int order, bool init) {}
> > +static inline void kasan_poison_pages(struct page *page, unsigned int order,
> > +                                     bool init) {}
> > +static inline void kasan_unpoison_pages(struct page *page, unsigned int order,
> > +                                       bool init) {}
> >  static inline void kasan_cache_create(struct kmem_cache *cache,
> >                                       unsigned int *size,
> >                                       slab_flags_t *flags) {}
> > diff --git a/mm/kasan/common.c b/mm/kasan/common.c
> > index 6bb87f2acd4e..0ecd293af344 100644
> > --- a/mm/kasan/common.c
> > +++ b/mm/kasan/common.c
> > @@ -97,7 +97,7 @@ slab_flags_t __kasan_never_merge(void)
> >         return 0;
> >  }
> >
> > -void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
> > +void __kasan_unpoison_pages(struct page *page, unsigned int order, bool init)
> >  {
> >         u8 tag;
> >         unsigned long i;
> > @@ -111,7 +111,7 @@ void __kasan_alloc_pages(struct page *page, unsigned int order, bool init)
> >         kasan_unpoison(page_address(page), PAGE_SIZE << order, init);
> >  }
> >
> > -void __kasan_free_pages(struct page *page, unsigned int order, bool init)
> > +void __kasan_poison_pages(struct page *page, unsigned int order, bool init)
> >  {
> >         if (likely(!PageHighMem(page)))
> >                 kasan_poison(page_address(page), PAGE_SIZE << order,
> > diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
> > index 4004388b4e4b..45e552cb9172 100644
> > --- a/mm/kasan/hw_tags.c
> > +++ b/mm/kasan/hw_tags.c
> > @@ -238,6 +238,20 @@ struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
> >         return &alloc_meta->free_track[0];
> >  }
> >
> > +void kasan_alloc_pages(struct page *page, unsigned int order, gfp_t flags)
> > +{
> > +       bool init = !want_init_on_free() && want_init_on_alloc(flags);
>
> This check is now duplicated. One check here, the same one in
> page_alloc.c. Please either add a helper that gets used in both
> places, or at least a comment that the checks must be kept in sync.

Added a comment in v4.
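
For reference, a helper of the kind being suggested might look
roughly like this (the name is invented; v4 instead keeps the two
open-coded checks with a comment tying them together):

  static inline bool want_page_init_on_alloc(gfp_t flags)
  {
          return !want_init_on_free() && want_init_on_alloc(flags);
  }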

> > +
> > +       kasan_unpoison_pages(page, order, init);
> > +}
> > +
> > +void kasan_free_pages(struct page *page, unsigned int order)
> > +{
> > +       bool init = want_init_on_free();
>
> Same here.

Likewise.

> > +
> > +       kasan_poison_pages(page, order, init);
> > +}
> > +
> >  #if IS_ENABLED(CONFIG_KASAN_KUNIT_TEST)
> >
> >  void kasan_set_tagging_report_once(bool state)
> > diff --git a/mm/mempool.c b/mm/mempool.c
> > index a258cf4de575..0b8afbec3e35 100644
> > --- a/mm/mempool.c
> > +++ b/mm/mempool.c
> > @@ -106,7 +106,8 @@ static __always_inline void kasan_poison_element(mempool_t *pool, void *element)
> >         if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
> >                 kasan_slab_free_mempool(element);
> >         else if (pool->alloc == mempool_alloc_pages)
> > -               kasan_free_pages(element, (unsigned long)pool->pool_data, false);
> > +               kasan_poison_pages(element, (unsigned long)pool->pool_data,
> > +                                  false);
> >  }
> >
> >  static void kasan_unpoison_element(mempool_t *pool, void *element)
> > @@ -114,7 +115,8 @@ static void kasan_unpoison_element(mempool_t *pool, void *element)
> >         if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
> >                 kasan_unpoison_range(element, __ksize(element));
> >         else if (pool->alloc == mempool_alloc_pages)
> > -               kasan_alloc_pages(element, (unsigned long)pool->pool_data, false);
> > +               kasan_unpoison_pages(element, (unsigned long)pool->pool_data,
> > +                                    false);
> >  }
> >
> >  static __always_inline void add_element(mempool_t *pool, void *element)
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index aaa1655cf682..6e82a7f6fd6f 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -382,7 +382,7 @@ int page_group_by_mobility_disabled __read_mostly;
> >  static DEFINE_STATIC_KEY_TRUE(deferred_pages);
> >
> >  /*
> > - * Calling kasan_free_pages() only after deferred memory initialization
> > + * Calling kasan_poison_pages() only after deferred memory initialization
> >   * has completed. Poisoning pages during deferred memory init will greatly
> >   * lengthen the process and cause problem in large memory systems as the
> >   * deferred pages initialization is done with interrupt disabled.
> > @@ -394,15 +394,11 @@ static DEFINE_STATIC_KEY_TRUE(deferred_pages);
> >   * on-demand allocation and then freed again before the deferred pages
> >   * initialization is done, but this is not likely to happen.
> >   */
> > -static inline void kasan_free_nondeferred_pages(struct page *page, int order,
> > -                                               bool init, fpi_t fpi_flags)
> > +static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
> >  {
> > -       if (static_branch_unlikely(&deferred_pages))
> > -               return;
> > -       if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> > -                       (fpi_flags & FPI_SKIP_KASAN_POISON))
> > -               return;
> > -       kasan_free_pages(page, order, init);
> > +       return static_branch_unlikely(&deferred_pages) ||
> > +              (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> > +               (fpi_flags & FPI_SKIP_KASAN_POISON));
> >  }
> >
> >  /* Returns true if the struct page for the pfn is uninitialised */
> > @@ -453,13 +449,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
> >         return false;
> >  }
> >  #else
> > -static inline void kasan_free_nondeferred_pages(struct page *page, int order,
> > -                                               bool init, fpi_t fpi_flags)
> > +static inline bool should_skip_kasan_poison(fpi_t fpi_flags)
> >  {
> > -       if (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> > -                       (fpi_flags & FPI_SKIP_KASAN_POISON))
> > -               return;
> > -       kasan_free_pages(page, order, init);
> > +       return (!IS_ENABLED(CONFIG_KASAN_GENERIC) &&
> > +               (fpi_flags & FPI_SKIP_KASAN_POISON));
> >  }
> >
> >  static inline bool early_page_uninitialised(unsigned long pfn)
> > @@ -1245,7 +1238,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
> >                         unsigned int order, bool check_free, fpi_t fpi_flags)
> >  {
> >         int bad = 0;
> > -       bool init;
> > +       bool skip_kasan_poison = should_skip_kasan_poison(fpi_flags);
> >
> >         VM_BUG_ON_PAGE(PageTail(page), page);
> >
> > @@ -1314,10 +1307,17 @@ static __always_inline bool free_pages_prepare(struct page *page,
> >          * With hardware tag-based KASAN, memory tags must be set before the
> >          * page becomes unavailable via debug_pagealloc or arch_free_page.
> >          */
> > -       init = want_init_on_free();
> > -       if (init && !kasan_has_integrated_init())
> > -               kernel_init_free_pages(page, 1 << order);
> > -       kasan_free_nondeferred_pages(page, order, init, fpi_flags);
> > +       if (kasan_has_integrated_init()) {
>
> Is it guaranteed that this branch will be eliminated when
> kasan_has_integrated_init() is static inline returning false? I know
> this works with macros, but I don't remember seeing cases with static
> inline functions. I guess it's the same, but mentioning just in case
> because BUILD_BUG() stood out.

Here's one example of where we rely on optimization after inlining to
eliminate a BUILD_BUG():
https://github.com/torvalds/linux/blob/3224374f7eb08fbb36d3963895da20ff274b8e6a/arch/arm64/include/asm/arm_dsu_pmu.h#L123
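
To illustrate the pattern, a simplified sketch with invented names
(not the code behind that link):

  static inline bool has_feature(void)
  {
          return false;   /* constant-folds at every call site */
  }

  static __always_inline void use_feature(void)
  {
          BUILD_BUG();    /* reachable only if has_feature() is true */
  }

  static void caller(void)
  {
          if (has_feature())
                  use_feature();  /* inlining + DCE remove this call */
  }

With optimizations enabled, the branch is folded away before the
error attribute behind BUILD_BUG() can fire; the
kasan_has_integrated_init() case relies on the same folding.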

Peter


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags
  2021-05-26 10:45       ` Jann Horn
@ 2021-05-28  1:05         ` Peter Collingbourne
  -1 siblings, 0 replies; 28+ messages in thread
From: Peter Collingbourne @ 2021-05-28  1:05 UTC (permalink / raw)
  To: Jann Horn
  Cc: Andrey Konovalov, Alexander Potapenko, Catalin Marinas,
	Vincenzo Frascino, Andrew Morton, Evgenii Stepanov,
	Linux Memory Management List, Linux ARM

On Wed, May 26, 2021 at 3:45 AM Jann Horn <jannh@google.com> wrote:
>
> On Wed, May 26, 2021 at 12:07 AM Andrey Konovalov <andreyknvl@gmail.com> wrote:
> > On Wed, May 12, 2021 at 11:09 PM Peter Collingbourne <pcc@google.com> wrote:
> > > Poisoning freed pages protects against kernel use-after-free. The
> > > likelihood of such a bug involving kernel pages is significantly higher
> > > than that for user pages. At the same time, poisoning freed pages can
> > > impose a significant performance cost, which cannot always be justified
> > > for user pages given the lower probability of finding a bug. Therefore,
> > > make it possible to configure the kernel to disable freed user page
> > > poisoning when using HW tags via the new kasan.skip_user_poison_on_free
> > > command line option.
> >
> > So the potential scenario that would be undetectable with
> > kasan.skip_user_poison_on_free enabled is: 1) kernel allocates a user
> > page and maps it for userspace, 2) the page gets freed in the kernel,
> > 3) kernel accesses the page leading to a use-after-free. Is this
> > correct?

Yes, that's correct.

> > If bugs involving use-after-free accesses on user pages is something
> > that is extremely rare, perhaps we could just change the default and
> > avoid adding a command line switch.
> >
> > Jann, maybe you have an idea of how common something like this is or
> > have other inputs?
>
> GFP_USER is kind of a squishy concept, and if you grep around for it
> in the kernel tree, you can see it being used for all kinds of things
> - including SKBs in some weird ISDN driver, various types of BPF
> allocations, and so on. It's probably the wrong flag to hook if you
> want something that means "these pages will mostly be accessed from
> userspace".
>
> My guess is that what pcc@ is actually interested in are probably
> mainly anonymous pages, and to a lesser degree also page cache pages?
> Those use the more specific GFP_HIGHUSER_MOVABLE (which indicates that
> the kernel will usually not be holding any direct pointers to the page
> outside of rmap/pagecache logic, and that any kernel access to the
> pages will be using the kmap API).
>
> It's probably safe to assume that the majority of kernel bugs won't
> directly involve GFP_HIGHUSER_MOVABLE memory - that's probably mostly
> only going to happen if there are bugs in code that grabs pages with
> get_user_pages* and then kmap()s them, or if there's something broken
> in the pipe logic, or maybe an OOB issue in filesystem parsing code
> (?), or something like that.

This makes sense to me. The pages that I'm most interested in getting
this treatment are indeed the anonymous and, to a lesser extent,
pagecache pages.

The GFP_HIGHUSER_MOVABLE restrictions to me imply a lack of direct
kernel access more strongly than GFP_USER, as you point out. Therefore
I agree with Andrey that we probably don't need a switch for this and
can just change the behavior for GFP_HIGHUSER_MOVABLE. I've done so in
v4, although this required a preparatory patch to merge
__alloc_zeroed_user_highpage() and
alloc_zeroed_user_highpage_movable() so that we actually use the
GFP_HIGHUSER_MOVABLE constant for anonymous pages.
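
For illustration only, a rough sketch of the shape such a merged
helper could take (the actual v4 patch may differ in detail):

  static inline struct page *
  alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma,
                                     unsigned long vaddr)
  {
          struct page *page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
                                             vaddr);

          if (page)
                  clear_user_highpage(page, vaddr);

          return page;
  }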

> Peter, is the plan to have this poisoning disabled in production? Is
> there an estimate of how slow this is?

Yes, that is the plan. I don't think I'm at liberty to provide exact
numbers, but given that the previous patches ensure we touch pages
only once at allocation time whether or not KASAN is enabled, the most
significant remaining KASAN-associated source of overhead for user
pages is the deallocation-time overhead that I'm eliminating in this
patch.

Peter


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-05-28  1:22 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-12 20:09 [PATCH v3 0/3] arm64: improve efficiency of setting tags for user pages Peter Collingbourne
2021-05-12 20:09 ` Peter Collingbourne
2021-05-12 20:09 ` [PATCH v3 1/3] kasan: use separate (un)poison implementation for integrated init Peter Collingbourne
2021-05-12 20:09   ` Peter Collingbourne
2021-05-25 22:00   ` Andrey Konovalov
2021-05-25 22:00     ` Andrey Konovalov
2021-05-28  1:04     ` Peter Collingbourne
2021-05-28  1:04       ` Peter Collingbourne
2021-05-26 10:12   ` Marco Elver
2021-05-26 10:12     ` Marco Elver
2021-05-26 19:27     ` Peter Collingbourne
2021-05-26 19:27       ` Peter Collingbourne
2021-05-26 19:54       ` Marco Elver
2021-05-26 19:54         ` Marco Elver
2021-05-12 20:09 ` [PATCH v3 2/3] arm64: mte: handle tags zeroing at page allocation time Peter Collingbourne
2021-05-12 20:09   ` Peter Collingbourne
2021-05-25 22:00   ` Andrey Konovalov
2021-05-25 22:00     ` Andrey Konovalov
2021-05-12 20:09 ` [PATCH v3 3/3] kasan: allow freed user page poisoning to be disabled with HW tags Peter Collingbourne
2021-05-12 20:09   ` Peter Collingbourne
2021-05-25 22:06   ` Andrey Konovalov
2021-05-25 22:06     ` Andrey Konovalov
2021-05-26 10:45     ` Jann Horn
2021-05-26 10:45       ` Jann Horn
2021-05-28  1:05       ` Peter Collingbourne
2021-05-28  1:05         ` Peter Collingbourne
2021-05-25 19:03 ` [PATCH v3 0/3] arm64: improve efficiency of setting tags for user pages Peter Collingbourne
2021-05-25 19:03   ` Peter Collingbourne
