* [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix
@ 2019-08-06 23:38 Daniel Axtens
  2019-08-06 23:38 ` [PATCH 1/4] kasan: allow arches to provide their own early shadow setup Daniel Axtens
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Daniel Axtens @ 2019-08-06 23:38 UTC (permalink / raw)
  To: aneesh.kumar, christophe.leroy, bsingharora
  Cc: linuxppc-dev, kasan-dev, Daniel Axtens

Building on the work of Christophe, Aneesh and Balbir, I've ported
KASAN to 64-bit Book3S kernels running on the Radix MMU.

It builds on top of Christophe's work on 32-bit. It also builds on my
generic KASAN_VMALLOC series, available at:
https://patchwork.kernel.org/project/linux-mm/list/?series=153209

This provides full inline instrumentation on radix, but it does
require that you specify the amount of memory present on the system
at compile time. More details in patch 4.
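
As a rough example (the exact values are illustrative only), a machine with
32 GB of RAM contiguous from address zero might end up with a .config
fragment along these lines, using the option added in patch 4:

  CONFIG_KASAN=y
  CONFIG_KASAN_VMALLOC=y
  CONFIG_RELOCATABLE=y
  CONFIG_PHYS_MEM_SIZE_FOR_KASAN=32768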

Notable changes from the RFC:

 - I've dropped Book3E 64-bit for now.

 - Now instead of hacking into the KASAN core to disable module
   allocations, we use KASAN_VMALLOC.

 - More testing, including on real hardware. This revealed that
   discontiguous memory is a bit of a headache; at the moment, we
   must disable any memory that is not contiguous from 0.

 - Update to deal with kasan bitops instrumentation that landed
   between RFC and now.

 - Documentation!

 - Various cleanups and tweaks.

I am seeing occasional problems on boot on real hardware where it
seems that vmalloc space mappings don't get installed in time. (We
get a BUG that the memory is not accessible, but by the time we hit
xmon the memory is accessible!) It happens once every few boots. I
haven't yet been able to figure out what is happening and why. I'm
going to look into it, but I think the patches are in good enough
shape to review while I work on it.

Regards,
Daniel

Daniel Axtens (4):
  kasan: allow arches to provide their own early shadow setup
  kasan: support instrumented bitops with generic non-atomic bitops
  powerpc: support KASAN instrumentation of bitops
  powerpc: Book3S 64-bit "heavyweight" KASAN support

 Documentation/dev-tools/kasan.rst            |   7 +-
 Documentation/powerpc/kasan.txt              | 111 ++++++++++++++
 arch/powerpc/Kconfig                         |   4 +
 arch/powerpc/Kconfig.debug                   |  21 +++
 arch/powerpc/Makefile                        |   7 +
 arch/powerpc/include/asm/bitops.h            |  25 ++--
 arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
 arch/powerpc/include/asm/kasan.h             |  35 ++++-
 arch/powerpc/kernel/process.c                |   8 ++
 arch/powerpc/kernel/prom.c                   |  57 +++++++-
 arch/powerpc/mm/kasan/Makefile               |   1 +
 arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 ++++++++++
 include/asm-generic/bitops-instrumented.h    | 144 ++++++++++---------
 include/linux/kasan.h                        |   2 +
 lib/Kconfig.kasan                            |   3 +
 mm/kasan/init.c                              |  10 ++
 16 files changed, 431 insertions(+), 85 deletions(-)
 create mode 100644 Documentation/powerpc/kasan.txt
 create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c

-- 
2.20.1



* [PATCH 1/4] kasan: allow arches to provide their own early shadow setup
  2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
@ 2019-08-06 23:38 ` Daniel Axtens
  2019-08-07 15:14   ` Christophe Leroy
  2019-08-06 23:38 ` [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops Daniel Axtens
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Daniel Axtens @ 2019-08-06 23:38 UTC (permalink / raw)
  To: aneesh.kumar, christophe.leroy, bsingharora
  Cc: linuxppc-dev, kasan-dev, Daniel Axtens

powerpc supports several different MMUs. In particular, book3s
machines support both a hash-table based MMU and a radix MMU.
These MMUs support different numbers of entries per directory
level: the PTRS_PER_* defines evaluate to variables, not constants.
This leads to compiler errors, as global variables must have constant
sizes.

Allow architectures to manage their own early shadow variables so we
can work around this on powerpc.
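
As a rough sketch, an arch that selects the new option then provides its
own fixed-size definitions. The powerpc version later in this series does
roughly the following, where the R_PTRS_PER_* constants are radix-specific
compile-time sizes:

  unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;

  pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE] __page_aligned_bss;
  pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD] __page_aligned_bss;
  pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD] __page_aligned_bss;
  p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;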

Signed-off-by: Daniel Axtens <dja@axtens.net>

---
Changes from RFC:

 - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
   a random #define to a config option selected when building for
   ppc64 book3s
---
 include/linux/kasan.h |  2 ++
 lib/Kconfig.kasan     |  3 +++
 mm/kasan/init.c       | 10 ++++++++++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index ec81113fcee4..15933da52a3e 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -14,11 +14,13 @@ struct task_struct;
 #include <asm/kasan.h>
 #include <asm/pgtable.h>
 
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
 extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
 extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
 extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
 extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
+#endif
 
 int kasan_populate_early_shadow(const void *shadow_start,
 				const void *shadow_end);
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index a320dc2e9317..0621a0129c04 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
 config	HAVE_ARCH_KASAN_VMALLOC
 	bool
 
+config ARCH_HAS_KASAN_EARLY_SHADOW
+	bool
+
 config CC_HAS_KASAN_GENERIC
 	def_bool $(cc-option, -fsanitize=kernel-address)
 
diff --git a/mm/kasan/init.c b/mm/kasan/init.c
index ce45c491ebcd..7ef2b87a7988 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -31,10 +31,14 @@
  *   - Latter it reused it as zero shadow to cover large ranges of memory
  *     that allowed to access, but not handled by kasan (vmalloc/vmemmap ...).
  */
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
+#endif
 
 #if CONFIG_PGTABLE_LEVELS > 4
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
+#endif
 static inline bool kasan_p4d_table(pgd_t pgd)
 {
 	return pgd_page(pgd) == virt_to_page(lm_alias(kasan_early_shadow_p4d));
@@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
 }
 #endif
 #if CONFIG_PGTABLE_LEVELS > 3
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
+#endif
 static inline bool kasan_pud_table(p4d_t p4d)
 {
 	return p4d_page(p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud));
@@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
 }
 #endif
 #if CONFIG_PGTABLE_LEVELS > 2
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
+#endif
 static inline bool kasan_pmd_table(pud_t pud)
 {
 	return pud_page(pud) == virt_to_page(lm_alias(kasan_early_shadow_pmd));
@@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
 	return false;
 }
 #endif
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
 pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
+#endif
 
 static inline bool kasan_pte_table(pmd_t pmd)
 {
-- 
2.20.1



* [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops
  2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
  2019-08-06 23:38 ` [PATCH 1/4] kasan: allow arches to provide their own early shadow setup Daniel Axtens
@ 2019-08-06 23:38 ` Daniel Axtens
  2019-08-07 14:58   ` Christophe Leroy
  2019-08-06 23:38 ` [PATCH 3/4] powerpc: support KASAN instrumentation of bitops Daniel Axtens
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Daniel Axtens @ 2019-08-06 23:38 UTC (permalink / raw)
  To: aneesh.kumar, christophe.leroy, bsingharora
  Cc: linuxppc-dev, kasan-dev, Daniel Axtens

Currently bitops-instrumented.h assumes that the architecture provides
both the atomic and non-atomic versions of the bitops (e.g. both
set_bit and __set_bit). This is true on x86, but is not always true:
there is a generic bitops/non-atomic.h header that provides generic
non-atomic versions. powerpc uses this generic version, so it does
not have its own __set_bit that could be renamed to arch___set_bit.

Rearrange bitops-instrumented.h. As operations in bitops/non-atomic.h
will already be instrumented (they use regular memory accesses), put
the instrumenting wrappers for them behind an ifdef. Only include
these instrumentation wrappers if non-atomic.h has not been included.
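
For illustration, an arch that relies on the generic non-atomic ops then
ends up with an include order in its bitops header roughly like the powerpc
patch later in this series:

  /* provides __set_bit() etc. and defines _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
  #include <asm-generic/bitops/non-atomic.h>

  /* with the guard already defined, only the atomic and lock variants are
   * wrapped here */
  #include <asm-generic/bitops-instrumented.h>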

Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 include/asm-generic/bitops-instrumented.h | 144 ++++++++++++----------
 1 file changed, 76 insertions(+), 68 deletions(-)

diff --git a/include/asm-generic/bitops-instrumented.h b/include/asm-generic/bitops-instrumented.h
index ddd1c6d9d8db..2fe8f7e12a11 100644
--- a/include/asm-generic/bitops-instrumented.h
+++ b/include/asm-generic/bitops-instrumented.h
@@ -29,21 +29,6 @@ static inline void set_bit(long nr, volatile unsigned long *addr)
 	arch_set_bit(nr, addr);
 }
 
-/**
- * __set_bit - Set a bit in memory
- * @nr: the bit to set
- * @addr: the address to start counting from
- *
- * Unlike set_bit(), this function is non-atomic. If it is called on the same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __set_bit(long nr, volatile unsigned long *addr)
-{
-	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	arch___set_bit(nr, addr);
-}
-
 /**
  * clear_bit - Clears a bit in memory
  * @nr: Bit to clear
@@ -57,21 +42,6 @@ static inline void clear_bit(long nr, volatile unsigned long *addr)
 	arch_clear_bit(nr, addr);
 }
 
-/**
- * __clear_bit - Clears a bit in memory
- * @nr: the bit to clear
- * @addr: the address to start counting from
- *
- * Unlike clear_bit(), this function is non-atomic. If it is called on the same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __clear_bit(long nr, volatile unsigned long *addr)
-{
-	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	arch___clear_bit(nr, addr);
-}
-
 /**
  * clear_bit_unlock - Clear a bit in memory, for unlock
  * @nr: the bit to set
@@ -116,21 +86,6 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
 	arch_change_bit(nr, addr);
 }
 
-/**
- * __change_bit - Toggle a bit in memory
- * @nr: the bit to change
- * @addr: the address to start counting from
- *
- * Unlike change_bit(), this function is non-atomic. If it is called on the same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __change_bit(long nr, volatile unsigned long *addr)
-{
-	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	arch___change_bit(nr, addr);
-}
-
 /**
  * test_and_set_bit - Set a bit and return its old value
  * @nr: Bit to set
@@ -144,20 +99,6 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
 	return arch_test_and_set_bit(nr, addr);
 }
 
-/**
- * __test_and_set_bit - Set a bit and return its old value
- * @nr: Bit to set
- * @addr: Address to count from
- *
- * This operation is non-atomic. If two instances of this operation race, one
- * can appear to succeed but actually fail.
- */
-static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
-{
-	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	return arch___test_and_set_bit(nr, addr);
-}
-
 /**
  * test_and_set_bit_lock - Set a bit and return its old value, for lock
  * @nr: Bit to set
@@ -187,30 +128,96 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
 }
 
 /**
- * __test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to clear
+ * test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This is an atomic fully-ordered operation (implied full memory barrier).
+ */
+static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
+{
+	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
+	return arch_test_and_change_bit(nr, addr);
+}
+
+/*
+ * If the arch is using the generic non-atomic bit ops, they are already
+ * instrumented, and we don't need to create wrappers. Only wrap if we
+ * haven't included that header.
+ */
+#ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
+
+/**
+ * __set_bit - Set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Unlike set_bit(), this function is non-atomic. If it is called on the same
+ * region of memory concurrently, the effect may be that only one operation
+ * succeeds.
+ */
+static inline void __set_bit(long nr, volatile unsigned long *addr)
+{
+	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
+	arch___set_bit(nr, addr);
+}
+
+/**
+ * __clear_bit - Clears a bit in memory
+ * @nr: the bit to clear
+ * @addr: the address to start counting from
+ *
+ * Unlike clear_bit(), this function is non-atomic. If it is called on the same
+ * region of memory concurrently, the effect may be that only one operation
+ * succeeds.
+ */
+static inline void __clear_bit(long nr, volatile unsigned long *addr)
+{
+	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
+	arch___clear_bit(nr, addr);
+}
+
+/**
+ * __change_bit - Toggle a bit in memory
+ * @nr: the bit to change
+ * @addr: the address to start counting from
+ *
+ * Unlike change_bit(), this function is non-atomic. If it is called on the same
+ * region of memory concurrently, the effect may be that only one operation
+ * succeeds.
+ */
+static inline void __change_bit(long nr, volatile unsigned long *addr)
+{
+	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
+	arch___change_bit(nr, addr);
+}
+
+/**
+ * __test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
  * @addr: Address to count from
  *
  * This operation is non-atomic. If two instances of this operation race, one
  * can appear to succeed but actually fail.
  */
-static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
+static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	return arch___test_and_clear_bit(nr, addr);
+	return arch___test_and_set_bit(nr, addr);
 }
 
 /**
- * test_and_change_bit - Change a bit and return its old value
- * @nr: Bit to change
+ * __test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
  * @addr: Address to count from
  *
- * This is an atomic fully-ordered operation (implied full memory barrier).
+ * This operation is non-atomic. If two instances of this operation race, one
+ * can appear to succeed but actually fail.
  */
-static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
+static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-	return arch_test_and_change_bit(nr, addr);
+	return arch___test_and_clear_bit(nr, addr);
 }
 
 /**
@@ -237,6 +244,7 @@ static inline bool test_bit(long nr, const volatile unsigned long *addr)
 	kasan_check_read(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_bit(nr, addr);
 }
+#endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
 
 #if defined(arch_clear_bit_unlock_is_negative_byte)
 /**
-- 
2.20.1



* [PATCH 3/4] powerpc: support KASAN instrumentation of bitops
  2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
  2019-08-06 23:38 ` [PATCH 1/4] kasan: allow arches to provide their own early shadow setup Daniel Axtens
  2019-08-06 23:38 ` [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops Daniel Axtens
@ 2019-08-06 23:38 ` Daniel Axtens
  2019-08-07 15:01   ` Christophe Leroy
  2019-08-06 23:38 ` [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support Daniel Axtens
  2019-08-07 15:45 ` [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Christophe Leroy
  4 siblings, 1 reply; 17+ messages in thread
From: Daniel Axtens @ 2019-08-06 23:38 UTC (permalink / raw)
  To: aneesh.kumar, christophe.leroy, bsingharora
  Cc: linuxppc-dev, Nicholas Piggin, kasan-dev, Daniel Axtens

In KASAN development I noticed that the powerpc-specific bitops
were not being picked up by the KASAN test suite.

Instrumentation is done via the bitops-instrumented.h header. It
requires that arch-specific versions of bitop functions are renamed
to arch_*. Do this renaming.

For clear_bit_unlock_is_negative_byte, the current implementation
uses the PG_waiters constant. This works because it's a preprocessor
macro - so it's only actually evaluated in contexts where PG_waiters
is defined. With instrumentation, however, it becomes a static inline
function, and all of a sudden we need the actual value of PG_waiters.
Because of the order of header includes, it's not available and we
fail to compile. Instead, manually specify that we care about bit 7.
This is still correct: bit 7 is the bit that would mark a negative
byte, but it does obscure the origin a little bit.
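
If we want to keep the link to PG_waiters explicit, one option (only a
sketch, not part of this patch) would be a build-time assertion somewhere
PG_waiters is actually visible, such as mm/filemap.c:

  /* bit 7 is the sign bit of the byte that also holds PG_locked, so a byte
   * with PG_waiters set reads as negative, which is exactly what
   * clear_bit_unlock_is_negative_byte() tests for
   */
  BUILD_BUG_ON(PG_waiters != 7);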

Cc: Nicholas Piggin <npiggin@gmail.com> # clear_bit_unlock_negative_byte
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 arch/powerpc/include/asm/bitops.h | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 603aed229af7..19dc16e62e6a 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -86,22 +86,22 @@ DEFINE_BITOP(clear_bits, andc, "")
 DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)
 DEFINE_BITOP(change_bits, xor, "")
 
-static __inline__ void set_bit(int nr, volatile unsigned long *addr)
+static __inline__ void arch_set_bit(int nr, volatile unsigned long *addr)
 {
 	set_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
 }
 
-static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
+static __inline__ void arch_clear_bit(int nr, volatile unsigned long *addr)
 {
 	clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
 }
 
-static __inline__ void clear_bit_unlock(int nr, volatile unsigned long *addr)
+static __inline__ void arch_clear_bit_unlock(int nr, volatile unsigned long *addr)
 {
 	clear_bits_unlock(BIT_MASK(nr), addr + BIT_WORD(nr));
 }
 
-static __inline__ void change_bit(int nr, volatile unsigned long *addr)
+static __inline__ void arch_change_bit(int nr, volatile unsigned long *addr)
 {
 	change_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
 }
@@ -138,26 +138,26 @@ DEFINE_TESTOP(test_and_clear_bits, andc, PPC_ATOMIC_ENTRY_BARRIER,
 DEFINE_TESTOP(test_and_change_bits, xor, PPC_ATOMIC_ENTRY_BARRIER,
 	      PPC_ATOMIC_EXIT_BARRIER, 0)
 
-static __inline__ int test_and_set_bit(unsigned long nr,
+static __inline__ int arch_test_and_set_bit(unsigned long nr,
 				       volatile unsigned long *addr)
 {
 	return test_and_set_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
 }
 
-static __inline__ int test_and_set_bit_lock(unsigned long nr,
+static __inline__ int arch_test_and_set_bit_lock(unsigned long nr,
 				       volatile unsigned long *addr)
 {
 	return test_and_set_bits_lock(BIT_MASK(nr),
 				addr + BIT_WORD(nr)) != 0;
 }
 
-static __inline__ int test_and_clear_bit(unsigned long nr,
+static __inline__ int arch_test_and_clear_bit(unsigned long nr,
 					 volatile unsigned long *addr)
 {
 	return test_and_clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
 }
 
-static __inline__ int test_and_change_bit(unsigned long nr,
+static __inline__ int arch_test_and_change_bit(unsigned long nr,
 					  volatile unsigned long *addr)
 {
 	return test_and_change_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
@@ -186,14 +186,14 @@ static __inline__ unsigned long clear_bit_unlock_return_word(int nr,
 }
 
 /* This is a special function for mm/filemap.c */
-#define clear_bit_unlock_is_negative_byte(nr, addr)			\
-	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(PG_waiters))
+#define arch_clear_bit_unlock_is_negative_byte(nr, addr)		\
+	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(7))
 
 #endif /* CONFIG_PPC64 */
 
 #include <asm-generic/bitops/non-atomic.h>
 
-static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr)
+static __inline__ void arch___clear_bit_unlock(int nr, volatile unsigned long *addr)
 {
 	__asm__ __volatile__(PPC_RELEASE_BARRIER "" ::: "memory");
 	__clear_bit(nr, addr);
@@ -239,6 +239,9 @@ unsigned long __arch_hweight64(__u64 w);
 
 #include <asm-generic/bitops/find.h>
 
+/* wrappers that deal with KASAN instrumentation */
+#include <asm-generic/bitops-instrumented.h>
+
 /* Little-endian versions */
 #include <asm-generic/bitops/le.h>
 
-- 
2.20.1



* [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
                   ` (2 preceding siblings ...)
  2019-08-06 23:38 ` [PATCH 3/4] powerpc: support KASAN instrumentation of bitops Daniel Axtens
@ 2019-08-06 23:38 ` Daniel Axtens
  2019-08-07 16:34   ` Christophe Leroy
  2019-08-07 15:45 ` [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Christophe Leroy
  4 siblings, 1 reply; 17+ messages in thread
From: Daniel Axtens @ 2019-08-06 23:38 UTC (permalink / raw)
  To: aneesh.kumar, christophe.leroy, bsingharora
  Cc: linuxppc-dev, kasan-dev, Daniel Axtens

KASAN support on powerpc64 is interesting:

 - We want to be able to support inline instrumentation so as to be
   able to catch global and stack issues.

 - We run a lot of code at boot in real mode. This includes stuff like
   printk(), so it's not feasible to just disable instrumentation
   around it.

   [For those not immersed in ppc64: in real mode, the top nibble or
   top 2 bits of the address (depending on radix/hash MMU) are ignored.
   To make things work, we put the linear mapping at
   0xc000000000000000. This means that a pointer into the linear
   mapping works both in real mode, where it is interpreted as a
   physical address of the form 0x000..., and out of real mode, where
   it goes via the linear mapping. A concrete example follows this
   list.]

 - Inline instrumentation requires a fixed offset.

 - Because of our running things in real mode, the offset has to
   point to valid memory both in and out of real mode.
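
As a concrete example of the real-mode trick above (the address is purely
illustrative): a linear-map pointer such as 0xc000000012345678 resolves
through the linear mapping to physical 0x12345678 with the MMU on, and in
real mode the ignored top bits leave the same physical address, so the one
pointer works in both contexts.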

This makes finding somewhere to put the KASAN shadow region a bit fun.

One approach is just to give up on inline instrumentation and override
the address->shadow calculation. This way we can delay all checking
until after we get everything set up to our satisfaction. However,
we'd really like to do better.

What we can do - if we know _at compile time_ how much contiguous
physical memory we have - is to set aside the top 1/8th of the memory
and use that. This is a big hammer (hence the "heavyweight" name) and
comes with 3 big consequences:

 - kernels will simply fail to boot on machines with less memory than
   specified when compiling.

 - kernels running on machines with more memory than specified when
   compiling will simply ignore the extra memory.

 - there's no nice way to handle physically discontiguous memory, so
   you are restricted to the first physical memory block.

If you can bear all this, you get pretty full support for KASAN.
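
As a rough worked example, suppose the kernel is configured for 32768 MB
(32 GB) of contiguous physical memory. The top 1/8th is then the 4 GB of
physical memory from 28 GB up, and the shadow offset (computed by the
Makefile change in this patch) comes out as:

  $ echo 'obase=16; 7 * 1024 * 1024 * 32768 / 8 + 12105675798371893248' | bc
  A800000700000000

The shadow for the start of the linear map is then
(0xc000000000000000 >> 3) + 0xa800000700000000 = 0xc000000700000000, which
is just the linear-map alias of physical 28 GB, i.e. the start of the
reserved region, and is therefore usable both in and out of real mode.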

Despite the limitations, it can still find bugs,
e.g. http://patchwork.ozlabs.org/patch/1103775/

The current implementation is Radix only. I am open to extending
it to hash at some point but I don't think it should hold up v1.

Massive thanks to mpe, who had the idea for the initial design.

Signed-off-by: Daniel Axtens <dja@axtens.net>

---
Changes since the rfc:

 - Boots on real and virtual hardware; KVM works.

 - Disabled reporting when we're checking the stack for exception
   frames. The behaviour isn't wrong, just incompatible with KASAN.

 - Documentation!

 - Dropped old module stuff in favour of KASAN_VMALLOC.

The bugs with ftrace and kuap were due to kernel bloat pushing
prom_init calls to be done via the PLT. Because we did not have
a relocatable kernel, and they are done very early, this caused
everything to explode. Compile with CONFIG_RELOCATABLE!

---
 Documentation/dev-tools/kasan.rst            |   7 +-
 Documentation/powerpc/kasan.txt              | 111 +++++++++++++++++++
 arch/powerpc/Kconfig                         |   4 +
 arch/powerpc/Kconfig.debug                   |  21 ++++
 arch/powerpc/Makefile                        |   7 ++
 arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
 arch/powerpc/include/asm/kasan.h             |  35 +++++-
 arch/powerpc/kernel/process.c                |   8 ++
 arch/powerpc/kernel/prom.c                   |  57 +++++++++-
 arch/powerpc/mm/kasan/Makefile               |   1 +
 arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 +++++++++++++
 11 files changed, 326 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/powerpc/kasan.txt
 create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 35fda484a672..48d3b669e577 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -22,7 +22,9 @@ global variables yet.
 Tag-based KASAN is only supported in Clang and requires version 7.0.0 or later.
 
 Currently generic KASAN is supported for the x86_64, arm64, xtensa and s390
-architectures, and tag-based KASAN is supported only for arm64.
+architectures. It is also supported on powerpc for 32-bit kernels, and for
+64-bit kernels running under the radix MMU. Tag-based KASAN is supported only
+for arm64.
 
 Usage
 -----
@@ -252,7 +254,8 @@ CONFIG_KASAN_VMALLOC
 ~~~~~~~~~~~~~~~~~~~~
 
 With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
-cost of greater memory usage. Currently this is only supported on x86.
+cost of greater memory usage. Currently this is optional on x86, and
+required on 64-bit powerpc.
 
 This works by hooking into vmalloc and vmap, and dynamically
 allocating real shadow memory to back the mappings.
diff --git a/Documentation/powerpc/kasan.txt b/Documentation/powerpc/kasan.txt
new file mode 100644
index 000000000000..a5592454353b
--- /dev/null
+++ b/Documentation/powerpc/kasan.txt
@@ -0,0 +1,111 @@
+KASAN is supported on powerpc on 32-bit, and on 64-bit with the Radix MMU only.
+
+32 bit support
+==============
+
+KASAN is supported on both hash and nohash MMUs on 32-bit.
+
+The shadow area sits at the top of the kernel virtual memory space above the
+fixmap area and occupies one eighth of the total kernel virtual memory space.
+
+Instrumentation of the vmalloc area is not currently supported, but modules are.
+
+64 bit support
+==============
+
+Currently, only the radix MMU is supported. There have been versions for Book3E
+processors floating around on the mailing list, but nothing has been merged.
+
+KASAN support on Book3S is interesting:
+
+ - We want to be able to support inline instrumentation so as to be able to
+   catch global and stack issues.
+
+ - Inline instrumentation requires a fixed offset.
+
+ - We run a lot of code at boot in real mode. This includes stuff like printk(),
+   so it's not feasible to just disable instrumentation around it.
+
+ - Because of our running things in real mode, the offset has to point to valid
+   memory both in and out of real mode.
+
+This makes finding somewhere to put the KASAN shadow region a bit fun.
+
+One approach is just to give up on inline instrumentation. This way we can delay
+all checks until after we get everything set up to our satisfaction. However,
+we'd really like to do better.
+
+If we know _at compile time_ how much contiguous physical memory we have, we can
+set aside the top 1/8th of physical memory and use that. This is a big hammer
+and comes with 2 big consequences:
+
+ - kernels will simply fail to boot on machines with less memory than specified
+   when compiling.
+
+ - kernels running on machines with more memory than specified when compiling
+   will simply ignore the extra memory.
+
+If you can live with this, you get full support for KASAN.
+
+Tips
+----
+
+ - Compile with CONFIG_RELOCATABLE.
+
+   In development, we found boot hangs when building with ftrace and KUAP. These
+   ended up being due to kernel bloat pushing prom_init calls to be done via the
+   PLT. Because we did not have a relocatable kernel, and they are done very
+   early, this caused us to jump off into somewhere invalid. Enabling relocation
+   fixes this.
+
+NUMA/discontiguous physical memory
+----------------------------------
+
+We currently cannot really deal with discontiguous physical memory. You are
+restricted to the physical memory that is contiguous from physical address zero,
+and must specify the size of that memory, not total memory, when configuring
+your kernel.
+
+Discontiguous memory can occur when you have a machine with memory spread across
+multiple nodes. For example, on a Talos II with 64GB of RAM:
+
+ - 32GB runs from 0x0 to 0x0000_0008_0000_0000,
+ - then there's a gap,
+ - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_0000_0000.
+
+This can create _significant_ issues:
+
+ - If we try to treat the machine as having 64GB of _contiguous_ RAM, we would
+   assume that it ran from 0x0 to 0x0000_0010_0000_0000. We'd then reserve the last
+   1/8th - 0x0000_000e_0000_0000 to 0x0000_0010_0000_0000 as the shadow
+   region. But when we try to access any of that, we'll try to access pages that
+   are not physically present.
+
+ - If we try to base the shadow region size on the top address, we'll need to
+   reserve 0x2008_0000_0000 / 8 = 0x0401_0000_0000 bytes = 4100 GB of memory,
+   which will clearly not work on a system with 64GB of RAM.
+
+Therefore, you are restricted to the memory in the node starting at 0x0. For
+this system, that's 32GB. If you specify a contiguous physical memory size
+greater than the size of the first contiguous region of memory, the system will
+be unable to boot, and will not even print an error message warning you.
+
+You can determine the layout of your system's memory by observing the messages
+that the Radix MMU prints on boot. The Talos II discussed earlier has:
+
+radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
+radix-mmu: Mapped 0x0000000040000000-0x0000000800000000 with 1.00 GiB pages
+radix-mmu: Mapped 0x0000200000000000-0x0000200800000000 with 1.00 GiB pages
+
+As discussed, you'd configure this system for 32768 MB.
+
+Another system prints:
+
+radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
+radix-mmu: Mapped 0x0000000040000000-0x0000002000000000 with 1.00 GiB pages
+radix-mmu: Mapped 0x0000200000000000-0x0000202000000000 with 1.00 GiB pages
+
+This machine has more memory: 0x0000_0040_0000_0000 total, but only
+0x0000_0020_0000_0000 is physically contiguous from zero, so we'd configure the
+kernel for 131072 MB of physically contiguous memory.
+
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d8dcd8820369..3d6deee100e2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -171,6 +171,10 @@ config PPC
 	select HAVE_ARCH_HUGE_VMAP		if PPC_BOOK3S_64 && PPC_RADIX_MMU
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_KASAN			if PPC32
+	select HAVE_ARCH_KASAN			if PPC_BOOK3S_64 && PPC_RADIX_MMU
+	select ARCH_HAS_KASAN_EARLY_SHADOW	if PPC_BOOK3S_64
+	select HAVE_ARCH_KASAN_VMALLOC		if PPC_BOOK3S_64
+	select KASAN_VMALLOC			if KASAN
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index c59920920ddc..2d6fb7b1ba59 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -394,6 +394,27 @@ config PPC_FAST_ENDIAN_SWITCH
         help
 	  If you're unsure what this is, say N.
 
+config PHYS_MEM_SIZE_FOR_KASAN
+	int "Contiguous physical memory size for KASAN (MB)"
+	depends on KASAN && PPC_BOOK3S_64
+	help
+
+	  To get inline instrumentation support for KASAN on 64-bit Book3S
+	  machines, you need to know how much contiguous physical memory your
+	  system has. A shadow offset will be calculated based on this figure,
+	  which will be compiled in to the kernel. KASAN will use this offset
+	  to access its shadow region, which is used to verify memory accesses.
+
+	  If you attempt to boot on a system with less memory than you specify
+	  here, your system will fail to boot very early in the process. If you
+	  boot on a system with more memory than you specify, the extra memory
+	  will be wasted - it will be reserved and not used.
+
+	  For systems with discontiguous blocks of physical memory, specify the
+	  size of the block starting at 0x0. You can determine this by looking
+	  at the memory layout info printed to dmesg by the radix MMU code
+	  early in boot. See Documentation/powerpc/kasan.txt.
+
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index c345b79414a9..33e7bba4c8db 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -229,6 +229,13 @@ ifdef CONFIG_476FPE_ERR46
 		-T $(srctree)/arch/powerpc/platforms/44x/ppc476_modules.lds
 endif
 
+ifdef CONFIG_KASAN
+ifdef CONFIG_PPC_BOOK3S_64
+# 0xa800000000000000 = 12105675798371893248
+KASAN_SHADOW_OFFSET = $(shell echo 7 \* 1024 \* 1024 \* $(CONFIG_PHYS_MEM_SIZE_FOR_KASAN) / 8 + 12105675798371893248 | bc)
+endif
+endif
+
 # No AltiVec or VSX instructions when building kernel
 KBUILD_CFLAGS += $(call cc-option,-mno-altivec)
 KBUILD_CFLAGS += $(call cc-option,-mno-vsx)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index e04a839cb5b9..4c011cc15e05 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -35,6 +35,11 @@
 #define RADIX_PMD_SHIFT		(PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
 #define RADIX_PUD_SHIFT		(RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
 #define RADIX_PGD_SHIFT		(RADIX_PUD_SHIFT + RADIX_PUD_INDEX_SIZE)
+
+#define R_PTRS_PER_PTE		(1 << RADIX_PTE_INDEX_SIZE)
+#define R_PTRS_PER_PMD		(1 << RADIX_PMD_INDEX_SIZE)
+#define R_PTRS_PER_PUD		(1 << RADIX_PUD_INDEX_SIZE)
+
 /*
  * Size of EA range mapped by our pagetables.
  */
diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
index 296e51c2f066..d6b4028c296b 100644
--- a/arch/powerpc/include/asm/kasan.h
+++ b/arch/powerpc/include/asm/kasan.h
@@ -14,13 +14,20 @@
 
 #ifndef __ASSEMBLY__
 
-#include <asm/page.h>
+#ifdef CONFIG_KASAN
+void kasan_init(void);
+#else
+static inline void kasan_init(void) { }
+#endif
 
 #define KASAN_SHADOW_SCALE_SHIFT	3
 
 #define KASAN_SHADOW_START	(KASAN_SHADOW_OFFSET + \
 				 (PAGE_OFFSET >> KASAN_SHADOW_SCALE_SHIFT))
 
+#ifdef CONFIG_PPC32
+#include <asm/page.h>
+
 #define KASAN_SHADOW_OFFSET	ASM_CONST(CONFIG_KASAN_SHADOW_OFFSET)
 
 #define KASAN_SHADOW_END	0UL
@@ -30,11 +37,33 @@
 #ifdef CONFIG_KASAN
 void kasan_early_init(void);
 void kasan_mmu_init(void);
-void kasan_init(void);
 #else
-static inline void kasan_init(void) { }
 static inline void kasan_mmu_init(void) { }
 #endif
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
+#include <asm/pgtable.h>
+
+/*
+ * The KASAN shadow offset is such that linear map (0xc000...) is shadowed by
+ * the last 8th of linearly mapped physical memory. This way, if the code uses
+ * 0xc addresses throughout, accesses work both in real mode (where the top
+ * 2 bits are ignored) and outside of real mode.
+ */
+#define KASAN_SHADOW_OFFSET ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
+				1024 * 1024 * 7 / 8 + 0xa800000000000000UL)
+
+#define KASAN_SHADOW_SIZE ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
+				1024 * 1024 * 1 / 8)
+
+extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
+
+extern pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE];
+extern pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD];
+extern pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD];
+extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
+#endif /* CONFIG_PPC_BOOK3S_64 */
 
 #endif /* __ASSEMBLY */
 #endif
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 8fc4de0d22b4..31602536e72b 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2097,7 +2097,14 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
 		/*
 		 * See if this is an exception frame.
 		 * We look for the "regshere" marker in the current frame.
+		 *
+		 * KASAN may complain about this. If it is an exception frame,
+		 * we won't have unpoisoned the stack in asm when we set the
+		 * exception marker. If it's not an exception frame, who knows
+		 * how things are laid out - the shadow could be in any state
+		 * at all. Just disable KASAN reporting for now.
 		 */
+		kasan_disable_current();
 		if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
 		    && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
 			struct pt_regs *regs = (struct pt_regs *)
@@ -2107,6 +2114,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
 			       regs->trap, (void *)regs->nip, (void *)lr);
 			firstframe = 1;
 		}
+		kasan_enable_current();
 
 		sp = newsp;
 	} while (count++ < kstack_depth_to_print);
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 7159e791a70d..dde5f2896ab6 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -71,6 +71,7 @@ unsigned long tce_alloc_start, tce_alloc_end;
 u64 ppc64_rma_size;
 #endif
 static phys_addr_t first_memblock_size;
+static phys_addr_t top_phys_addr;
 static int __initdata boot_cpu_count;
 
 static int __init early_parse_mem(char *p)
@@ -448,6 +449,21 @@ static bool validate_mem_limit(u64 base, u64 *size)
 {
 	u64 max_mem = 1UL << (MAX_PHYSMEM_BITS);
 
+#ifdef CONFIG_KASAN
+	/*
+	 * To handle the NUMA/discontiguous memory case, don't allow a block
+	 * to be added if it falls completely beyond the configured physical
+	 * memory.
+	 *
+	 * See Documentation/powerpc/kasan.txt
+	 */
+	if (base >= (u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * 1024 * 1024) {
+		pr_warn("KASAN: not adding mem block at %llx (size %llx)",
+			base, *size);
+		return false;
+	}
+#endif
+
 	if (base >= max_mem)
 		return false;
 	if ((base + *size) > max_mem)
@@ -571,8 +587,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 
 	/* Add the chunk to the MEMBLOCK list */
 	if (add_mem_to_memblock) {
-		if (validate_mem_limit(base, &size))
+		if (validate_mem_limit(base, &size)) {
 			memblock_add(base, size);
+			if (base + size > top_phys_addr)
+				top_phys_addr = base + size;
+		}
 	}
 }
 
@@ -612,6 +631,8 @@ static void __init early_reserve_mem_dt(void)
 static void __init early_reserve_mem(void)
 {
 	__be64 *reserve_map;
+	phys_addr_t kasan_shadow_start __maybe_unused;
+	phys_addr_t kasan_memory_size __maybe_unused;
 
 	reserve_map = (__be64 *)(((unsigned long)initial_boot_params) +
 			fdt_off_mem_rsvmap(initial_boot_params));
@@ -650,6 +671,40 @@ static void __init early_reserve_mem(void)
 		return;
 	}
 #endif
+
+#if defined(CONFIG_KASAN) && defined(CONFIG_PPC_BOOK3S_64)
+	kasan_memory_size = ((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN
+		* 1024 * 1024);
+
+	if (top_phys_addr < kasan_memory_size) {
+		/*
+		 * We are doomed. Attempts to call e.g. panic() are likely to
+		 * fail because they call out into instrumented code, which
+		 * will almost certainly access memory beyond the end of
+		 * physical memory. Hang here so that at least the NIP points
+		 * somewhere that will help you debug it if you look at it in
+		 * qemu.
+		 */
+		while (true)
+			;
+	} else if (top_phys_addr > kasan_memory_size) {
+		/* print a biiiig warning in hopes people notice */
+		pr_err("===========================================\n"
+			"Physical memory exceeds compiled-in maximum!\n"
+			"This kernel was compiled for KASAN with %u MB physical memory.\n"
+			"The actual physical memory detected is %llu MB.\n"
+			"Memory above the compiled limit will not be used!\n"
+			"===========================================\n",
+			CONFIG_PHYS_MEM_SIZE_FOR_KASAN,
+			top_phys_addr / (1024 * 1024));
+	}
+
+	kasan_shadow_start = _ALIGN_DOWN(kasan_memory_size * 7 / 8, PAGE_SIZE);
+	DBG("reserving %llx -> %llx for KASAN",
+	    kasan_shadow_start, top_phys_addr);
+	memblock_reserve(kasan_shadow_start,
+			 top_phys_addr - kasan_shadow_start);
+#endif
 }
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
index 6577897673dd..ff8143ba1e4d 100644
--- a/arch/powerpc/mm/kasan/Makefile
+++ b/arch/powerpc/mm/kasan/Makefile
@@ -3,3 +3,4 @@
 KASAN_SANITIZE := n
 
 obj-$(CONFIG_PPC32)           += kasan_init_32.o
+obj-$(CONFIG_PPC_BOOK3S_64)   += kasan_init_book3s_64.o
diff --git a/arch/powerpc/mm/kasan/kasan_init_book3s_64.c b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c
new file mode 100644
index 000000000000..fafda3d5e9a3
--- /dev/null
+++ b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KASAN for 64-bit Book3S powerpc
+ *
+ * Copyright (C) 2019 IBM Corporation
+ * Author: Daniel Axtens <dja@axtens.net>
+ */
+
+#define DISABLE_BRANCH_PROFILING
+
+#include <linux/kasan.h>
+#include <linux/printk.h>
+#include <linux/sched/task.h>
+#include <asm/pgalloc.h>
+
+unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
+
+pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE] __page_aligned_bss;
+pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD] __page_aligned_bss;
+pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD] __page_aligned_bss;
+p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
+
+void __init kasan_init(void)
+{
+	int i;
+	void *k_start = kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START);
+	void *k_end = kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END);
+
+	unsigned long pte_val = __pa(kasan_early_shadow_page)
+					| pgprot_val(PAGE_KERNEL) | _PAGE_PTE;
+
+	if (!early_radix_enabled())
+		panic("KASAN requires radix!");
+
+	for (i = 0; i < PTRS_PER_PTE; i++)
+		kasan_early_shadow_pte[i] = __pte(pte_val);
+
+	for (i = 0; i < PTRS_PER_PMD; i++)
+		pmd_populate_kernel(&init_mm, &kasan_early_shadow_pmd[i],
+				    kasan_early_shadow_pte);
+
+	for (i = 0; i < PTRS_PER_PUD; i++)
+		pud_populate(&init_mm, &kasan_early_shadow_pud[i],
+			     kasan_early_shadow_pmd);
+
+
+	memset(kasan_mem_to_shadow((void *)PAGE_OFFSET), KASAN_SHADOW_INIT,
+		KASAN_SHADOW_SIZE);
+
+	kasan_populate_early_shadow(
+		kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START),
+		kasan_mem_to_shadow((void *)RADIX_VMALLOC_START));
+
+	/* leave a hole here for vmalloc */
+
+	kasan_populate_early_shadow(
+		kasan_mem_to_shadow((void *)RADIX_VMALLOC_END),
+		kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END));
+
+	flush_tlb_kernel_range((unsigned long)k_start, (unsigned long)k_end);
+
+	/* mark early shadow region as RO and wipe */
+	for (i = 0; i < PTRS_PER_PTE; i++)
+		__set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page,
+			&kasan_early_shadow_pte[i],
+			pfn_pte(virt_to_pfn(kasan_early_shadow_page),
+			__pgprot(_PAGE_PTE | _PAGE_KERNEL_RO | _PAGE_BASE)),
+			0);
+	memset(kasan_early_shadow_page, 0, PAGE_SIZE);
+
+	kasan_init_tags();
+
+	/* Enable error messages */
+	init_task.kasan_depth = 0;
+	pr_info("KASAN init done (64-bit Book3S heavyweight mode)\n");
+}
-- 
2.20.1



* Re: [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops
  2019-08-06 23:38 ` [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops Daniel Axtens
@ 2019-08-07 14:58   ` Christophe Leroy
  0 siblings, 0 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 14:58 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



On 07/08/2019 at 01:38, Daniel Axtens wrote:
> Currently bitops-instrumented.h assumes that the architecture provides
> both the atomic and non-atomic versions of the bitops (e.g. both
> set_bit and __set_bit). This is true on x86, but is not always true:
> there is a generic bitops/non-atomic.h header that provides generic
> non-atomic versions. powerpc uses this generic version, so it does
> not have its own __set_bit that could be renamed to arch___set_bit.
> 
> Rearrange bitops-instrumented.h. As operations in bitops/non-atomic.h
> will already be instrumented (they use regular memory accesses), put
> the instrumenting wrappers for them behind an ifdef. Only include
> these instrumentation wrappers if non-atomic.h has not been included.

What about moving and splitting bitops-instrumented.h into:
bitops/atomic-instrumented.h
bitops/non-atomic-instrumented.h
bitops/lock-instrumented.h

I think that would be cleaner than hacking the file with the include
guards of another header file (is that method used anywhere else in
header files?)

Christophe

> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>   include/asm-generic/bitops-instrumented.h | 144 ++++++++++++----------
>   1 file changed, 76 insertions(+), 68 deletions(-)
> 
> diff --git a/include/asm-generic/bitops-instrumented.h b/include/asm-generic/bitops-instrumented.h
> index ddd1c6d9d8db..2fe8f7e12a11 100644
> --- a/include/asm-generic/bitops-instrumented.h
> +++ b/include/asm-generic/bitops-instrumented.h
> @@ -29,21 +29,6 @@ static inline void set_bit(long nr, volatile unsigned long *addr)
>   	arch_set_bit(nr, addr);
>   }
>   
> -/**
> - * __set_bit - Set a bit in memory
> - * @nr: the bit to set
> - * @addr: the address to start counting from
> - *
> - * Unlike set_bit(), this function is non-atomic. If it is called on the same
> - * region of memory concurrently, the effect may be that only one operation
> - * succeeds.
> - */
> -static inline void __set_bit(long nr, volatile unsigned long *addr)
> -{
> -	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	arch___set_bit(nr, addr);
> -}
> -
>   /**
>    * clear_bit - Clears a bit in memory
>    * @nr: Bit to clear
> @@ -57,21 +42,6 @@ static inline void clear_bit(long nr, volatile unsigned long *addr)
>   	arch_clear_bit(nr, addr);
>   }
>   
> -/**
> - * __clear_bit - Clears a bit in memory
> - * @nr: the bit to clear
> - * @addr: the address to start counting from
> - *
> - * Unlike clear_bit(), this function is non-atomic. If it is called on the same
> - * region of memory concurrently, the effect may be that only one operation
> - * succeeds.
> - */
> -static inline void __clear_bit(long nr, volatile unsigned long *addr)
> -{
> -	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	arch___clear_bit(nr, addr);
> -}
> -
>   /**
>    * clear_bit_unlock - Clear a bit in memory, for unlock
>    * @nr: the bit to set
> @@ -116,21 +86,6 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
>   	arch_change_bit(nr, addr);
>   }
>   
> -/**
> - * __change_bit - Toggle a bit in memory
> - * @nr: the bit to change
> - * @addr: the address to start counting from
> - *
> - * Unlike change_bit(), this function is non-atomic. If it is called on the same
> - * region of memory concurrently, the effect may be that only one operation
> - * succeeds.
> - */
> -static inline void __change_bit(long nr, volatile unsigned long *addr)
> -{
> -	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	arch___change_bit(nr, addr);
> -}
> -
>   /**
>    * test_and_set_bit - Set a bit and return its old value
>    * @nr: Bit to set
> @@ -144,20 +99,6 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
>   	return arch_test_and_set_bit(nr, addr);
>   }
>   
> -/**
> - * __test_and_set_bit - Set a bit and return its old value
> - * @nr: Bit to set
> - * @addr: Address to count from
> - *
> - * This operation is non-atomic. If two instances of this operation race, one
> - * can appear to succeed but actually fail.
> - */
> -static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
> -{
> -	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	return arch___test_and_set_bit(nr, addr);
> -}
> -
>   /**
>    * test_and_set_bit_lock - Set a bit and return its old value, for lock
>    * @nr: Bit to set
> @@ -187,30 +128,96 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
>   }
>   
>   /**
> - * __test_and_clear_bit - Clear a bit and return its old value
> - * @nr: Bit to clear
> + * test_and_change_bit - Change a bit and return its old value
> + * @nr: Bit to change
> + * @addr: Address to count from
> + *
> + * This is an atomic fully-ordered operation (implied full memory barrier).
> + */
> +static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
> +{
> +	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> +	return arch_test_and_change_bit(nr, addr);
> +}
> +
> +/*
> + * If the arch is using the generic non-atomic bit ops, they are already
> + * instrumented, and we don't need to create wrappers. Only wrap if we
> + * haven't included that header.
> + */
> +#ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
> +
> +/**
> + * __set_bit - Set a bit in memory
> + * @nr: the bit to set
> + * @addr: the address to start counting from
> + *
> + * Unlike set_bit(), this function is non-atomic. If it is called on the same
> + * region of memory concurrently, the effect may be that only one operation
> + * succeeds.
> + */
> +static inline void __set_bit(long nr, volatile unsigned long *addr)
> +{
> +	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> +	arch___set_bit(nr, addr);
> +}
> +
> +/**
> + * __clear_bit - Clears a bit in memory
> + * @nr: the bit to clear
> + * @addr: the address to start counting from
> + *
> + * Unlike clear_bit(), this function is non-atomic. If it is called on the same
> + * region of memory concurrently, the effect may be that only one operation
> + * succeeds.
> + */
> +static inline void __clear_bit(long nr, volatile unsigned long *addr)
> +{
> +	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> +	arch___clear_bit(nr, addr);
> +}
> +
> +/**
> + * __change_bit - Toggle a bit in memory
> + * @nr: the bit to change
> + * @addr: the address to start counting from
> + *
> + * Unlike change_bit(), this function is non-atomic. If it is called on the same
> + * region of memory concurrently, the effect may be that only one operation
> + * succeeds.
> + */
> +static inline void __change_bit(long nr, volatile unsigned long *addr)
> +{
> +	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> +	arch___change_bit(nr, addr);
> +}
> +
> +/**
> + * __test_and_set_bit - Set a bit and return its old value
> + * @nr: Bit to set
>    * @addr: Address to count from
>    *
>    * This operation is non-atomic. If two instances of this operation race, one
>    * can appear to succeed but actually fail.
>    */
> -static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
> +static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
>   {
>   	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	return arch___test_and_clear_bit(nr, addr);
> +	return arch___test_and_set_bit(nr, addr);
>   }
>   
>   /**
> - * test_and_change_bit - Change a bit and return its old value
> - * @nr: Bit to change
> + * __test_and_clear_bit - Clear a bit and return its old value
> + * @nr: Bit to clear
>    * @addr: Address to count from
>    *
> - * This is an atomic fully-ordered operation (implied full memory barrier).
> + * This operation is non-atomic. If two instances of this operation race, one
> + * can appear to succeed but actually fail.
>    */
> -static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
> +static inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
>   {
>   	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
> -	return arch_test_and_change_bit(nr, addr);
> +	return arch___test_and_clear_bit(nr, addr);
>   }
>   
>   /**
> @@ -237,6 +244,7 @@ static inline bool test_bit(long nr, const volatile unsigned long *addr)
>   	kasan_check_read(addr + BIT_WORD(nr), sizeof(long));
>   	return arch_test_bit(nr, addr);
>   }
> +#endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
>   
>   #if defined(arch_clear_bit_unlock_is_negative_byte)
>   /**
> 


* Re: [PATCH 3/4] powerpc: support KASAN instrumentation of bitops
  2019-08-06 23:38 ` [PATCH 3/4] powerpc: support KASAN instrumentation of bitops Daniel Axtens
@ 2019-08-07 15:01   ` Christophe Leroy
  0 siblings, 0 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 15:01 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora
  Cc: linuxppc-dev, Nicholas Piggin, kasan-dev



On 07/08/2019 at 01:38, Daniel Axtens wrote:
> In KASAN development I noticed that the powerpc-specific bitops
> were not being picked up by the KASAN test suite.
> 
> Instrumentation is done via the bitops-instrumented.h header. It
> requires that arch-specific versions of bitop functions are renamed
> to arch_*. Do this renaming.
> 
> For clear_bit_unlock_is_negative_byte, the current implementation
> uses the PG_waiters constant. This works because it's a preprocessor
> macro - so it's only actually evaluated in contexts where PG_waiters
> is defined. With instrumentation, however, it becomes a static inline
> function, and all of a sudden we need the actual value of PG_waiters.
> Because of the order of header includes, it's not available and we
> fail to compile. Instead, manually specify that we care about bit 7.
> This is still correct: bit 7 is the bit that would mark a negative
> byte, but it does obscure the origin a little bit.
> 
> Cc: Nicholas Piggin <npiggin@gmail.com> # clear_bit_unlock_negative_byte
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> ---
>   arch/powerpc/include/asm/bitops.h | 25 ++++++++++++++-----------
>   1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
> index 603aed229af7..19dc16e62e6a 100644
> --- a/arch/powerpc/include/asm/bitops.h
> +++ b/arch/powerpc/include/asm/bitops.h
> @@ -86,22 +86,22 @@ DEFINE_BITOP(clear_bits, andc, "")
>   DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)
>   DEFINE_BITOP(change_bits, xor, "")
>   
> -static __inline__ void set_bit(int nr, volatile unsigned long *addr)
> +static __inline__ void arch_set_bit(int nr, volatile unsigned long *addr)
>   {
>   	set_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
>   }
>   
> -static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
> +static __inline__ void arch_clear_bit(int nr, volatile unsigned long *addr)
>   {
>   	clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
>   }
>   
> -static __inline__ void clear_bit_unlock(int nr, volatile unsigned long *addr)
> +static __inline__ void arch_clear_bit_unlock(int nr, volatile unsigned long *addr)
>   {
>   	clear_bits_unlock(BIT_MASK(nr), addr + BIT_WORD(nr));
>   }
>   
> -static __inline__ void change_bit(int nr, volatile unsigned long *addr)
> +static __inline__ void arch_change_bit(int nr, volatile unsigned long *addr)
>   {
>   	change_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
>   }
> @@ -138,26 +138,26 @@ DEFINE_TESTOP(test_and_clear_bits, andc, PPC_ATOMIC_ENTRY_BARRIER,
>   DEFINE_TESTOP(test_and_change_bits, xor, PPC_ATOMIC_ENTRY_BARRIER,
>   	      PPC_ATOMIC_EXIT_BARRIER, 0)
>   
> -static __inline__ int test_and_set_bit(unsigned long nr,
> +static __inline__ int arch_test_and_set_bit(unsigned long nr,
>   				       volatile unsigned long *addr)
>   {
>   	return test_and_set_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
>   }
>   
> -static __inline__ int test_and_set_bit_lock(unsigned long nr,
> +static __inline__ int arch_test_and_set_bit_lock(unsigned long nr,
>   				       volatile unsigned long *addr)
>   {
>   	return test_and_set_bits_lock(BIT_MASK(nr),
>   				addr + BIT_WORD(nr)) != 0;
>   }
>   
> -static __inline__ int test_and_clear_bit(unsigned long nr,
> +static __inline__ int arch_test_and_clear_bit(unsigned long nr,
>   					 volatile unsigned long *addr)
>   {
>   	return test_and_clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
>   }
>   
> -static __inline__ int test_and_change_bit(unsigned long nr,
> +static __inline__ int arch_test_and_change_bit(unsigned long nr,
>   					  volatile unsigned long *addr)
>   {
>   	return test_and_change_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
> @@ -186,14 +186,14 @@ static __inline__ unsigned long clear_bit_unlock_return_word(int nr,
>   }
>   
>   /* This is a special function for mm/filemap.c */
> -#define clear_bit_unlock_is_negative_byte(nr, addr)			\
> -	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(PG_waiters))
> +#define arch_clear_bit_unlock_is_negative_byte(nr, addr)		\
> +	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(7))

Maybe add a comment reminding that 7 is PG_waiters ?
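
Something like this, perhaps (just a sketch):

/* Bit 7 == PG_waiters; hardcoded because of header include ordering */
#define arch_clear_bit_unlock_is_negative_byte(nr, addr)		\
	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(7))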

Christophe

>   
>   #endif /* CONFIG_PPC64 */
>   
>   #include <asm-generic/bitops/non-atomic.h>
>   
> -static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr)
> +static __inline__ void arch___clear_bit_unlock(int nr, volatile unsigned long *addr)
>   {
>   	__asm__ __volatile__(PPC_RELEASE_BARRIER "" ::: "memory");
>   	__clear_bit(nr, addr);
> @@ -239,6 +239,9 @@ unsigned long __arch_hweight64(__u64 w);
>   
>   #include <asm-generic/bitops/find.h>
>   
> +/* wrappers that deal with KASAN instrumentation */
> +#include <asm-generic/bitops-instrumented.h>
> +
>   /* Little-endian versions */
>   #include <asm-generic/bitops/le.h>
>   
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] kasan: allow arches to provide their own early shadow setup
  2019-08-06 23:38 ` [PATCH 1/4] kasan: allow arches to provide their own early shadow setup Daniel Axtens
@ 2019-08-07 15:14   ` Christophe Leroy
  2019-08-07 20:19     ` Christophe Leroy
  2019-12-10  4:50     ` Daniel Axtens
  0 siblings, 2 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 15:14 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
> powerpc supports several different MMUs. In particular, book3s
> machines support both a hash-table based MMU and a radix MMU.
> These MMUs support different numbers of entries per directory
> level: the PTRS_PER_* defines evaluate to variables, not constants.
> This leads to compiler errors as global variables must have constant
> sizes.
> 
> Allow architectures to manage their own early shadow variables so we
> can work around this on powerpc.

This seems rather strange to move the early shadow tables out of 
mm/kasan/init.c although they are still used there.

What about doing for all what is already done for 
kasan_early_shadow_p4d[], in extenso define constant max sizes 
MAX_PTRS_PER_PTE, MAX_PTRS_PER_PMD and MAX_PTRS_PER_PUD ?

With a set of the following, it would remain transparent for other arches.
#ifndef MAX_PTRS_PER_PXX
#define MAX_PTRS_PER_PXX PTRS_PER_PXX
#endif

Then you would just need to do the following for Radix:

#define MAX_PTRS_PER_PTE		(1 << RADIX_PTE_INDEX_SIZE)
#define MAX_PTRS_PER_PMD		(1 << RADIX_PMD_INDEX_SIZE)
#define MAX_PTRS_PER_PUD		(1 << RADIX_PUD_INDEX_SIZE)
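
That way the early shadow arrays could stay in mm/kasan/init.c and just be
sized with the maxima, along the lines of:

pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE] __page_aligned_bss;
pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD] __page_aligned_bss;
pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD] __page_aligned_bss;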


For the kasan_early_shadow_page[], I don't think we have variable 
PAGE_SIZE, have we ?

Christophe


> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> 
> ---
> Changes from RFC:
> 
>   - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
>     a random #define to a config option selected when building for
>     ppc64 book3s
> ---
>   include/linux/kasan.h |  2 ++
>   lib/Kconfig.kasan     |  3 +++
>   mm/kasan/init.c       | 10 ++++++++++
>   3 files changed, 15 insertions(+)
> 
> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index ec81113fcee4..15933da52a3e 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -14,11 +14,13 @@ struct task_struct;
>   #include <asm/kasan.h>
>   #include <asm/pgtable.h>
>   
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
>   extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
>   extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
>   extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
>   extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
> +#endif
>   
>   int kasan_populate_early_shadow(const void *shadow_start,
>   				const void *shadow_end);
> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
> index a320dc2e9317..0621a0129c04 100644
> --- a/lib/Kconfig.kasan
> +++ b/lib/Kconfig.kasan
> @@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
>   config	HAVE_ARCH_KASAN_VMALLOC
>   	bool
>   
> +config ARCH_HAS_KASAN_EARLY_SHADOW
> +	bool
> +
>   config CC_HAS_KASAN_GENERIC
>   	def_bool $(cc-option, -fsanitize=kernel-address)
>   
> diff --git a/mm/kasan/init.c b/mm/kasan/init.c
> index ce45c491ebcd..7ef2b87a7988 100644
> --- a/mm/kasan/init.c
> +++ b/mm/kasan/init.c
> @@ -31,10 +31,14 @@
>    *   - Latter it reused it as zero shadow to cover large ranges of memory
>    *     that allowed to access, but not handled by kasan (vmalloc/vmemmap ...).
>    */
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
> +#endif
>   
>   #if CONFIG_PGTABLE_LEVELS > 4
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
> +#endif
>   static inline bool kasan_p4d_table(pgd_t pgd)
>   {
>   	return pgd_page(pgd) == virt_to_page(lm_alias(kasan_early_shadow_p4d));
> @@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
>   }
>   #endif
>   #if CONFIG_PGTABLE_LEVELS > 3
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
> +#endif
>   static inline bool kasan_pud_table(p4d_t p4d)
>   {
>   	return p4d_page(p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud));
> @@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
>   }
>   #endif
>   #if CONFIG_PGTABLE_LEVELS > 2
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
> +#endif
>   static inline bool kasan_pmd_table(pud_t pud)
>   {
>   	return pud_page(pud) == virt_to_page(lm_alias(kasan_early_shadow_pmd));
> @@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
>   	return false;
>   }
>   #endif
> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>   pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
> +#endif
>   
>   static inline bool kasan_pte_table(pmd_t pmd)
>   {
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix
  2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
                   ` (3 preceding siblings ...)
  2019-08-06 23:38 ` [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support Daniel Axtens
@ 2019-08-07 15:45 ` Christophe Leroy
  2019-08-16  4:11   ` Daniel Axtens
  4 siblings, 1 reply; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 15:45 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
> Building on the work of Christophe, Aneesh and Balbir, I've ported
> KASAN to 64-bit Book3S kernels running on the Radix MMU.
> 
> It builds on top Christophe's work on 32bit. It also builds on my
> generic KASAN_VMALLOC series, available at:
> https://patchwork.kernel.org/project/linux-mm/list/?series=153209

Would be good to send that one to the powerpc list as well.

> 
> This provides full inline instrumentation on radix, but does require
> that you be able to specify the amount of memory on the system at
> compile time. More details in patch 4.
> 
> Notable changes from the RFC:
> 
>   - I've dropped Book3E 64-bit for now.
> 
>   - Now instead of hacking into the KASAN core to disable module
>     allocations, we use KASAN_VMALLOC.
> 
>   - More testing, including on real hardware. This revealed that
>     discontiguous memory is a bit of a headache, at the moment we
>     must disable memory not contiguous from 0.
>     
>   - Update to deal with kasan bitops instrumentation that landed
>     between RFC and now.

This is rather independent and also applies to PPC32. Could it be a 
separate series that Michael could apply earlier ?

Christophe

> 
>   - Documentation!
> 
>   - Various cleanups and tweaks.
> 
> I am getting occasional problems on boot of real hardware where it
> seems vmalloc space mappings don't get installed in time. (We get a
> BUG that memory is not accessible, but by the time we hit xmon the
> memory then is accessible!) It happens once every few boots. I haven't
> yet been able to figure out what is happening and why. I'm going to
> look in to it, but I think the patches are in good enough shape to
> review while I work on it.
> 
> Regards,
> Daniel
> 
> Daniel Axtens (4):
>    kasan: allow arches to provide their own early shadow setup
>    kasan: support instrumented bitops with generic non-atomic bitops
>    powerpc: support KASAN instrumentation of bitops
>    powerpc: Book3S 64-bit "heavyweight" KASAN support
> 
>   Documentation/dev-tools/kasan.rst            |   7 +-
>   Documentation/powerpc/kasan.txt              | 111 ++++++++++++++
>   arch/powerpc/Kconfig                         |   4 +
>   arch/powerpc/Kconfig.debug                   |  21 +++
>   arch/powerpc/Makefile                        |   7 +
>   arch/powerpc/include/asm/bitops.h            |  25 ++--
>   arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
>   arch/powerpc/include/asm/kasan.h             |  35 ++++-
>   arch/powerpc/kernel/process.c                |   8 ++
>   arch/powerpc/kernel/prom.c                   |  57 +++++++-
>   arch/powerpc/mm/kasan/Makefile               |   1 +
>   arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 ++++++++++
>   include/asm-generic/bitops-instrumented.h    | 144 ++++++++++---------
>   include/linux/kasan.h                        |   2 +
>   lib/Kconfig.kasan                            |   3 +
>   mm/kasan/init.c                              |  10 ++
>   16 files changed, 431 insertions(+), 85 deletions(-)
>   create mode 100644 Documentation/powerpc/kasan.txt
>   create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-08-06 23:38 ` [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support Daniel Axtens
@ 2019-08-07 16:34   ` Christophe Leroy
  2019-08-09 15:35     ` Christophe Leroy
  2019-12-10  5:10     ` Daniel Axtens
  0 siblings, 2 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 16:34 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
> KASAN support on powerpc64 is interesting:
> 
>   - We want to be able to support inline instrumentation so as to be
>     able to catch global and stack issues.
> 
>   - We run a lot of code at boot in real mode. This includes stuff like
>     printk(), so it's not feasible to just disable instrumentation
>     around it.

Have you definitely given up the idea of doing a standard implementation 
of KASAN like other 64 bits arches have done ?

Isn't it possible to setup an early 1:1 mapping and go in virtual mode 
earlier ? What is so different between book3s64 and book3e64 ?
On book3e64, we've been able to setup KASAN before printing anything 
(except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?

> 
>     [For those not immersed in ppc64, in real mode, the top nibble or
>     2 bits (depending on radix/hash mmu) of the address is ignored. To
>     make things work, we put the linear mapping at
>     0xc000000000000000. This means that a pointer to part of the linear
>     mapping will work both in real mode, where it will be interpreted
>     as a physical address of the form 0x000..., and out of real mode,
>     where it will go via the linear mapping.]
> 
>   - Inline instrumentation requires a fixed offset.
> 
>   - Because of our running things in real mode, the offset has to
>     point to valid memory both in and out of real mode.
> 
> This makes finding somewhere to put the KASAN shadow region a bit fun.
> 
> One approach is just to give up on inline instrumentation and override
> the address->shadow calculation. This way we can delay all checking
> until after we get everything set up to our satisfaction. However,
> we'd really like to do better.
> 
> What we can do - if we know _at compile time_ how much contiguous
> physical memory we have - is to set aside the top 1/8th of the memory
> and use that. This is a big hammer (hence the "heavyweight" name) and
> comes with 3 big consequences:
> 
>   - kernels will simply fail to boot on machines with less memory than
>     specified when compiling.
> 
>   - kernels running on machines with more memory than specified when
>     compiling will simply ignore the extra memory.
> 
>   - there's no nice way to handle physically discontiguous memory, so
>     you are restricted to the first physical memory block.
> 
> If you can bear all this, you get pretty full support for KASAN.
> 
> Despite the limitations, it can still find bugs,
> e.g. http://patchwork.ozlabs.org/patch/1103775/
> 
> The current implementation is Radix only. I am open to extending
> it to hash at some point but I don't think it should hold up v1.
> 
> Massive thanks to mpe, who had the idea for the initial design.
> 
> Signed-off-by: Daniel Axtens <dja@axtens.net>
> 
> ---
> Changes since the rfc:
> 
>   - Boots real and virtual hardware, kvm works.
> 
>   - disabled reporting when we're checking the stack for exception
>     frames. The behaviour isn't wrong, just incompatible with KASAN.

Does this apply to / impact PPC32 at all ?

> 
>   - Documentation!
> 
>   - Dropped old module stuff in favour of KASAN_VMALLOC.

You said in the cover that this is done to avoid having to split modules 
out of VMALLOC area. Would it be an issue to perform that split ?
I can understand it is not easy on 32 bits because vmalloc space is 
rather small, but on 64 bits don't we have enough virtual space to 
comfortably split modules out of vmalloc ? The 64 bits already splits 
ioremap away from vmalloc whereas 32 bits have them merged too.

> 
> The bugs with ftrace and kuap were due to kernel bloat pushing
> prom_init calls to be done via the plt. Because we did not have
> a relocatable kernel, and they are done very early, this caused
> everything to explode. Compile with CONFIG_RELOCATABLE!
> 
> ---
>   Documentation/dev-tools/kasan.rst            |   7 +-
>   Documentation/powerpc/kasan.txt              | 111 +++++++++++++++++++
>   arch/powerpc/Kconfig                         |   4 +
>   arch/powerpc/Kconfig.debug                   |  21 ++++
>   arch/powerpc/Makefile                        |   7 ++
>   arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
>   arch/powerpc/include/asm/kasan.h             |  35 +++++-
>   arch/powerpc/kernel/process.c                |   8 ++
>   arch/powerpc/kernel/prom.c                   |  57 +++++++++-
>   arch/powerpc/mm/kasan/Makefile               |   1 +
>   arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 +++++++++++++
>   11 files changed, 326 insertions(+), 6 deletions(-)
>   create mode 100644 Documentation/powerpc/kasan.txt
>   create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c
> 
> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> index 35fda484a672..48d3b669e577 100644
> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -22,7 +22,9 @@ global variables yet.
>   Tag-based KASAN is only supported in Clang and requires version 7.0.0 or later.
>   
>   Currently generic KASAN is supported for the x86_64, arm64, xtensa and s390
> -architectures, and tag-based KASAN is supported only for arm64.
> +architectures. It is also supported on powerpc for 32-bit kernels, and for
> +64-bit kernels running under the radix MMU. Tag-based KASAN is supported only
> +for arm64.

Could the 32 bits documentation stuff be a separate patch ?

>   
>   Usage
>   -----
> @@ -252,7 +254,8 @@ CONFIG_KASAN_VMALLOC
>   ~~~~~~~~~~~~~~~~~~~~
>   
>   With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
> -cost of greater memory usage. Currently this is only supported on x86.
> +cost of greater memory usage. Currently this is optional on x86, and
> +required on 64-bit powerpc.
>   
>   This works by hooking into vmalloc and vmap, and dynamically
>   allocating real shadow memory to back the mappings.
> diff --git a/Documentation/powerpc/kasan.txt b/Documentation/powerpc/kasan.txt
> new file mode 100644
> index 000000000000..a5592454353b
> --- /dev/null
> +++ b/Documentation/powerpc/kasan.txt
> @@ -0,0 +1,111 @@
> +KASAN is supported on powerpc on 32-bit and 64-bit Radix only.
> +
> +32 bit support
> +==============
> +
> +KASAN is supported on both hash and nohash MMUs on 32-bit.
> +
> +The shadow area sits at the top of the kernel virtual memory space above the
> +fixmap area and occupies one eighth of the total kernel virtual memory space.
> +
> +Instrumentation of the vmalloc area is not currently supported, but modules are.
> +
> +64 bit support
> +==============
> +
> +Currently, only the radix MMU is supported. There have been versions for Book3E
> +processors floating around on the mailing list, but nothing has been merged.
> +
> +KASAN support on Book3S is interesting:

And support for others is not interesting ? :)

> +
> + - We want to be able to support inline instrumentation so as to be able to
> +   catch global and stack issues.
> +
> + - Inline instrumentation requires a fixed offset.
> +
> + - We run a lot of code at boot in real mode. This includes stuff like printk(),
> +   so it's not feasible to just disable instrumentation around it.
> +
> + - Because of our running things in real mode, the offset has to point to valid
> +   memory both in and out of real mode.
> +
> +This makes finding somewhere to put the KASAN shadow region a bit fun.
> +
> +One approach is just to give up on inline instrumentation. This way we can delay
> +all checks until after we get everything set up to our satisfaction. However,
> +we'd really like to do better.
> +
> +If we know _at compile time_ how much contiguous physical memory we have, we can
> +set aside the top 1/8th of physical memory and use that. This is a big hammer
> +and comes with 2 big consequences:
> +
> + - kernels will simply fail to boot on machines with less memory than specified
> +   when compiling.
> +
> + - kernels running on machines with more memory than specified when compiling
> +   will simply ignore the extra memory.
> +
> +If you can live with this, you get full support for KASAN.
> +
> +Tips
> +----
> +
> + - Compile with CONFIG_RELOCATABLE.
> +
> +   In development, we found boot hangs when building with ftrace and KUAP. These
> +   ended up being due to kernel bloat pushing prom_init calls to be done via the
> +   PLT. Because we did not have a relocatable kernel, and they are done very
> +   early, this caused us to jump off into somewhere invalid. Enabling relocation
> +   fixes this.
> +
> +NUMA/discontiguous physical memory
> +----------------------------------
> +
> +We currently cannot really deal with discontiguous physical memory. You are
> +restricted to the physical memory that is contiguous from physical address zero,
> +and must specify the size of that memory, not total memory, when configuring
> +your kernel.
> +
> +Discontiguous memory can occur when you have a machine with memory spread across
> +multiple nodes. For example, on a Talos II with 64GB of RAM:
> +
> + - 32GB runs from 0x0 to 0x0000_0008_0000_0000,
> + - then there's a gap,
> + - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_0000_0000.
> +
> +This can create _significant_ issues:
> +
> + - If we try to treat the machine as having 64GB of _contiguous_ RAM, we would
> +   assume that ran from 0x0 to 0x0000_0010_0000_0000. We'd then reserve the last
> +   1/8th - 0x0000_000e_0000_0000 to 0x0000_0010_0000_0000 as the shadow
> +   region. But when we try to access any of that, we'll try to access pages that
> +   are not physically present.
> +
> + - If we try to base the shadow region size on the top address, we'll need to
> +   reserve 0x2008_0000_0000 / 8 = 0x0401_0000_0000 bytes = 4100 GB of memory,
> +   which will clearly not work on a system with 64GB of RAM.
> +
> +Therefore, you are restricted to the memory in the node starting at 0x0. For
> +this system, that's 32GB. If you specify a contiguous physical memory size
> +greater than the size of the first contiguous region of memory, the system will
> +be unable to boot or even print an error message warning you.

Can't we restrict ourselves to the first block at startup while we are in 
real mode, but then support the entire memory once we have switched on the 
MMU ?

> +
> +You can determine the layout of your system's memory by observing the messages
> +that the Radix MMU prints on boot. The Talos II discussed earlier has:
> +
> +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
> +radix-mmu: Mapped 0x0000000040000000-0x0000000800000000 with 1.00 GiB pages
> +radix-mmu: Mapped 0x0000200000000000-0x0000200800000000 with 1.00 GiB pages
> +
> +As discussed, you'd configure this system for 32768 MB.
> +
> +Another system prints:
> +
> +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
> +radix-mmu: Mapped 0x0000000040000000-0x0000002000000000 with 1.00 GiB pages
> +radix-mmu: Mapped 0x0000200000000000-0x0000202000000000 with 1.00 GiB pages
> +
> +This machine has more memory: 0x0000_0040_0000_0000 total, but only
> +0x0000_0020_0000_0000 is physically contiguous from zero, so we'd configure the
> +kernel for 131072 MB of physically contiguous memory.
> +
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index d8dcd8820369..3d6deee100e2 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -171,6 +171,10 @@ config PPC
>   	select HAVE_ARCH_HUGE_VMAP		if PPC_BOOK3S_64 && PPC_RADIX_MMU
>   	select HAVE_ARCH_JUMP_LABEL
>   	select HAVE_ARCH_KASAN			if PPC32
> +	select HAVE_ARCH_KASAN			if PPC_BOOK3S_64 && PPC_RADIX_MMU
> +	select ARCH_HAS_KASAN_EARLY_SHADOW	if PPC_BOOK3S_64

See comment on patch 1, would be better to avoid that.

> +	select HAVE_ARCH_KASAN_VMALLOC		if PPC_BOOK3S_64
> +	select KASAN_VMALLOC			if KASAN
>   	select HAVE_ARCH_KGDB
>   	select HAVE_ARCH_MMAP_RND_BITS
>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
> index c59920920ddc..2d6fb7b1ba59 100644
> --- a/arch/powerpc/Kconfig.debug
> +++ b/arch/powerpc/Kconfig.debug
> @@ -394,6 +394,27 @@ config PPC_FAST_ENDIAN_SWITCH
>           help
>   	  If you're unsure what this is, say N.
>   
> +config PHYS_MEM_SIZE_FOR_KASAN
> +	int "Contiguous physical memory size for KASAN (MB)"
> +	depends on 

Drop the depend and maybe do:
	int "Contiguous physical memory size for KASAN (MB)" if KASAN && 
PPC_BOOK3S_64
	default 0

Will allow you to not enclose it into #ifdef KASAN in the code below

> +	help
> +
> +	  To get inline instrumentation support for KASAN on 64-bit Book3S
> +	  machines, you need to know how much contiguous physical memory your
> +	  system has. A shadow offset will be calculated based on this figure,
> +	  which will be compiled in to the kernel. KASAN will use this offset
> +	  to access its shadow region, which is used to verify memory accesses.
> +
> +	  If you attempt to boot on a system with less memory than you specify
> +	  here, your system will fail to boot very early in the process. If you
> +	  boot on a system with more memory than you specify, the extra memory
> +	  will be wasted - it will be reserved and not used.
> +
> +	  For systems with discontiguous blocks of physical memory, specify the
> +	  size of the block starting at 0x0. You can determine this by looking
> +	  at the memory layout info printed to dmesg by the radix MMU code
> +	  early in boot. See Documentation/powerpc/kasan.txt.
> +
>   config KASAN_SHADOW_OFFSET
>   	hex
>   	depends on KASAN
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index c345b79414a9..33e7bba4c8db 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -229,6 +229,13 @@ ifdef CONFIG_476FPE_ERR46
>   		-T $(srctree)/arch/powerpc/platforms/44x/ppc476_modules.lds
>   endif
>   
> +ifdef CONFIG_KASAN
> +ifdef CONFIG_PPC_BOOK3S_64
> +# 0xa800000000000000 = 12105675798371893248
> +KASAN_SHADOW_OFFSET = $(shell echo 7 \* 1024 \* 1024 \* $(CONFIG_PHYS_MEM_SIZE_FOR_KASAN) / 8 + 12105675798371893248 | bc)
> +endif
> +endif
> +
>   # No AltiVec or VSX instructions when building kernel
>   KBUILD_CFLAGS += $(call cc-option,-mno-altivec)
>   KBUILD_CFLAGS += $(call cc-option,-mno-vsx)
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index e04a839cb5b9..4c011cc15e05 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -35,6 +35,11 @@
>   #define RADIX_PMD_SHIFT		(PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
>   #define RADIX_PUD_SHIFT		(RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
>   #define RADIX_PGD_SHIFT		(RADIX_PUD_SHIFT + RADIX_PUD_INDEX_SIZE)
> +
> +#define R_PTRS_PER_PTE		(1 << RADIX_PTE_INDEX_SIZE)
> +#define R_PTRS_PER_PMD		(1 << RADIX_PMD_INDEX_SIZE)
> +#define R_PTRS_PER_PUD		(1 << RADIX_PUD_INDEX_SIZE)
> +

See my suggestion in patch 1.

>   /*
>    * Size of EA range mapped by our pagetables.
>    */
> diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
> index 296e51c2f066..d6b4028c296b 100644
> --- a/arch/powerpc/include/asm/kasan.h
> +++ b/arch/powerpc/include/asm/kasan.h
> @@ -14,13 +14,20 @@
>   
>   #ifndef __ASSEMBLY__
>   
> -#include <asm/page.h>
> +#ifdef CONFIG_KASAN
> +void kasan_init(void);
> +#else
> +static inline void kasan_init(void) { }
> +#endif
>   
>   #define KASAN_SHADOW_SCALE_SHIFT	3
>   
>   #define KASAN_SHADOW_START	(KASAN_SHADOW_OFFSET + \
>   				 (PAGE_OFFSET >> KASAN_SHADOW_SCALE_SHIFT))
>   
> +#ifdef CONFIG_PPC32
> +#include <asm/page.h>
> +
>   #define KASAN_SHADOW_OFFSET	ASM_CONST(CONFIG_KASAN_SHADOW_OFFSET)
>   
>   #define KASAN_SHADOW_END	0UL
> @@ -30,11 +37,33 @@
>   #ifdef CONFIG_KASAN
>   void kasan_early_init(void);
>   void kasan_mmu_init(void);
> -void kasan_init(void);
>   #else
> -static inline void kasan_init(void) { }
>   static inline void kasan_mmu_init(void) { }
>   #endif
> +#endif
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +#include <asm/pgtable.h>
> +
> +/*
> + * The KASAN shadow offset is such that linear map (0xc000...) is shadowed by
> + * the last 8th of linearly mapped physical memory. This way, if the code uses
> + * 0xc addresses throughout, accesses work both in real mode (where the top
> + * 2 bits are ignored) and outside of real mode.
> + */
> +#define KASAN_SHADOW_OFFSET ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
> +				1024 * 1024 * 7 / 8 + 0xa800000000000000UL)

Already calculated in the Makefile, can't we reuse it ?

'X * 1024 * 1024' is usually better understood as 'X << 20' instead.
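
For what it's worth, the two expressions agree; a quick sanity check with bc
for the 32768 MB Talos II example from the documentation:

$ echo 'obase=16; 7 * 1024 * 1024 * 32768 / 8 + 12105675798371893248' | bc
A800000700000000

i.e. KASAN_SHADOW_OFFSET would be 0xa800000700000000 for that configuration.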


> +
> +#define KASAN_SHADOW_SIZE ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
> +				1024 * 1024 * 1 / 8)
> +
> +extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
> +
> +extern pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE];
> +extern pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD];
> +extern pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD];
> +extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
> +#endif /* CONFIG_PPC_BOOK3S_64 */
>   
>   #endif /* __ASSEMBLY */
>   #endif
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 8fc4de0d22b4..31602536e72b 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -2097,7 +2097,14 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
>   		/*
>   		 * See if this is an exception frame.
>   		 * We look for the "regshere" marker in the current frame.
> +		 *
> +		 * KASAN may complain about this. If it is an exception frame,
> +		 * we won't have unpoisoned the stack in asm when we set the
> +		 * exception marker. If it's not an exception frame, who knows
> +		 * how things are laid out - the shadow could be in any state
> +		 * at all. Just disable KASAN reporting for now.
>   		 */
> +		kasan_disable_current();
>   		if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
>   		    && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
>   			struct pt_regs *regs = (struct pt_regs *)
> @@ -2107,6 +2114,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
>   			       regs->trap, (void *)regs->nip, (void *)lr);
>   			firstframe = 1;
>   		}
> +		kasan_enable_current();
>   
>   		sp = newsp;
>   	} while (count++ < kstack_depth_to_print);
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 7159e791a70d..dde5f2896ab6 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -71,6 +71,7 @@ unsigned long tce_alloc_start, tce_alloc_end;
>   u64 ppc64_rma_size;
>   #endif
>   static phys_addr_t first_memblock_size;
> +static phys_addr_t top_phys_addr;
>   static int __initdata boot_cpu_count;
>   
>   static int __init early_parse_mem(char *p)
> @@ -448,6 +449,21 @@ static bool validate_mem_limit(u64 base, u64 *size)
>   {
>   	u64 max_mem = 1UL << (MAX_PHYSMEM_BITS);
>   
> +#ifdef CONFIG_KASAN
> +	/*
> +	 * To handle the NUMA/discontiguous memory case, don't allow a block
> +	 * to be added if it falls completely beyond the configured physical
> +	 * memory.
> +	 *
> +	 * See Documentation/powerpc/kasan.txt
> +	 */
> +	if (base >= (u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * 1024 * 1024) {
> +		pr_warn("KASAN: not adding mem block at %llx (size %llx)",
> +			base, *size);
> +		return false;
> +	}
> +#endif
> +
>   	if (base >= max_mem)
>   		return false;
>   	if ((base + *size) > max_mem)
> @@ -571,8 +587,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>   
>   	/* Add the chunk to the MEMBLOCK list */
>   	if (add_mem_to_memblock) {
> -		if (validate_mem_limit(base, &size))
> +		if (validate_mem_limit(base, &size)) {
>   			memblock_add(base, size);
> +			if (base + size > top_phys_addr)
> +				top_phys_addr = base + size;
> +		}
>   	}
>   }
>   
> @@ -612,6 +631,8 @@ static void __init early_reserve_mem_dt(void)
>   static void __init early_reserve_mem(void)
>   {
>   	__be64 *reserve_map;
> +	phys_addr_t kasan_shadow_start __maybe_unused;
> +	phys_addr_t kasan_memory_size __maybe_unused;

Could we avoid those ugly __maybe_unused ?

>   
>   	reserve_map = (__be64 *)(((unsigned long)initial_boot_params) +
>   			fdt_off_mem_rsvmap(initial_boot_params));
> @@ -650,6 +671,40 @@ static void __init early_reserve_mem(void)
>   		return;
>   	}
>   #endif
> +
> +#if defined(CONFIG_KASAN) && defined(CONFIG_PPC_BOOK3S_64)

Would be better to do the following instead of the #ifdef:

if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
}

This would avoid the __maybe_unused above, and would allow the code to be
validated even when CONFIG_KASAN is not selected.

This would probably require setting a default value for
CONFIG_PHYS_MEM_SIZE_FOR_KASAN when KASAN is not set, or maybe defining an
intermediate constant in some header which evaluates to
CONFIG_PHYS_MEM_SIZE_FOR_KASAN when KASAN is enabled and to 0 when it is not.
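
Rough sketch (the macro name is made up), e.g. in asm/kasan.h:

#ifdef CONFIG_KASAN
#define KASAN_PHYS_MEM_SIZE	((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN << 20)
#else
#define KASAN_PHYS_MEM_SIZE	0
#endif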

> +	kasan_memory_size = ((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN
> +		* 1024 * 1024);
> +
> +	if (top_phys_addr < kasan_memory_size) {
> +		/*
> +		 * We are doomed. Attempts to call e.g. panic() are likely to
> +		 * fail because they call out into instrumented code, which
> +		 * will almost certainly access memory beyond the end of
> +		 * physical memory. Hang here so that at least the NIP points
> +		 * somewhere that will help you debug it if you look at it in
> +		 * qemu.
> +		 */
> +		while (true)
> +			;

A bit gross ?


> +	} else if (top_phys_addr > kasan_memory_size) {
> +		/* print a biiiig warning in hopes people notice */
> +		pr_err("===========================================\n"
> +			"Physical memory exceeds compiled-in maximum!\n"
> +			"This kernel was compiled for KASAN with %u MB physical memory.\n"
> +			"The actual physical memory detected is %llu MB.\n"
> +			"Memory above the compiled limit will not be used!\n"
> +			"===========================================\n",
> +			CONFIG_PHYS_MEM_SIZE_FOR_KASAN,
> +			top_phys_addr / (1024 * 1024));
> +	}
> +
> +	kasan_shadow_start = _ALIGN_DOWN(kasan_memory_size * 7 / 8, PAGE_SIZE);
> +	DBG("reserving %llx -> %llx for KASAN",
> +	    kasan_shadow_start, top_phys_addr);
> +	memblock_reserve(kasan_shadow_start,
> +			 top_phys_addr - kasan_shadow_start);
> +#endif
>   }
>   
>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
> index 6577897673dd..ff8143ba1e4d 100644
> --- a/arch/powerpc/mm/kasan/Makefile
> +++ b/arch/powerpc/mm/kasan/Makefile
> @@ -3,3 +3,4 @@
>   KASAN_SANITIZE := n
>   
>   obj-$(CONFIG_PPC32)           += kasan_init_32.o
> +obj-$(CONFIG_PPC_BOOK3S_64)   += kasan_init_book3s_64.o
> diff --git a/arch/powerpc/mm/kasan/kasan_init_book3s_64.c b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c

In the same spirit as what was done in other mm subdirs, could we rename 
those files to:
init_32.o
init_book3s64.o


> new file mode 100644
> index 000000000000..fafda3d5e9a3
> --- /dev/null
> +++ b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c
> @@ -0,0 +1,76 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * KASAN for 64-bit Book3S powerpc
> + *
> + * Copyright (C) 2019 IBM Corporation
> + * Author: Daniel Axtens <dja@axtens.net>
> + */
> +
> +#define DISABLE_BRANCH_PROFILING
> +
> +#include <linux/kasan.h>
> +#include <linux/printk.h>
> +#include <linux/sched/task.h>
> +#include <asm/pgalloc.h>
> +
> +unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;

What's the difference with what's defined in the generic kasan code, which
you opted out of in the first patch ?

> +
> +pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE] __page_aligned_bss;
> +pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD] __page_aligned_bss;
> +pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD] __page_aligned_bss;
> +p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;

See my suggestion for those in patch 1.

> +
> +void __init kasan_init(void)
> +{
> +	int i;
> +	void *k_start = kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START);
> +	void *k_end = kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END);
> +
> +	unsigned long pte_val = __pa(kasan_early_shadow_page)
> +					| pgprot_val(PAGE_KERNEL) | _PAGE_PTE;
> +
> +	if (!early_radix_enabled())
> +		panic("KASAN requires radix!");
> +
> +	for (i = 0; i < PTRS_PER_PTE; i++)
> +		kasan_early_shadow_pte[i] = __pte(pte_val);

Shouldn't you use __set_pte_at() here ?
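
i.e. something like this (untested), mirroring the __set_pte_at() call at
the end of this function:

	for (i = 0; i < PTRS_PER_PTE; i++)
		__set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page,
			     &kasan_early_shadow_pte[i], __pte(pte_val), 0);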

> +
> +	for (i = 0; i < PTRS_PER_PMD; i++)
> +		pmd_populate_kernel(&init_mm, &kasan_early_shadow_pmd[i],
> +				    kasan_early_shadow_pte);
> +
> +	for (i = 0; i < PTRS_PER_PUD; i++)
> +		pud_populate(&init_mm, &kasan_early_shadow_pud[i],
> +			     kasan_early_shadow_pmd);
> +
> +
> +	memset(kasan_mem_to_shadow((void *)PAGE_OFFSET), KASAN_SHADOW_INIT,
> +		KASAN_SHADOW_SIZE);
> +
> +	kasan_populate_early_shadow(
> +		kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START),
> +		kasan_mem_to_shadow((void *)RADIX_VMALLOC_START));
> +
> +	/* leave a hole here for vmalloc */
> +
> +	kasan_populate_early_shadow(
> +		kasan_mem_to_shadow((void *)RADIX_VMALLOC_END),
> +		kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END));
> +
> +	flush_tlb_kernel_range((unsigned long)k_start, (unsigned long)k_end);
> +
> +	/* mark early shadow region as RO and wipe */
> +	for (i = 0; i < PTRS_PER_PTE; i++)
> +		__set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page,
> +			&kasan_early_shadow_pte[i],
> +			pfn_pte(virt_to_pfn(kasan_early_shadow_page),
> +			__pgprot(_PAGE_PTE | _PAGE_KERNEL_RO | _PAGE_BASE)),

Isn't there an already existing val for the above set of flags ?
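
Maybe something like the below - guessing that PAGE_KERNEL_RO already
covers _PAGE_BASE | _PAGE_KERNEL_RO:

			pfn_pte(virt_to_pfn(kasan_early_shadow_page),
				__pgprot(pgprot_val(PAGE_KERNEL_RO) | _PAGE_PTE)),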

> +			0);
> +	memset(kasan_early_shadow_page, 0, PAGE_SIZE);
> +
> +	kasan_init_tags();

You said in the doc that only arm supports that ...

> +
> +	/* Enable error messages */
> +	init_task.kasan_depth = 0;
> +	pr_info("KASAN init done (64-bit Book3S heavyweight mode)\n");
> +}
> 

Christophe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] kasan: allow arches to provide their own early shadow setup
  2019-08-07 15:14   ` Christophe Leroy
@ 2019-08-07 20:19     ` Christophe Leroy
  2019-12-10  4:50     ` Daniel Axtens
  1 sibling, 0 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-07 20:19 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



Le 07/08/2019 à 17:14, Christophe Leroy a écrit :
> 
> 
> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>> powerpc supports several different MMUs. In particular, book3s
>> machines support both a hash-table based MMU and a radix MMU.
>> These MMUs support different numbers of entries per directory
>> level: the PTRS_PER_* defines evaluate to variables, not constants.
>> This leads to compiler errors as global variables must have constant
>> sizes.
>>
>> Allow architectures to manage their own early shadow variables so we
>> can work around this on powerpc.
> 
> This seems rather strange to move the early shadow tables out of 
> mm/kasan/init.c although they are still used there.
> 
> What about doing for all what is already done for 
> kasan_early_shadow_p4d[], in extenso define constant max sizes 
> MAX_PTRS_PER_PTE, MAX_PTRS_PER_PMD and MAX_PTRS_PER_PUD ?

To illustrate my suggestion, see commit c65e774fb3f6af21 ("x86/mm: Make 
PGDIR_SHIFT and PTRS_PER_P4D variable")

The same principle should apply on all variable powerpc PTRS_PER_XXX.

Christophe

> 
> With a set of the following, it would remain transparent for other arches.
> #ifndef MAX_PTRS_PER_PXX
> #define MAX_PTRS_PER_PXX PTRS_PER_PXX
> #endif
> 
> Then you would just need to do the following for Radix:
> 
> #define MAX_PTRS_PER_PTE        (1 << RADIX_PTE_INDEX_SIZE)
> #define MAX_PTRS_PER_PMD        (1 << RADIX_PMD_INDEX_SIZE)
> #define MAX_PTRS_PER_PUD        (1 << RADIX_PUD_INDEX_SIZE)
> 
> 
> For the kasan_early_shadow_page[], I don't think we have variable 
> PAGE_SIZE, have we ?
> 
> Christophe
> 
> 
>>
>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>>
>> ---
>> Changes from RFC:
>>
>>   - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
>>     a random #define to a config option selected when building for
>>     ppc64 book3s
>> ---
>>   include/linux/kasan.h |  2 ++
>>   lib/Kconfig.kasan     |  3 +++
>>   mm/kasan/init.c       | 10 ++++++++++
>>   3 files changed, 15 insertions(+)
>>
>> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
>> index ec81113fcee4..15933da52a3e 100644
>> --- a/include/linux/kasan.h
>> +++ b/include/linux/kasan.h
>> @@ -14,11 +14,13 @@ struct task_struct;
>>   #include <asm/kasan.h>
>>   #include <asm/pgtable.h>
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
>>   extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
>>   extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
>>   extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
>>   extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>> +#endif
>>   int kasan_populate_early_shadow(const void *shadow_start,
>>                   const void *shadow_end);
>> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
>> index a320dc2e9317..0621a0129c04 100644
>> --- a/lib/Kconfig.kasan
>> +++ b/lib/Kconfig.kasan
>> @@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
>>   config    HAVE_ARCH_KASAN_VMALLOC
>>       bool
>> +config ARCH_HAS_KASAN_EARLY_SHADOW
>> +    bool
>> +
>>   config CC_HAS_KASAN_GENERIC
>>       def_bool $(cc-option, -fsanitize=kernel-address)
>> diff --git a/mm/kasan/init.c b/mm/kasan/init.c
>> index ce45c491ebcd..7ef2b87a7988 100644
>> --- a/mm/kasan/init.c
>> +++ b/mm/kasan/init.c
>> @@ -31,10 +31,14 @@
>>    *   - Latter it reused it as zero shadow to cover large ranges of 
>> memory
>>    *     that allowed to access, but not handled by kasan 
>> (vmalloc/vmemmap ...).
>>    */
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
>> +#endif
>>   #if CONFIG_PGTABLE_LEVELS > 4
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_p4d_table(pgd_t pgd)
>>   {
>>       return pgd_page(pgd) == 
>> virt_to_page(lm_alias(kasan_early_shadow_p4d));
>> @@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
>>   }
>>   #endif
>>   #if CONFIG_PGTABLE_LEVELS > 3
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_pud_table(p4d_t p4d)
>>   {
>>       return p4d_page(p4d) == 
>> virt_to_page(lm_alias(kasan_early_shadow_pud));
>> @@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
>>   }
>>   #endif
>>   #if CONFIG_PGTABLE_LEVELS > 2
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_pmd_table(pud_t pud)
>>   {
>>       return pud_page(pud) == 
>> virt_to_page(lm_alias(kasan_early_shadow_pmd));
>> @@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
>>       return false;
>>   }
>>   #endif
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_pte_table(pmd_t pmd)
>>   {
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-08-07 16:34   ` Christophe Leroy
@ 2019-08-09 15:35     ` Christophe Leroy
  2019-12-10  5:10     ` Daniel Axtens
  1 sibling, 0 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-08-09 15:35 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: aneesh.kumar, linuxppc-dev, kasan-dev

Hi Daniel,

Le 07/08/2019 à 18:34, Christophe Leroy a écrit :
> 
> 
> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>> KASAN support on powerpc64 is interesting:
>>
>>   - We want to be able to support inline instrumentation so as to be
>>     able to catch global and stack issues.
>>
>>   - We run a lot of code at boot in real mode. This includes stuff like
>>     printk(), so it's not feasible to just disable instrumentation
>>     around it.
> 
> Have you definitely given up the idea of doing a standard implementation 
> of KASAN like other 64 bits arches have done ?
> 
> Isn't it possible to setup an early 1:1 mapping and go in virtual mode 
> earlier ? What is so different between book3s64 and book3e64 ?
> On book3e64, we've been able to setup KASAN before printing anything 
> (except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?
> 

I looked at it once more, and cannot find that "We run a lot of code at 
boot in real mode. This includes stuff like printk()".

Can you provide examples ?

AFAICS, there are two things which are run in real mode at boot:
1/ prom_init() in kernel/prom_init.c
2/ early_setup() in kernel/setup_64.c

1/ KASAN is already inhibited for prom_init(), and prom_init() only uses 
prom_printf() to display stuff.
2/ early_setup() only calls a subset of simple functions. By regrouping
things in a new file called early_64.c, as done for PPC32 with
early_32.c, we can easily inhibit kasan for those few things. printk() is
not used there either; there is even a comment at the start of
early_setup() saying /* -------- printk is _NOT_ safe to use here !
------- */. The only thing that performs any display is udbg_printf(),
which is called only when DEBUG is set and which is tied to
CONFIG_PPC_EARLY_DEBUG. We already discussed that and agreed
that CONFIG_PPC_EARLY_DEBUG could be made exclusive of CONFIG_KASAN.
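
Roughly, something like this in arch/powerpc/Kconfig.debug (untested):

config PPC_EARLY_DEBUG
	...
	depends on !KASAN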

Once early_setup() has run, BOOK3S64 goes in virtual mode, just like 
BOOK3E does.

What am I missing ?

Thanks
Christophe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix
  2019-08-07 15:45 ` [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Christophe Leroy
@ 2019-08-16  4:11   ` Daniel Axtens
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Axtens @ 2019-08-16  4:11 UTC (permalink / raw)
  To: Christophe Leroy, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev

Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>> Building on the work of Christophe, Aneesh and Balbir, I've ported
>> KASAN to 64-bit Book3S kernels running on the Radix MMU.
>> 
>> It builds on top Christophe's work on 32bit. It also builds on my
>> generic KASAN_VMALLOC series, available at:
>> https://patchwork.kernel.org/project/linux-mm/list/?series=153209
>
> Would be good to send that one to the powerpc list as well.
>

Done for v4.

>> 
>> This provides full inline instrumentation on radix, but does require
>> that you be able to specify the amount of memory on the system at
>> compile time. More details in patch 4.
>> 
>> Notable changes from the RFC:
>> 
>>   - I've dropped Book3E 64-bit for now.
>> 
>>   - Now instead of hacking into the KASAN core to disable module
>>     allocations, we use KASAN_VMALLOC.
>> 
>>   - More testing, including on real hardware. This revealed that
>>     discontiguous memory is a bit of a headache, at the moment we
>>     must disable memory not contiguous from 0.
>>     
>>   - Update to deal with kasan bitops instrumentation that landed
>>     between RFC and now.
>
> This is rather independent and also applies to PPC32. Could it be a 
> separate series that Michael could apply earlier ?
>

Will do this and address your feedback on the rest of the series later.

Regards,
Daniel

> Christophe
>
>> 
>>   - Documentation!
>> 
>>   - Various cleanups and tweaks.
>> 
>> I am getting occasional problems on boot of real hardware where it
>> seems vmalloc space mappings don't get installed in time. (We get a
>> BUG that memory is not accessible, but by the time we hit xmon the
>> memory then is accessible!) It happens once every few boots. I haven't
>> yet been able to figure out what is happening and why. I'm going to
>> look in to it, but I think the patches are in good enough shape to
>> review while I work on it.
>> 
>> Regards,
>> Daniel
>> 
>> Daniel Axtens (4):
>>    kasan: allow arches to provide their own early shadow setup
>>    kasan: support instrumented bitops with generic non-atomic bitops
>>    powerpc: support KASAN instrumentation of bitops
>>    powerpc: Book3S 64-bit "heavyweight" KASAN support
>> 
>>   Documentation/dev-tools/kasan.rst            |   7 +-
>>   Documentation/powerpc/kasan.txt              | 111 ++++++++++++++
>>   arch/powerpc/Kconfig                         |   4 +
>>   arch/powerpc/Kconfig.debug                   |  21 +++
>>   arch/powerpc/Makefile                        |   7 +
>>   arch/powerpc/include/asm/bitops.h            |  25 ++--
>>   arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
>>   arch/powerpc/include/asm/kasan.h             |  35 ++++-
>>   arch/powerpc/kernel/process.c                |   8 ++
>>   arch/powerpc/kernel/prom.c                   |  57 +++++++-
>>   arch/powerpc/mm/kasan/Makefile               |   1 +
>>   arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 ++++++++++
>>   include/asm-generic/bitops-instrumented.h    | 144 ++++++++++---------
>>   include/linux/kasan.h                        |   2 +
>>   lib/Kconfig.kasan                            |   3 +
>>   mm/kasan/init.c                              |  10 ++
>>   16 files changed, 431 insertions(+), 85 deletions(-)
>>   create mode 100644 Documentation/powerpc/kasan.txt
>>   create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c
>> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] kasan: allow arches to provide their own early shadow setup
  2019-08-07 15:14   ` Christophe Leroy
  2019-08-07 20:19     ` Christophe Leroy
@ 2019-12-10  4:50     ` Daniel Axtens
  1 sibling, 0 replies; 17+ messages in thread
From: Daniel Axtens @ 2019-12-10  4:50 UTC (permalink / raw)
  To: Christophe Leroy, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev

Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>> powerpc supports several different MMUs. In particular, book3s
>> machines support both a hash-table based MMU and a radix MMU.
>> These MMUs support different numbers of entries per directory
>> level: the PTRS_PER_* defines evaluate to variables, not constants.
>> This leads to compiler errors as global variables must have constant
>> sizes.
>> 
>> Allow architectures to manage their own early shadow variables so we
>> can work around this on powerpc.
>
> This seems rather strange to move the early shadow tables out of 
> mm/kasan/init.c although they are still used there.
>
> What about doing for all what is already done for 
> kasan_early_shadow_p4d[], in extenso define constant max sizes 
> MAX_PTRS_PER_PTE, MAX_PTRS_PER_PMD and MAX_PTRS_PER_PUD ?

I have added this. I haven't tried the ifndef magic, I've just defined
the constant for all arches that implement KASAN.

Regards,
Daniel

>
> With a set of the following, it would remain transparent for other arches.
> #ifndef MAX_PTRS_PER_PXX
> #define MAX_PTRS_PER_PXX PTRS_PER_PXX
> #endif
>
> Then you would just need to do the following for Radix:
>
> #define MAX_PTRS_PER_PTE		(1 << RADIX_PTE_INDEX_SIZE)
> #define MAX_PTRS_PER_PMD		(1 << RADIX_PMD_INDEX_SIZE)
> #define MAX_PTRS_PER_PUD		(1 << RADIX_PUD_INDEX_SIZE)
>
>
> For the kasan_early_shadow_page[], I don't think we have variable 
> PAGE_SIZE, have we ?
>
> Christophe
>
>
>> 
>> Signed-off-by: Daniel Axtens <dja@axtens.net>
>> 
>> ---
>> Changes from RFC:
>> 
>>   - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
>>     a random #define to a config option selected when building for
>>     ppc64 book3s
>> ---
>>   include/linux/kasan.h |  2 ++
>>   lib/Kconfig.kasan     |  3 +++
>>   mm/kasan/init.c       | 10 ++++++++++
>>   3 files changed, 15 insertions(+)
>> 
>> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
>> index ec81113fcee4..15933da52a3e 100644
>> --- a/include/linux/kasan.h
>> +++ b/include/linux/kasan.h
>> @@ -14,11 +14,13 @@ struct task_struct;
>>   #include <asm/kasan.h>
>>   #include <asm/pgtable.h>
>>   
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
>>   extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
>>   extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
>>   extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
>>   extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>> +#endif
>>   
>>   int kasan_populate_early_shadow(const void *shadow_start,
>>   				const void *shadow_end);
>> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
>> index a320dc2e9317..0621a0129c04 100644
>> --- a/lib/Kconfig.kasan
>> +++ b/lib/Kconfig.kasan
>> @@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
>>   config	HAVE_ARCH_KASAN_VMALLOC
>>   	bool
>>   
>> +config ARCH_HAS_KASAN_EARLY_SHADOW
>> +	bool
>> +
>>   config CC_HAS_KASAN_GENERIC
>>   	def_bool $(cc-option, -fsanitize=kernel-address)
>>   
>> diff --git a/mm/kasan/init.c b/mm/kasan/init.c
>> index ce45c491ebcd..7ef2b87a7988 100644
>> --- a/mm/kasan/init.c
>> +++ b/mm/kasan/init.c
>> @@ -31,10 +31,14 @@
>>    *   - Latter it reused it as zero shadow to cover large ranges of memory
>>    *     that allowed to access, but not handled by kasan (vmalloc/vmemmap ...).
>>    */
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
>> +#endif
>>   
>>   #if CONFIG_PGTABLE_LEVELS > 4
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_p4d_table(pgd_t pgd)
>>   {
>>   	return pgd_page(pgd) == virt_to_page(lm_alias(kasan_early_shadow_p4d));
>> @@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
>>   }
>>   #endif
>>   #if CONFIG_PGTABLE_LEVELS > 3
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_pud_table(p4d_t p4d)
>>   {
>>   	return p4d_page(p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud));
>> @@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
>>   }
>>   #endif
>>   #if CONFIG_PGTABLE_LEVELS > 2
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
>> +#endif
>>   static inline bool kasan_pmd_table(pud_t pud)
>>   {
>>   	return pud_page(pud) == virt_to_page(lm_alias(kasan_early_shadow_pmd));
>> @@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
>>   	return false;
>>   }
>>   #endif
>> +#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
>>   pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
>> +#endif
>>   
>>   static inline bool kasan_pte_table(pmd_t pmd)
>>   {
>> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-08-07 16:34   ` Christophe Leroy
  2019-08-09 15:35     ` Christophe Leroy
@ 2019-12-10  5:10     ` Daniel Axtens
  2019-12-13 11:44       ` Christophe Leroy
  1 sibling, 1 reply; 17+ messages in thread
From: Daniel Axtens @ 2019-12-10  5:10 UTC (permalink / raw)
  To: Christophe Leroy, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev

Christophe Leroy <christophe.leroy@c-s.fr> writes:

> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>> KASAN support on powerpc64 is interesting:
>> 
>>   - We want to be able to support inline instrumentation so as to be
>>     able to catch global and stack issues.
>> 
>>   - We run a lot of code at boot in real mode. This includes stuff like
>>     printk(), so it's not feasible to just disable instrumentation
>>     around it.
>
> Have you definitely given up the idea of doing a standard implementation 
> of KASAN like other 64 bits arches have done ?
>
> Isn't it possible to setup an early 1:1 mapping and go in virtual mode 
> earlier ? What is so different between book3s64 and book3e64 ?
> On book3e64, we've been able to setup KASAN before printing anything 
> (except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?

So I got this pretty wrong when trying to explain it. The problem isn't
that we run the code in boot as I said, it's that a bunch of the KVM
code runs in real mode.

>>   - disabled reporting when we're checking the stack for exception
>>     frames. The behaviour isn't wrong, just incompatible with KASAN.
>
> Does this apply to / impact PPC32 at all ?

It should. I found that when doing stack walks, the code would touch
memory that KASAN hadn't unpoisoned. I'm a bit surprised you haven't
seen it arise, tbh.

>>   - Dropped old module stuff in favour of KASAN_VMALLOC.
>
> You said in the cover that this is done to avoid having to split modules 
> out of VMALLOC area. Would it be an issue to perform that split ?
> I can understand it is not easy on 32 bits because vmalloc space is 
> rather small, but on 64 bits don't we have enough virtual space to 
> comfortably split modules out of vmalloc ? The 64 bits already splits 
> ioremap away from vmalloc whereas 32 bits have them merged too.

I could have done this. Maybe I should have done this. But now I have
done vmalloc space support.

>> The bugs with ftrace and kuap were due to kernel bloat pushing
>> prom_init calls to be done via the plt. Because we did not have
>> a relocatable kernel, and they are done very early, this caused
>> everything to explode. Compile with CONFIG_RELOCATABLE!
>> 
>> ---
>>   Documentation/dev-tools/kasan.rst            |   7 +-
>>   Documentation/powerpc/kasan.txt              | 111 +++++++++++++++++++
>>   arch/powerpc/Kconfig                         |   4 +
>>   arch/powerpc/Kconfig.debug                   |  21 ++++
>>   arch/powerpc/Makefile                        |   7 ++
>>   arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
>>   arch/powerpc/include/asm/kasan.h             |  35 +++++-
>>   arch/powerpc/kernel/process.c                |   8 ++
>>   arch/powerpc/kernel/prom.c                   |  57 +++++++++-
>>   arch/powerpc/mm/kasan/Makefile               |   1 +
>>   arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 +++++++++++++
>>   11 files changed, 326 insertions(+), 6 deletions(-)
>>   create mode 100644 Documentation/powerpc/kasan.txt
>>   create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c
>> 
>> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
>> index 35fda484a672..48d3b669e577 100644
>> --- a/Documentation/dev-tools/kasan.rst
>> +++ b/Documentation/dev-tools/kasan.rst
>> @@ -22,7 +22,9 @@ global variables yet.
>>   Tag-based KASAN is only supported in Clang and requires version 7.0.0 or later.
>>   
>>   Currently generic KASAN is supported for the x86_64, arm64, xtensa and s390
>> -architectures, and tag-based KASAN is supported only for arm64.
>> +architectures. It is also supported on powerpc for 32-bit kernels, and for
>> +64-bit kernels running under the radix MMU. Tag-based KASAN is supported only
>> +for arm64.
>
> Could the 32 bits documentation stuff be a separate patch ?

Done.
>
>>   
>>   Usage
>>   -----
>> @@ -252,7 +254,8 @@ CONFIG_KASAN_VMALLOC
>>   ~~~~~~~~~~~~~~~~~~~~
>>   
>>   With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
>> -cost of greater memory usage. Currently this is only supported on x86.
>> +cost of greater memory usage. Currently this is optional on x86, and
>> +required on 64-bit powerpc.
>>   
>>   This works by hooking into vmalloc and vmap, and dynamically
>>   allocating real shadow memory to back the mappings.
>> diff --git a/Documentation/powerpc/kasan.txt b/Documentation/powerpc/kasan.txt
>> new file mode 100644
>> index 000000000000..a5592454353b
>> --- /dev/null
>> +++ b/Documentation/powerpc/kasan.txt
>> @@ -0,0 +1,111 @@
>> +KASAN is supported on powerpc on 32-bit and 64-bit Radix only.
>> +
>> +32 bit support
>> +==============
>> +
>> +KASAN is supported on both hash and nohash MMUs on 32-bit.
>> +
>> +The shadow area sits at the top of the kernel virtual memory space above the
>> +fixmap area and occupies one eighth of the total kernel virtual memory space.
>> +
>> +Instrumentation of the vmalloc area is not currently supported, but modules are.
>> +
>> +64 bit support
>> +==============
>> +
>> +Currently, only the radix MMU is supported. There have been versions for Book3E
>> +processors floating around on the mailing list, but nothing has been merged.
>> +
>> +KASAN support on Book3S is interesting:
>
> And support for others is not interesting ? :)
>
:) I've reworded it to be a bit more professional.

>> +
>> + - We want to be able to support inline instrumentation so as to be able to
>> +   catch global and stack issues.
>> +
>> + - Inline instrumentation requires a fixed offset.
>> +
>> + - We run a lot of code at boot in real mode. This includes stuff like printk(),
>> +   so it's not feasible to just disable instrumentation around it.
>> +
>> + - Because of our running things in real mode, the offset has to point to valid
>> +   memory both in and out of real mode.
>> +
>> +This makes finding somewhere to put the KASAN shadow region a bit fun.
>> +
>> +One approach is just to give up on inline instrumentation. This way we can delay
>> +all checks until after we get everything set up to our satisfaction. However,
>> +we'd really like to do better.
>> +
>> +If we know _at compile time_ how much contiguous physical memory we have, we can
>> +set aside the top 1/8th of physical memory and use that. This is a big hammer
>> +and comes with 2 big consequences:
>> +
>> + - kernels will simply fail to boot on machines with less memory than specified
>> +   when compiling.
>> +
>> + - kernels running on machines with more memory than specified when compiling
>> +   will simply ignore the extra memory.
>> +
>> +If you can live with this, you get full support for KASAN.
>> +
>> +Tips
>> +----
>> +
>> + - Compile with CONFIG_RELOCATABLE.
>> +
>> +   In development, we found boot hangs when building with ftrace and KUAP. These
>> +   ended up being due to kernel bloat pushing prom_init calls to be done via the
>> +   PLT. Because we did not have a relocatable kernel, and they are done very
>> +   early, this caused us to jump off into somewhere invalid. Enabling relocation
>> +   fixes this.
>> +
>> +NUMA/discontiguous physical memory
>> +----------------------------------
>> +
>> +We currently cannot really deal with discontiguous physical memory. You are
>> +restricted to the physical memory that is contiguous from physical address zero,
>> +and must specify the size of that memory, not total memory, when configuring
>> +your kernel.
>> +
>> +Discontiguous memory can occur when you have a machine with memory spread across
>> +multiple nodes. For example, on a Talos II with 64GB of RAM:
>> +
>> + - 32GB runs from 0x0 to 0x0000_0008_0000_0000,
>> + - then there's a gap,
>> + - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_0000_0000.
>> +
>> +This can create _significant_ issues:
>> +
>> + - If we try to treat the machine as having 64GB of _contiguous_ RAM, we would
>> +   assume that it ran from 0x0 to 0x0000_0010_0000_0000. We'd then reserve the last
>> +   1/8th - 0x0000_000e_0000_0000 to 0x0000_0010_0000_0000 as the shadow
>> +   region. But when we try to access any of that, we'll try to access pages that
>> +   are not physically present.
>> +
>> + - If we try to base the shadow region size on the top address, we'll need to
>> +   reserve 0x2008_0000_0000 / 8 = 0x0401_0000_0000 bytes = 4100 GB of memory,
>> +   which will clearly not work on a system with 64GB of RAM.
>> +
>> +Therefore, you are restricted to the memory in the node starting at 0x0. For
>> +this system, that's 32GB. If you specify a contiguous physical memory size
>> +greater than the size of the first contiguous region of memory, the system will
>> +be unable to boot or even print an error message warning you.
>
> Can't we restrict ourselve to the first block at startup while we are in 
> real mode, but then support the entire mem once we have switched on the 
> MMU ?

We could only do this if we could guarantee that all memory accessed by
KVM real mode handlers was in the first block. I don't think I can
guarantee this.
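
To make the real-mode constraint concrete, here's a worked example for
a machine configured with 32768 MB (the numbers just follow from the
offset calculation in the Makefile hunk below, so treat this as an
illustration rather than anything new):

	KASAN_SHADOW_OFFSET = 0xa800_0000_0000_0000 + 7/8 * 32 GB
	                    = 0xa800_0007_0000_0000

	shadow(0xc000_0000_0000_0000) = (0xc000_0000_0000_0000 >> 3)
	                                + 0xa800_0007_0000_0000
	                              = 0xc000_0007_0000_0000

The shadow of the start of the linear map is itself a 0xc address at
physical offset 28 GB, i.e. inside the reserved top eighth of the first
block, so the same access works with the MMU on or off. That is why the
shadow has to be real memory contiguous from zero, and why I can't just
wave the restriction away for the real-mode KVM case.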

>> +
>> +You can determine the layout of your system's memory by observing the messages
>> +that the Radix MMU prints on boot. The Talos II discussed earlier has:
>> +
>> +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
>> +radix-mmu: Mapped 0x0000000040000000-0x0000000800000000 with 1.00 GiB pages
>> +radix-mmu: Mapped 0x0000200000000000-0x0000200800000000 with 1.00 GiB pages
>> +
>> +As discussed, you'd configure this system for 32768 MB.
>> +
>> +Another system prints:
>> +
>> +radix-mmu: Mapped 0x0000000000000000-0x0000000040000000 with 1.00 GiB pages (exec)
>> +radix-mmu: Mapped 0x0000000040000000-0x0000002000000000 with 1.00 GiB pages
>> +radix-mmu: Mapped 0x0000200000000000-0x0000202000000000 with 1.00 GiB pages
>> +
>> +This machine has more memory: 0x0000_0040_0000_0000 total, but only
>> +0x0000_0020_0000_0000 is physically contiguous from zero, so we'd configure the
>> +kernel for 131072 MB of physically contiguous memory.
>> +
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index d8dcd8820369..3d6deee100e2 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -171,6 +171,10 @@ config PPC
>>   	select HAVE_ARCH_HUGE_VMAP		if PPC_BOOK3S_64 && PPC_RADIX_MMU
>>   	select HAVE_ARCH_JUMP_LABEL
>>   	select HAVE_ARCH_KASAN			if PPC32
>> +	select HAVE_ARCH_KASAN			if PPC_BOOK3S_64 && PPC_RADIX_MMU
>> +	select ARCH_HAS_KASAN_EARLY_SHADOW	if PPC_BOOK3S_64
>
> See comment on patch 1, would be better to avoid that.
>
>> +	select HAVE_ARCH_KASAN_VMALLOC		if PPC_BOOK3S_64
>> +	select KASAN_VMALLOC			if KASAN
>>   	select HAVE_ARCH_KGDB
>>   	select HAVE_ARCH_MMAP_RND_BITS
>>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
>> diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
>> index c59920920ddc..2d6fb7b1ba59 100644
>> --- a/arch/powerpc/Kconfig.debug
>> +++ b/arch/powerpc/Kconfig.debug
>> @@ -394,6 +394,27 @@ config PPC_FAST_ENDIAN_SWITCH
>>           help
>>   	  If you're unsure what this is, say N.
>>   
>> +config PHYS_MEM_SIZE_FOR_KASAN
>> +	int "Contiguous physical memory size for KASAN (MB)"
>> +	depends on 
>
> Drop the depend and maybe do:
> 	int "Contiguous physical memory size for KASAN (MB)" if KASAN && 
> PPC_BOOK3S_64
> 	default 0
>

Done.

> Will allow you to not enclose it into #ifdef KASAN in the code below
>
>> +	help
>> +
>> +	  To get inline instrumentation support for KASAN on 64-bit Book3S
>> +	  machines, you need to know how much contiguous physical memory your
>> +	  system has. A shadow offset will be calculated based on this figure,
>> +	  which will be compiled in to the kernel. KASAN will use this offset
>> +	  to access its shadow region, which is used to verify memory accesses.
>> +
>> +	  If you attempt to boot on a system with less memory than you specify
>> +	  here, your system will fail to boot very early in the process. If you
>> +	  boot on a system with more memory than you specify, the extra memory
>> +	  will be wasted - it will be reserved and not used.
>> +
>> +	  For systems with discontiguous blocks of physical memory, specify the
>> +	  size of the block starting at 0x0. You can determine this by looking
>> +	  at the memory layout info printed to dmesg by the radix MMU code
>> +	  early in boot. See Documentation/powerpc/kasan.txt.
>> +
>>   config KASAN_SHADOW_OFFSET
>>   	hex
>>   	depends on KASAN
>> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
>> index c345b79414a9..33e7bba4c8db 100644
>> --- a/arch/powerpc/Makefile
>> +++ b/arch/powerpc/Makefile
>> @@ -229,6 +229,13 @@ ifdef CONFIG_476FPE_ERR46
>>   		-T $(srctree)/arch/powerpc/platforms/44x/ppc476_modules.lds
>>   endif
>>   
>> +ifdef CONFIG_KASAN
>> +ifdef CONFIG_PPC_BOOK3S_64
>> +# 0xa800000000000000 = 12105675798371893248
>> +KASAN_SHADOW_OFFSET = $(shell echo 7 \* 1024 \* 1024 \* $(CONFIG_PHYS_MEM_SIZE_FOR_KASAN) / 8 + 12105675798371893248 | bc)
>> +endif
>> +endif
>> +
>>   # No AltiVec or VSX instructions when building kernel
>>   KBUILD_CFLAGS += $(call cc-option,-mno-altivec)
>>   KBUILD_CFLAGS += $(call cc-option,-mno-vsx)
>> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
>> index e04a839cb5b9..4c011cc15e05 100644
>> --- a/arch/powerpc/include/asm/book3s/64/radix.h
>> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
>> @@ -35,6 +35,11 @@
>>   #define RADIX_PMD_SHIFT		(PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
>>   #define RADIX_PUD_SHIFT		(RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
>>   #define RADIX_PGD_SHIFT		(RADIX_PUD_SHIFT + RADIX_PUD_INDEX_SIZE)
>> +
>> +#define R_PTRS_PER_PTE		(1 << RADIX_PTE_INDEX_SIZE)
>> +#define R_PTRS_PER_PMD		(1 << RADIX_PMD_INDEX_SIZE)
>> +#define R_PTRS_PER_PUD		(1 << RADIX_PUD_INDEX_SIZE)
>> +
>
> See my suggestion in patch 1.
>
>>   /*
>>    * Size of EA range mapped by our pagetables.
>>    */
>> diff --git a/arch/powerpc/include/asm/kasan.h b/arch/powerpc/include/asm/kasan.h
>> index 296e51c2f066..d6b4028c296b 100644
>> --- a/arch/powerpc/include/asm/kasan.h
>> +++ b/arch/powerpc/include/asm/kasan.h
>> @@ -14,13 +14,20 @@
>>   
>>   #ifndef __ASSEMBLY__
>>   
>> -#include <asm/page.h>
>> +#ifdef CONFIG_KASAN
>> +void kasan_init(void);
>> +#else
>> +static inline void kasan_init(void) { }
>> +#endif
>>   
>>   #define KASAN_SHADOW_SCALE_SHIFT	3
>>   
>>   #define KASAN_SHADOW_START	(KASAN_SHADOW_OFFSET + \
>>   				 (PAGE_OFFSET >> KASAN_SHADOW_SCALE_SHIFT))
>>   
>> +#ifdef CONFIG_PPC32
>> +#include <asm/page.h>
>> +
>>   #define KASAN_SHADOW_OFFSET	ASM_CONST(CONFIG_KASAN_SHADOW_OFFSET)
>>   
>>   #define KASAN_SHADOW_END	0UL
>> @@ -30,11 +37,33 @@
>>   #ifdef CONFIG_KASAN
>>   void kasan_early_init(void);
>>   void kasan_mmu_init(void);
>> -void kasan_init(void);
>>   #else
>> -static inline void kasan_init(void) { }
>>   static inline void kasan_mmu_init(void) { }
>>   #endif
>> +#endif
>> +
>> +#ifdef CONFIG_PPC_BOOK3S_64
>> +#include <asm/pgtable.h>
>> +
>> +/*
>> + * The KASAN shadow offset is such that linear map (0xc000...) is shadowed by
>> + * the last 8th of linearly mapped physical memory. This way, if the code uses
>> + * 0xc addresses throughout, accesses work both in real mode (where the top
>> + * 2 bits are ignored) and outside of real mode.
>> + */
>> +#define KASAN_SHADOW_OFFSET ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
>> +				1024 * 1024 * 7 / 8 + 0xa800000000000000UL)
>
> Already calculated in the Makefile, can't we reuse it ?
>
> 'X * 1024 * 1024' is usually better understood as 'X << 20' instead.
>
>
>> +
>> +#define KASAN_SHADOW_SIZE ((u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * \
>> +				1024 * 1024 * 1 / 8)
>> +
>> +extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
>> +
>> +extern pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE];
>> +extern pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD];
>> +extern pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD];
>> +extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
>> +#endif /* CONFIG_PPC_BOOK3S_64 */
>>   
>>   #endif /* __ASSEMBLY */
>>   #endif
>> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
>> index 8fc4de0d22b4..31602536e72b 100644
>> --- a/arch/powerpc/kernel/process.c
>> +++ b/arch/powerpc/kernel/process.c
>> @@ -2097,7 +2097,14 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
>>   		/*
>>   		 * See if this is an exception frame.
>>   		 * We look for the "regshere" marker in the current frame.
>> +		 *
>> +		 * KASAN may complain about this. If it is an exception frame,
>> +		 * we won't have unpoisoned the stack in asm when we set the
>> +		 * exception marker. If it's not an exception frame, who knows
>> +		 * how things are laid out - the shadow could be in any state
>> +		 * at all. Just disable KASAN reporting for now.
>>   		 */
>> +		kasan_disable_current();
>>   		if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
>>   		    && stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
>>   			struct pt_regs *regs = (struct pt_regs *)
>> @@ -2107,6 +2114,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
>>   			       regs->trap, (void *)regs->nip, (void *)lr);
>>   			firstframe = 1;
>>   		}
>> +		kasan_enable_current();
>>   
>>   		sp = newsp;
>>   	} while (count++ < kstack_depth_to_print);
>> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
>> index 7159e791a70d..dde5f2896ab6 100644
>> --- a/arch/powerpc/kernel/prom.c
>> +++ b/arch/powerpc/kernel/prom.c
>> @@ -71,6 +71,7 @@ unsigned long tce_alloc_start, tce_alloc_end;
>>   u64 ppc64_rma_size;
>>   #endif
>>   static phys_addr_t first_memblock_size;
>> +static phys_addr_t top_phys_addr;
>>   static int __initdata boot_cpu_count;
>>   
>>   static int __init early_parse_mem(char *p)
>> @@ -448,6 +449,21 @@ static bool validate_mem_limit(u64 base, u64 *size)
>>   {
>>   	u64 max_mem = 1UL << (MAX_PHYSMEM_BITS);
>>   
>> +#ifdef CONFIG_KASAN
>> +	/*
>> +	 * To handle the NUMA/discontiguous memory case, don't allow a block
>> +	 * to be added if it falls completely beyond the configured physical
>> +	 * memory.
>> +	 *
>> +	 * See Documentation/powerpc/kasan.txt
>> +	 */
>> +	if (base >= (u64)CONFIG_PHYS_MEM_SIZE_FOR_KASAN * 1024 * 1024) {
>> +		pr_warn("KASAN: not adding mem block at %llx (size %llx)",
>> +			base, *size);
>> +		return false;
>> +	}
>> +#endif
>> +
>>   	if (base >= max_mem)
>>   		return false;
>>   	if ((base + *size) > max_mem)
>> @@ -571,8 +587,11 @@ void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>>   
>>   	/* Add the chunk to the MEMBLOCK list */
>>   	if (add_mem_to_memblock) {
>> -		if (validate_mem_limit(base, &size))
>> +		if (validate_mem_limit(base, &size)) {
>>   			memblock_add(base, size);
>> +			if (base + size > top_phys_addr)
>> +				top_phys_addr = base + size;
>> +		}
>>   	}
>>   }
>>   
>> @@ -612,6 +631,8 @@ static void __init early_reserve_mem_dt(void)
>>   static void __init early_reserve_mem(void)
>>   {
>>   	__be64 *reserve_map;
>> +	phys_addr_t kasan_shadow_start __maybe_unused;
>> +	phys_addr_t kasan_memory_size __maybe_unused;
>
> Could we avoid those ugly __maybe_unused ?
>
>>   
>>   	reserve_map = (__be64 *)(((unsigned long)initial_boot_params) +
>>   			fdt_off_mem_rsvmap(initial_boot_params));
>> @@ -650,6 +671,40 @@ static void __init early_reserve_mem(void)
>>   		return;
>>   	}
>>   #endif
>> +
>> +#if defined(CONFIG_KASAN) && defined(CONFIG_PPC_BOOK3S_64)
>
> Would be better to do following instead of the #ifdef
>
> if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
> }
>
> This would avoid the __maybe_unused above, and would allow validating the
> code even when CONFIG_KASAN is not selected.
>
> This would probably require to set a default value for 
> CONFIG_PHYS_MEM_SIZE_FOR_KASAN when KASAN is not set, or maybe define an 
> intermediate const somewhere in some .h which takes 
> CONFIG_PHYS_MEM_SIZE_FOR_KASAN when KASAN is there and 0 when KASAN is 
> not there
>
Done.
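
Roughly, the reworked block becomes the following (just a sketch of the
direction, not the exact v2 hunk):

	if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {
		phys_addr_t kasan_memory_size =
			(phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN << 20;
		phys_addr_t kasan_shadow_start =
			_ALIGN_DOWN(kasan_memory_size * 7 / 8, PAGE_SIZE);

		/* size check / big warning as below, then: */
		memblock_reserve(kasan_shadow_start,
				 top_phys_addr - kasan_shadow_start);
	}

which also gets rid of the __maybe_unused locals.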

>> +	kasan_memory_size = ((phys_addr_t)CONFIG_PHYS_MEM_SIZE_FOR_KASAN
>> +		* 1024 * 1024);
>> +
>> +	if (top_phys_addr < kasan_memory_size) {
>> +		/*
>> +		 * We are doomed. Attempts to call e.g. panic() are likely to
>> +		 * fail because they call out into instrumented code, which
>> +		 * will almost certainly access memory beyond the end of
>> +		 * physical memory. Hang here so that at least the NIP points
>> +		 * somewhere that will help you debug it if you look at it in
>> +		 * qemu.
>> +		 */
>> +		while (true)
>> +			;
>
> A bit gross ?
>

It is, but I don't know how to do it better. I don't want to panic(), as
explained... What did you have in mind?

>
>> +	} else if (top_phys_addr > kasan_memory_size) {
>> +		/* print a biiiig warning in hopes people notice */
>> +		pr_err("===========================================\n"
>> +			"Physical memory exceeds compiled-in maximum!\n"
>> +			"This kernel was compiled for KASAN with %u MB physical memory.\n"
>> +			"The actual physical memory detected is %llu MB.\n"
>> +			"Memory above the compiled limit will not be used!\n"
>> +			"===========================================\n",
>> +			CONFIG_PHYS_MEM_SIZE_FOR_KASAN,
>> +			top_phys_addr / (1024 * 1024));
>> +	}
>> +
>> +	kasan_shadow_start = _ALIGN_DOWN(kasan_memory_size * 7 / 8, PAGE_SIZE);
>> +	DBG("reserving %llx -> %llx for KASAN",
>> +	    kasan_shadow_start, top_phys_addr);
>> +	memblock_reserve(kasan_shadow_start,
>> +			 top_phys_addr - kasan_shadow_start);
>> +#endif
>>   }
>>   
>>   #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>> diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
>> index 6577897673dd..ff8143ba1e4d 100644
>> --- a/arch/powerpc/mm/kasan/Makefile
>> +++ b/arch/powerpc/mm/kasan/Makefile
>> @@ -3,3 +3,4 @@
>>   KASAN_SANITIZE := n
>>   
>>   obj-$(CONFIG_PPC32)           += kasan_init_32.o
>> +obj-$(CONFIG_PPC_BOOK3S_64)   += kasan_init_book3s_64.o
>> diff --git a/arch/powerpc/mm/kasan/kasan_init_book3s_64.c b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c
>
> In the same spirit as what was done in other mm subdirs, could we rename 
> those files to:
> init_32.o
> init_book3s64.o

Done.

>
>> new file mode 100644
>> index 000000000000..fafda3d5e9a3
>> --- /dev/null
>> +++ b/arch/powerpc/mm/kasan/kasan_init_book3s_64.c
>> @@ -0,0 +1,76 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * KASAN for 64-bit Book3S powerpc
>> + *
>> + * Copyright (C) 2019 IBM Corporation
>> + * Author: Daniel Axtens <dja@axtens.net>
>> + */
>> +
>> +#define DISABLE_BRANCH_PROFILING
>> +
>> +#include <linux/kasan.h>
>> +#include <linux/printk.h>
>> +#include <linux/sched/task.h>
>> +#include <asm/pgalloc.h>
>> +
>> +unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
>
> What's the difference with what's defined in the kasan generic part and 
> that you have opted out in first patch ?
>
>> +
>> +pte_t kasan_early_shadow_pte[R_PTRS_PER_PTE] __page_aligned_bss;
>> +pmd_t kasan_early_shadow_pmd[R_PTRS_PER_PMD] __page_aligned_bss;
>> +pud_t kasan_early_shadow_pud[R_PTRS_PER_PUD] __page_aligned_bss;
>> +p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
>
> See my suggestion for those in patch 1.
>
>> +
>> +void __init kasan_init(void)
>> +{
>> +	int i;
>> +	void *k_start = kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START);
>> +	void *k_end = kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END);
>> +
>> +	unsigned long pte_val = __pa(kasan_early_shadow_page)
>> +					| pgprot_val(PAGE_KERNEL) | _PAGE_PTE;
>> +
>> +	if (!early_radix_enabled())
>> +		panic("KASAN requires radix!");
>> +
>> +	for (i = 0; i < PTRS_PER_PTE; i++)
>> +		kasan_early_shadow_pte[i] = __pte(pte_val);
>
> Shouldn't you use __set_pte_at() here ?
>
Yes, done.
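
Something like this, mirroring the call already used for the read-only
pass later in the file (a sketch only, the v2 hunk may differ slightly):

	for (i = 0; i < PTRS_PER_PTE; i++)
		__set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page,
			     &kasan_early_shadow_pte[i], __pte(pte_val), 0);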

>> +
>> +	for (i = 0; i < PTRS_PER_PMD; i++)
>> +		pmd_populate_kernel(&init_mm, &kasan_early_shadow_pmd[i],
>> +				    kasan_early_shadow_pte);
>> +
>> +	for (i = 0; i < PTRS_PER_PUD; i++)
>> +		pud_populate(&init_mm, &kasan_early_shadow_pud[i],
>> +			     kasan_early_shadow_pmd);
>> +
>> +
>> +	memset(kasan_mem_to_shadow((void *)PAGE_OFFSET), KASAN_SHADOW_INIT,
>> +		KASAN_SHADOW_SIZE);
>> +
>> +	kasan_populate_early_shadow(
>> +		kasan_mem_to_shadow((void *)RADIX_KERN_VIRT_START),
>> +		kasan_mem_to_shadow((void *)RADIX_VMALLOC_START));
>> +
>> +	/* leave a hole here for vmalloc */
>> +
>> +	kasan_populate_early_shadow(
>> +		kasan_mem_to_shadow((void *)RADIX_VMALLOC_END),
>> +		kasan_mem_to_shadow((void *)RADIX_VMEMMAP_END));
>> +
>> +	flush_tlb_kernel_range((unsigned long)k_start, (unsigned long)k_end);
>> +
>> +	/* mark early shadow region as RO and wipe */
>> +	for (i = 0; i < PTRS_PER_PTE; i++)
>> +		__set_pte_at(&init_mm, (unsigned long)kasan_early_shadow_page,
>> +			&kasan_early_shadow_pte[i],
>> +			pfn_pte(virt_to_pfn(kasan_early_shadow_page),
>> +			__pgprot(_PAGE_PTE | _PAGE_KERNEL_RO | _PAGE_BASE)),
>
> Isn't there an already existing val for the above set of flags ?
>
See if you like the simplified version in v2.

>> +			0);
>> +	memset(kasan_early_shadow_page, 0, PAGE_SIZE);
>> +
>> +	kasan_init_tags();
>
> You said in the doc that only arm supports that ...
>
Indeed - serves me right for copying arm code!

Regards,
Daniel

>> +
>> +	/* Enable error messages */
>> +	init_task.kasan_depth = 0;
>> +	pr_info("KASAN init done (64-bit Book3S heavyweight mode)\n");
>> +}
>> 
>
> Christophe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-12-10  5:10     ` Daniel Axtens
@ 2019-12-13 11:44       ` Christophe Leroy
  2019-12-13 13:40         ` Daniel Axtens
  0 siblings, 1 reply; 17+ messages in thread
From: Christophe Leroy @ 2019-12-13 11:44 UTC (permalink / raw)
  To: Daniel Axtens, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev



Le 10/12/2019 à 06:10, Daniel Axtens a écrit :
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
> 
>> Le 07/08/2019 à 01:38, Daniel Axtens a écrit :
>>> KASAN support on powerpc64 is interesting:
>>>
>>>    - We want to be able to support inline instrumentation so as to be
>>>      able to catch global and stack issues.
>>>
>>>    - We run a lot of code at boot in real mode. This includes stuff like
>>>      printk(), so it's not feasible to just disable instrumentation
>>>      around it.
>>
>> Have you definitely given up the idea of doing a standard implementation
>> of KASAN like other 64 bits arches have done ?
>>
>> Isn't it possible to setup an early 1:1 mapping and go in virtual mode
>> earlier ? What is so different between book3s64 and book3e64 ?
>> On book3e64, we've been able to setup KASAN before printing anything
>> (except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?
> 
> So I got this pretty wrong when trying to explain it. The problem isn't
> that we run the code in boot as I said, it's that a bunch of the KVM
> code runs in real mode.

Ok.

Does it mean we would be able to implement it the standard way when 
CONFIG_KVM is not selected ?

> 
>>>    - disabled reporting when we're checking the stack for exception
>>>      frames. The behaviour isn't wrong, just incompatible with KASAN.
>>
>> Does this applies to / impacts PPC32 at all ?
> 
> It should. I found that when doing stack walks, the code would touch
> memory that KASAN hadn't unpoisioned. I'm a bit surprised you haven't
> seen it arise, tbh.

How do you trigger that ?

I've tried to provoke some faults with LKDTM that provoke BUG dumps, but 
it doesn't trip.
I also performed task state listing via sysrq, and I don't get anything 
wrong either.

> 
>>>    - Dropped old module stuff in favour of KASAN_VMALLOC.
>>
>> You said in the cover that this is done to avoid having to split modules
>> out of VMALLOC area. Would it be an issue to perform that split ?
>> I can understand it is not easy on 32 bits because vmalloc space is
>> rather small, but on 64 bits don't we have enough virtual space to
>> confortably split modules out of vmalloc ? The 64 bits already splits
>> ioremap away from vmalloc whereas 32 bits have them merged too.
> 
> I could have done this. Maybe I should have done this. But now I have
> done vmalloc space support.

So you force the use of KASAN_VMALLOC ? Doesn't it have a performance 
impact ?


Christophe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
  2019-12-13 11:44       ` Christophe Leroy
@ 2019-12-13 13:40         ` Daniel Axtens
  0 siblings, 0 replies; 17+ messages in thread
From: Daniel Axtens @ 2019-12-13 13:40 UTC (permalink / raw)
  To: Christophe Leroy, aneesh.kumar, bsingharora; +Cc: linuxppc-dev, kasan-dev

Hi Christophe,

>>>>    - We run a lot of code at boot in real mode. This includes stuff like
>>>>      printk(), so it's not feasible to just disable instrumentation
>>>>      around it.
>>>
>>> Have you definitely given up the idea of doing a standard implementation
>>> of KASAN like other 64 bits arches have done ?
>>>
>>> Isn't it possible to setup an early 1:1 mapping and go in virtual mode
>>> earlier ? What is so different between book3s64 and book3e64 ?
>>> On book3e64, we've been able to setup KASAN before printing anything
>>> (except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?
>> 
>> So I got this pretty wrong when trying to explain it. The problem isn't
>> that we run the code in boot as I said, it's that a bunch of the KVM
>> code runs in real mode.
>
> Ok.
>
> Does it mean we would be able to implement it the standard way when 
> CONFIG_KVM is not selected ?

I suppose, but KVM is pretty important to me!

>>>>    - disabled reporting when we're checking the stack for exception
>>>>      frames. The behaviour isn't wrong, just incompatible with KASAN.
>>>
>>> Does this applies to / impacts PPC32 at all ?
>> 
>> It should. I found that when doing stack walks, the code would touch
>> memory that KASAN hadn't unpoisoned. I'm a bit surprised you haven't
>> seen it arise, tbh.
>
> How do you trigger that ?
>
> I've tried to provoke some faults with LKDTM that provoke BUG dumps, but 
> it doesn't trip.
> I also performed task state listing via sysrq, and I don't get anything 
> wrong either.

I'll try to disable this and see if I can trigger it again.

>>>>    - Dropped old module stuff in favour of KASAN_VMALLOC.
>>>
>>> You said in the cover that this is done to avoid having to split modules
>>> out of VMALLOC area. Would it be an issue to perform that split ?
>>> I can understand it is not easy on 32 bits because vmalloc space is
>>> rather small, but on 64 bits don't we have enough virtual space to
>>> confortably split modules out of vmalloc ? The 64 bits already splits
>>> ioremap away from vmalloc whereas 32 bits have them merged too.
>> 
>> I could have done this. Maybe I should have done this. But now I have
>> done vmalloc space support.
>
> So you force the use of KASAN_VMALLOC ? Doesn't it have a performance 
> impact ?

It has a performance impact when allocating and freeing virtual address
space in the vmalloc region, yes. There should be no discernible impact
when using vmalloc space.
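
Conceptually, all of the cost sits in the allocate/free path, something
like this (a hand-wavy sketch to show where the work happens, not the
actual hook names from the vmalloc series):

	buf = vmalloc(size);	/* shadow for [buf, buf + size) is
				 * allocated, mapped and unpoisoned here
				 * by the KASAN_VMALLOC hooks */

	*(char *)buf = 0;	/* just the usual shadow check, same cost
				 * as a linear-map access */

	vfree(buf);		/* shadow is torn down again when the
				 * area is eventually released */

So you pay on vmalloc()/vfree(), not on every access.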

My team is actively working on vmap-stack support for ppc64, with the
end goal of running syzkaller with vmap-stack and kasan. vmap-stack plus
kasan requires kasan-vmalloc, so for my purposes doing things in this
order makes sense.

I'd be happy to have a later series introduce the split and then make
KASAN_VMALLOC optional. I would need to understand the implications of
splitting the address space from a KASLR point of view: I don't want to
accidentally overly restrict the available randomness.

Regards,
Daniel

>
> Christophe

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-12-13 21:27 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-06 23:38 [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Daniel Axtens
2019-08-06 23:38 ` [PATCH 1/4] kasan: allow arches to provide their own early shadow setup Daniel Axtens
2019-08-07 15:14   ` Christophe Leroy
2019-08-07 20:19     ` Christophe Leroy
2019-12-10  4:50     ` Daniel Axtens
2019-08-06 23:38 ` [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops Daniel Axtens
2019-08-07 14:58   ` Christophe Leroy
2019-08-06 23:38 ` [PATCH 3/4] powerpc: support KASAN instrumentation of bitops Daniel Axtens
2019-08-07 15:01   ` Christophe Leroy
2019-08-06 23:38 ` [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support Daniel Axtens
2019-08-07 16:34   ` Christophe Leroy
2019-08-09 15:35     ` Christophe Leroy
2019-12-10  5:10     ` Daniel Axtens
2019-12-13 11:44       ` Christophe Leroy
2019-12-13 13:40         ` Daniel Axtens
2019-08-07 15:45 ` [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix Christophe Leroy
2019-08-16  4:11   ` Daniel Axtens
