linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
@ 2020-01-22 17:52 Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 2/6] powerpc/32s: Fix bad_kuap_fault() Christophe Leroy
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Linus Torvalds, Alexander Viro, Andrew Morton
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Some architectures grand full access to userspace regardless of the
address/len passed to user_access_begin(), but other architectures
only grand access to the requested area.

For exemple, on 32 bits powerpc (book3s/32), access is granted by
segments of 256 Mbytes.

Modify filldir() and filldir64() to request the real area they need
to get access to, i.e. the area covering the parent dirent (if any)
and the contiguous current dirent.

Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: have user_access_begin() cover both parent dirent (if any) and current dirent
---
 fs/readdir.c | 50 ++++++++++++++++++++++++++++----------------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index d26d5ea4de7b..3f9b4488d9b7 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -214,7 +214,7 @@ struct getdents_callback {
 static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		   loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent __user * dirent;
+	struct linux_dirent __user * dirent, *dirent0;
 	struct getdents_callback *buf =
 		container_of(ctx, struct getdents_callback, ctx);
 	unsigned long d_ino;
@@ -232,19 +232,22 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		buf->error = -EOVERFLOW;
 		return -EOVERFLOW;
 	}
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
+	dirent0 = buf->previous;
+	if (dirent0 && signal_pending(current))
 		return -EINTR;
 
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
 	dirent = buf->current_dir;
+	if (dirent0) {
+		int sz = (void __user *)dirent + reclen -
+			 (void __user *)dirent0;
+
+		if (!user_access_begin(dirent0, sz))
+			goto efault;
+		unsafe_put_user(offset, &dirent0->d_off, efault_end);
+	} else {
+		if (!user_access_begin(dirent, reclen))
+			goto efault;
+	}
 	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
@@ -307,7 +310,7 @@ struct getdents_callback64 {
 static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 		     loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent64 __user *dirent;
+	struct linux_dirent64 __user *dirent, *dirent0;
 	struct getdents_callback64 *buf =
 		container_of(ctx, struct getdents_callback64, ctx);
 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
@@ -319,19 +322,22 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
 		return -EINVAL;
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
+	dirent0 = buf->previous;
+	if (dirent0 && signal_pending(current))
 		return -EINTR;
 
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
 	dirent = buf->current_dir;
+	if (dirent0) {
+		int sz = (void __user *)dirent + reclen -
+			 (void __user *)dirent0;
+
+		if (!user_access_begin(dirent0, sz))
+			goto efault;
+		unsafe_put_user(offset, &dirent0->d_off, efault_end);
+	} else {
+		if (!user_access_begin(dirent, reclen))
+			goto efault;
+	}
 	unsafe_put_user(ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, &dirent->d_type, efault_end);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/6] powerpc/32s: Fix bad_kuap_fault()
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
@ 2020-01-22 17:52 ` Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 3/6] powerpc/kuap: Fix set direction in allow/prevent_user_access() Christophe Leroy
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

At the moment, bad_kuap_fault() reports a fault only if a bad access
to userspace occurred while access to userspace was not granted.

But if a fault occurs for a write outside the allowed userspace
segment(s) that have been unlocked, bad_kuap_fault() fails to
detect it and the kernel loops forever in do_page_fault().

Fix it by checking that the accessed address is within the allowed
range.

Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: added missing address parametre to bad_kuap_fault() in asm/kup.h
---
 arch/powerpc/include/asm/book3s/32/kup.h       | 9 +++++++--
 arch/powerpc/include/asm/book3s/64/kup-radix.h | 3 ++-
 arch/powerpc/include/asm/kup.h                 | 6 +++++-
 arch/powerpc/include/asm/nohash/32/kup-8xx.h   | 3 ++-
 arch/powerpc/mm/fault.c                        | 2 +-
 5 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index f9dc597b0b86..d88008c8eb85 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -131,12 +131,17 @@ static inline void prevent_user_access(void __user *to, const void __user *from,
 	kuap_update_sr(mfsrin(addr) | SR_KS, addr, end);	/* set Ks */
 }
 
-static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 {
+	unsigned long begin = regs->kuap & 0xf0000000;
+	unsigned long end = regs->kuap << 28;
+
 	if (!is_write)
 		return false;
 
-	return WARN(!regs->kuap, "Bug: write fault blocked by segment registers !");
+	return WARN(address < begin || address >= end,
+		    "Bug: write fault blocked by segment registers !");
 }
 
 #endif /* CONFIG_PPC_KUAP */
diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index f254de956d6a..dbbd22cb80f5 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -95,7 +95,8 @@ static inline void prevent_user_access(void __user *to, const void __user *from,
 	set_kuap(AMR_KUAP_BLOCKED);
 }
 
-static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 {
 	return WARN(mmu_has_feature(MMU_FTR_RADIX_KUAP) &&
 		    (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 5b5e39643a27..812e66f31934 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -45,7 +45,11 @@ static inline void allow_user_access(void __user *to, const void __user *from,
 				     unsigned long size) { }
 static inline void prevent_user_access(void __user *to, const void __user *from,
 				       unsigned long size) { }
-static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) { return false; }
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
+{
+	return false;
+}
 #endif /* CONFIG_PPC_KUAP */
 
 static inline void allow_read_from_user(const void __user *from, unsigned long size)
diff --git a/arch/powerpc/include/asm/nohash/32/kup-8xx.h b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
index 1006a427e99c..f2fea603b929 100644
--- a/arch/powerpc/include/asm/nohash/32/kup-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
@@ -46,7 +46,8 @@ static inline void prevent_user_access(void __user *to, const void __user *from,
 	mtspr(SPRN_MD_AP, MD_APG_KUAP);
 }
 
-static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 {
 	return WARN(!((regs->kuap ^ MD_APG_KUAP) & 0xf0000000),
 		    "Bug: fault blocked by AP register !");
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b5047f9b5dec..1baeb045f7f4 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -233,7 +233,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
 
 	// Read/write fault in a valid region (the exception table search passed
 	// above), but blocked by KUAP is bad, it can never succeed.
-	if (bad_kuap_fault(regs, is_write))
+	if (bad_kuap_fault(regs, address, is_write))
 		return true;
 
 	// What's left? Kernel fault on user in well defined regions (extable
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/6] powerpc/kuap: Fix set direction in allow/prevent_user_access()
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 2/6] powerpc/32s: Fix bad_kuap_fault() Christophe Leroy
@ 2020-01-22 17:52 ` Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 4/6] powerpc/32s: Drop NULL addr verification Christophe Leroy
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

__builtin_constant_p() always return 0 for pointers, so on RADIX
we always end up opening both direction (by writing 0 in SPR29):

0000000000000170 <._copy_to_user>:
...
 1b0:	4c 00 01 2c 	isync
 1b4:	39 20 00 00 	li      r9,0
 1b8:	7d 3d 03 a6 	mtspr   29,r9
 1bc:	4c 00 01 2c 	isync
 1c0:	48 00 00 01 	bl      1c0 <._copy_to_user+0x50>
			1c0: R_PPC64_REL24	.__copy_tofrom_user
...
0000000000000220 <._copy_from_user>:
...
 2ac:	4c 00 01 2c 	isync
 2b0:	39 20 00 00 	li      r9,0
 2b4:	7d 3d 03 a6 	mtspr   29,r9
 2b8:	4c 00 01 2c 	isync
 2bc:	7f c5 f3 78 	mr      r5,r30
 2c0:	7f 83 e3 78 	mr      r3,r28
 2c4:	48 00 00 01 	bl      2c4 <._copy_from_user+0xa4>
			2c4: R_PPC64_REL24	.__copy_tofrom_user
...

Use an explicit parameter for direction selection, so that GCC
is able to see it is a constant:

00000000000001b0 <._copy_to_user>:
...
 1f0:	4c 00 01 2c 	isync
 1f4:	3d 20 40 00 	lis     r9,16384
 1f8:	79 29 07 c6 	rldicr  r9,r9,32,31
 1fc:	7d 3d 03 a6 	mtspr   29,r9
 200:	4c 00 01 2c 	isync
 204:	48 00 00 01 	bl      204 <._copy_to_user+0x54>
			204: R_PPC64_REL24	.__copy_tofrom_user
...
0000000000000260 <._copy_from_user>:
...
 2ec:	4c 00 01 2c 	isync
 2f0:	39 20 ff ff 	li      r9,-1
 2f4:	79 29 00 04 	rldicr  r9,r9,0,0
 2f8:	7d 3d 03 a6 	mtspr   29,r9
 2fc:	4c 00 01 2c 	isync
 300:	7f c5 f3 78 	mr      r5,r30
 304:	7f 83 e3 78 	mr      r3,r28
 308:	48 00 00 01 	bl      308 <._copy_from_user+0xa8>
			308: R_PPC64_REL24	.__copy_tofrom_user
...

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
---
 arch/powerpc/include/asm/book3s/32/kup.h      | 13 ++++++--
 .../powerpc/include/asm/book3s/64/kup-radix.h | 11 +++----
 arch/powerpc/include/asm/kup.h                | 30 ++++++++++++++-----
 arch/powerpc/include/asm/nohash/32/kup-8xx.h  |  4 +--
 arch/powerpc/include/asm/uaccess.h            |  4 +--
 5 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index d88008c8eb85..d765515bd1c1 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -102,11 +102,13 @@ static inline void kuap_update_sr(u32 sr, u32 addr, u32 end)
 	isync();	/* Context sync required after mtsrin() */
 }
 
-static inline void allow_user_access(void __user *to, const void __user *from, u32 size)
+static __always_inline void allow_user_access(void __user *to, const void __user *from,
+					      u32 size, unsigned long dir)
 {
 	u32 addr, end;
 
-	if (__builtin_constant_p(to) && to == NULL)
+	BUILD_BUG_ON(!__builtin_constant_p(dir));
+	if (!(dir & KUAP_W))
 		return;
 
 	addr = (__force u32)to;
@@ -119,11 +121,16 @@ static inline void allow_user_access(void __user *to, const void __user *from, u
 	kuap_update_sr(mfsrin(addr) & ~SR_KS, addr, end);	/* Clear Ks */
 }
 
-static inline void prevent_user_access(void __user *to, const void __user *from, u32 size)
+static __always_inline void prevent_user_access(void __user *to, const void __user *from,
+						u32 size, unsigned long dir)
 {
 	u32 addr = (__force u32)to;
 	u32 end = min(addr + size, TASK_SIZE);
 
+	BUILD_BUG_ON(!__builtin_constant_p(dir));
+	if (!(dir & KUAP_W))
+		return;
+
 	if (!addr || addr >= TASK_SIZE || !size)
 		return;
 
diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index dbbd22cb80f5..f11315306d41 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -77,20 +77,21 @@ static inline void set_kuap(unsigned long value)
 	isync();
 }
 
-static inline void allow_user_access(void __user *to, const void __user *from,
-				     unsigned long size)
+static __always_inline void allow_user_access(void __user *to, const void __user *from,
+					      unsigned long size, unsigned long dir)
 {
 	// This is written so we can resolve to a single case at build time
-	if (__builtin_constant_p(to) && to == NULL)
+	BUILD_BUG_ON(!__builtin_constant_p(dir));
+	if (dir == KUAP_R)
 		set_kuap(AMR_KUAP_BLOCK_WRITE);
-	else if (__builtin_constant_p(from) && from == NULL)
+	else if (dir == KUAP_W)
 		set_kuap(AMR_KUAP_BLOCK_READ);
 	else
 		set_kuap(0);
 }
 
 static inline void prevent_user_access(void __user *to, const void __user *from,
-				       unsigned long size)
+				       unsigned long size, unsigned long dir)
 {
 	set_kuap(AMR_KUAP_BLOCKED);
 }
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 812e66f31934..ff57bfcb88f7 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -2,6 +2,10 @@
 #ifndef _ASM_POWERPC_KUP_H_
 #define _ASM_POWERPC_KUP_H_
 
+#define KUAP_R		1
+#define KUAP_W		2
+#define KUAP_RW		(KUAP_R | KUAP_W)
+
 #ifdef CONFIG_PPC64
 #include <asm/book3s/64/kup-radix.h>
 #endif
@@ -42,9 +46,9 @@ void setup_kuap(bool disabled);
 #else
 static inline void setup_kuap(bool disabled) { }
 static inline void allow_user_access(void __user *to, const void __user *from,
-				     unsigned long size) { }
+				     unsigned long size, unsigned long dir) { }
 static inline void prevent_user_access(void __user *to, const void __user *from,
-				       unsigned long size) { }
+				       unsigned long size, unsigned long dir) { }
 static inline bool
 bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 {
@@ -54,24 +58,36 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
 
 static inline void allow_read_from_user(const void __user *from, unsigned long size)
 {
-	allow_user_access(NULL, from, size);
+	allow_user_access(NULL, from, size, KUAP_R);
 }
 
 static inline void allow_write_to_user(void __user *to, unsigned long size)
 {
-	allow_user_access(to, NULL, size);
+	allow_user_access(to, NULL, size, KUAP_W);
+}
+
+static inline void allow_read_write_user(void __user *to, const void __user *from,
+					 unsigned long size)
+{
+	allow_user_access(to, from, size, KUAP_RW);
 }
 
 static inline void prevent_read_from_user(const void __user *from, unsigned long size)
 {
-	prevent_user_access(NULL, from, size);
+	prevent_user_access(NULL, from, size, KUAP_R);
 }
 
 static inline void prevent_write_to_user(void __user *to, unsigned long size)
 {
-	prevent_user_access(to, NULL, size);
+	prevent_user_access(to, NULL, size, KUAP_W);
+}
+
+static inline void prevent_read_write_user(void __user *to, const void __user *from,
+					   unsigned long size)
+{
+	prevent_user_access(to, from, size, KUAP_RW);
 }
 
 #endif /* !__ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_KUP_H_ */
+#endif /* _ASM_POWERPC_KUAP_H_ */
diff --git a/arch/powerpc/include/asm/nohash/32/kup-8xx.h b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
index f2fea603b929..1d70c80366fd 100644
--- a/arch/powerpc/include/asm/nohash/32/kup-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
@@ -35,13 +35,13 @@
 #include <asm/reg.h>
 
 static inline void allow_user_access(void __user *to, const void __user *from,
-				     unsigned long size)
+				     unsigned long size, unsigned long dir)
 {
 	mtspr(SPRN_MD_AP, MD_APG_INIT);
 }
 
 static inline void prevent_user_access(void __user *to, const void __user *from,
-				       unsigned long size)
+				       unsigned long size, unsigned long dir)
 {
 	mtspr(SPRN_MD_AP, MD_APG_KUAP);
 }
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index c92fe7fe9692..cafad1960e76 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -313,9 +313,9 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
 	unsigned long ret;
 
 	barrier_nospec();
-	allow_user_access(to, from, n);
+	allow_read_write_user(to, from, n);
 	ret = __copy_tofrom_user(to, from, n);
-	prevent_user_access(to, from, n);
+	prevent_read_write_user(to, from, n);
 	return ret;
 }
 #endif /* __powerpc64__ */
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/6] powerpc/32s: Drop NULL addr verification
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 2/6] powerpc/32s: Fix bad_kuap_fault() Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 3/6] powerpc/kuap: Fix set direction in allow/prevent_user_access() Christophe Leroy
@ 2020-01-22 17:52 ` Christophe Leroy
  2020-01-22 17:52 ` [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end() Christophe Leroy
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

NULL addr is a user address. Don't waste time checking it. If
someone tries to access it, it will SIGFAULT the same way as for
address 1, so no need to make it special.

The special case is when not doing a write, in that case we want
to drop the entire function. This is now handled by 'dir' param
and not by the nulity of 'to' anymore.

Also make beginning of prevent_user_access() similar
to beginning of allow_user_access(), and tell the compiler
that writing in kernel space or with a 0 length is unlikely

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
---
 arch/powerpc/include/asm/book3s/32/kup.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index d765515bd1c1..3c1798e56b55 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -113,7 +113,7 @@ static __always_inline void allow_user_access(void __user *to, const void __user
 
 	addr = (__force u32)to;
 
-	if (!addr || addr >= TASK_SIZE || !size)
+	if (unlikely(addr >= TASK_SIZE || !size))
 		return;
 
 	end = min(addr + size, TASK_SIZE);
@@ -124,16 +124,18 @@ static __always_inline void allow_user_access(void __user *to, const void __user
 static __always_inline void prevent_user_access(void __user *to, const void __user *from,
 						u32 size, unsigned long dir)
 {
-	u32 addr = (__force u32)to;
-	u32 end = min(addr + size, TASK_SIZE);
+	u32 addr, end;
 
 	BUILD_BUG_ON(!__builtin_constant_p(dir));
 	if (!(dir & KUAP_W))
 		return;
 
-	if (!addr || addr >= TASK_SIZE || !size)
+	addr = (__force u32)to;
+
+	if (unlikely(addr >= TASK_SIZE || !size))
 		return;
 
+	end = min(addr + size, TASK_SIZE);
 	current->thread.kuap = 0;
 	kuap_update_sr(mfsrin(addr) | SR_KS, addr, end);	/* set Ks */
 }
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end()
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
                   ` (2 preceding siblings ...)
  2020-01-22 17:52 ` [PATCH v2 4/6] powerpc/32s: Drop NULL addr verification Christophe Leroy
@ 2020-01-22 17:52 ` Christophe Leroy
  2020-01-23 10:59   ` Michael Ellerman
  2020-01-22 17:52 ` [PATCH v2 6/6] powerpc: Implement user_access_begin and friends Christophe Leroy
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

In preparation of implementing user_access_begin and friends
on powerpc, the book3s/32 version of prevent_user_access() need
to be prepared for user_access_end().

user_access_end() doesn't provide the address and size which
were passed to user_access_begin(), required by prevent_user_access()
to know which segment to modify.

The list of segments which where unprotected by allow_user_access()
are available in current->kuap. But we don't want prevent_user_access()
to read this all the time, especially everytime it is 0 (for instance
because the access was not a write access).

Implement a special direction case named KUAP_SELF. In this case only,
the addr and end are retrieved from current->kuap.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
---
 arch/powerpc/include/asm/book3s/32/kup.h | 25 ++++++++++++++++++------
 arch/powerpc/include/asm/kup.h           |  1 +
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 3c1798e56b55..a99fc3428ac9 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -117,6 +117,7 @@ static __always_inline void allow_user_access(void __user *to, const void __user
 		return;
 
 	end = min(addr + size, TASK_SIZE);
+
 	current->thread.kuap = (addr & 0xf0000000) | ((((end - 1) >> 28) + 1) & 0xf);
 	kuap_update_sr(mfsrin(addr) & ~SR_KS, addr, end);	/* Clear Ks */
 }
@@ -127,15 +128,27 @@ static __always_inline void prevent_user_access(void __user *to, const void __us
 	u32 addr, end;
 
 	BUILD_BUG_ON(!__builtin_constant_p(dir));
-	if (!(dir & KUAP_W))
-		return;
 
-	addr = (__force u32)to;
+	if (dir == KUAP_SELF) {
+		u32 kuap = current->thread.kuap;
 
-	if (unlikely(addr >= TASK_SIZE || !size))
-		return;
+		if (unlikely(!kuap))
+			return;
+
+		addr = kuap & 0xf0000000;
+		end = kuap << 28;
+	} else {
+		if (!(dir & KUAP_W))
+			return;
+
+		addr = (__force u32)to;
+
+		if (unlikely(addr >= TASK_SIZE || !size))
+			return;
+
+		end = min(addr + size, TASK_SIZE);
+	}
 
-	end = min(addr + size, TASK_SIZE);
 	current->thread.kuap = 0;
 	kuap_update_sr(mfsrin(addr) | SR_KS, addr, end);	/* set Ks */
 }
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index ff57bfcb88f7..4229e749dcf4 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -5,6 +5,7 @@
 #define KUAP_R		1
 #define KUAP_W		2
 #define KUAP_RW		(KUAP_R | KUAP_W)
+#define KUAP_SELF	4
 
 #ifdef CONFIG_PPC64
 #include <asm/book3s/64/kup-radix.h>
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/6] powerpc: Implement user_access_begin and friends
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
                   ` (3 preceding siblings ...)
  2020-01-22 17:52 ` [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end() Christophe Leroy
@ 2020-01-22 17:52 ` Christophe Leroy
  2020-01-23 12:05   ` Michael Ellerman
       [not found] ` <CAHk-=wgNQ-rWoLg0OCJYYYbKBnRAUK4NPU-OD+vv-6fWnd=8kA@mail.gmail.com>
  2020-01-23 11:56 ` Michael Ellerman
  6 siblings, 1 reply; 19+ messages in thread
From: Christophe Leroy @ 2020-01-22 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Today, when a function like strncpy_from_user() is called,
the userspace access protection is de-activated and re-activated
for every word read.

By implementing user_access_begin and friends, the protection
is de-activated at the beginning of the copy and re-activated at the
end.

Implement user_access_begin(), user_access_end() and
unsafe_get_user(), unsafe_put_user() and unsafe_copy_to_user()

For the time being, we keep user_access_save() and
user_access_restore() as nops.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: no change
---
 arch/powerpc/include/asm/uaccess.h | 92 ++++++++++++++++++++++++++----
 1 file changed, 82 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index cafad1960e76..ea67bbd56bd4 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -91,9 +91,14 @@ static inline int __access_ok(unsigned long addr, unsigned long size,
 	__put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
 
 #define __get_user(x, ptr) \
-	__get_user_nocheck((x), (ptr), sizeof(*(ptr)))
+	__get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)
 #define __put_user(x, ptr) \
-	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), true)
+
+#define __get_user_allowed(x, ptr) \
+	__get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
+#define __put_user_allowed(x, ptr) \
+	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), false)
 
 #define __get_user_inatomic(x, ptr) \
 	__get_user_nosleep((x), (ptr), sizeof(*(ptr)))
@@ -138,10 +143,9 @@ extern long __put_user_bad(void);
 		: "r" (x), "b" (addr), "i" (-EFAULT), "0" (err))
 #endif /* __powerpc64__ */
 
-#define __put_user_size(x, ptr, size, retval)			\
+#define __put_user_size_allowed(x, ptr, size, retval)		\
 do {								\
 	retval = 0;						\
-	allow_write_to_user(ptr, size);				\
 	switch (size) {						\
 	  case 1: __put_user_asm(x, ptr, retval, "stb"); break;	\
 	  case 2: __put_user_asm(x, ptr, retval, "sth"); break;	\
@@ -149,17 +153,26 @@ do {								\
 	  case 8: __put_user_asm2(x, ptr, retval); break;	\
 	  default: __put_user_bad();				\
 	}							\
+} while (0)
+
+#define __put_user_size(x, ptr, size, retval)			\
+do {								\
+	allow_write_to_user(ptr, size);				\
+	__put_user_size_allowed(x, ptr, size, retval);		\
 	prevent_write_to_user(ptr, size);			\
 } while (0)
 
-#define __put_user_nocheck(x, ptr, size)			\
+#define __put_user_nocheck(x, ptr, size, allow)			\
 ({								\
 	long __pu_err;						\
 	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
 	if (!is_kernel_addr((unsigned long)__pu_addr))		\
 		might_fault();					\
 	__chk_user_ptr(ptr);					\
-	__put_user_size((x), __pu_addr, (size), __pu_err);	\
+	if (allow)								\
+		__put_user_size((x), __pu_addr, (size), __pu_err);		\
+	else									\
+		__put_user_size_allowed((x), __pu_addr, (size), __pu_err);	\
 	__pu_err;						\
 })
 
@@ -236,13 +249,12 @@ extern long __get_user_bad(void);
 		: "b" (addr), "i" (-EFAULT), "0" (err))
 #endif /* __powerpc64__ */
 
-#define __get_user_size(x, ptr, size, retval)			\
+#define __get_user_size_allowed(x, ptr, size, retval)		\
 do {								\
 	retval = 0;						\
 	__chk_user_ptr(ptr);					\
 	if (size > sizeof(x))					\
 		(x) = __get_user_bad();				\
-	allow_read_from_user(ptr, size);			\
 	switch (size) {						\
 	case 1: __get_user_asm(x, ptr, retval, "lbz"); break;	\
 	case 2: __get_user_asm(x, ptr, retval, "lhz"); break;	\
@@ -250,6 +262,12 @@ do {								\
 	case 8: __get_user_asm2(x, ptr, retval);  break;	\
 	default: (x) = __get_user_bad();			\
 	}							\
+} while (0)
+
+#define __get_user_size(x, ptr, size, retval)			\
+do {								\
+	allow_read_from_user(ptr, size);			\
+	__get_user_size_allowed(x, ptr, size, retval);		\
 	prevent_read_from_user(ptr, size);			\
 } while (0)
 
@@ -260,7 +278,7 @@ do {								\
 #define __long_type(x) \
 	__typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
 
-#define __get_user_nocheck(x, ptr, size)			\
+#define __get_user_nocheck(x, ptr, size, allow)			\
 ({								\
 	long __gu_err;						\
 	__long_type(*(ptr)) __gu_val;				\
@@ -269,7 +287,10 @@ do {								\
 	if (!is_kernel_addr((unsigned long)__gu_addr))		\
 		might_fault();					\
 	barrier_nospec();					\
-	__get_user_size(__gu_val, __gu_addr, (size), __gu_err);	\
+	if (allow)								\
+		__get_user_size(__gu_val, __gu_addr, (size), __gu_err);		\
+	else									\
+		__get_user_size_allowed(__gu_val, __gu_addr, (size), __gu_err);	\
 	(x) = (__typeof__(*(ptr)))__gu_val;			\
 	__gu_err;						\
 })
@@ -387,6 +408,34 @@ static inline unsigned long raw_copy_to_user(void __user *to,
 	return ret;
 }
 
+static inline unsigned long
+raw_copy_to_user_allowed(void __user *to, const void *from, unsigned long n)
+{
+	unsigned long ret;
+	if (__builtin_constant_p(n) && (n) <= 8) {
+		ret = 1;
+
+		switch (n) {
+		case 1:
+			__put_user_size_allowed(*(u8 *)from, (u8 __user *)to, 1, ret);
+			break;
+		case 2:
+			__put_user_size_allowed(*(u16 *)from, (u16 __user *)to, 2, ret);
+			break;
+		case 4:
+			__put_user_size_allowed(*(u32 *)from, (u32 __user *)to, 4, ret);
+			break;
+		case 8:
+			__put_user_size_allowed(*(u64 *)from, (u64 __user *)to, 8, ret);
+			break;
+		}
+		if (ret == 0)
+			return 0;
+	}
+
+	return __copy_tofrom_user(to, (__force const void __user *)from, n);
+}
+
 static __always_inline unsigned long __must_check
 copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
 {
@@ -428,4 +477,27 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
 extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
 			   size_t len);
 
+static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
+{
+	if (unlikely(!access_ok(ptr, len)))
+		return false;
+	allow_read_write_user((void __user *)ptr, ptr, len);
+	return true;
+}
+#define user_access_begin	user_access_begin
+
+static inline void user_access_end(void)
+{
+	prevent_user_access(NULL, NULL, ~0UL, KUAP_SELF);
+}
+#define user_access_end		user_access_end
+
+static inline unsigned long user_access_save(void) { return 0UL; }
+static inline void user_access_restore(unsigned long flags) { }
+
+#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
+#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user_allowed(x,p),e)
+#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user_allowed(x,p),e)
+#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(raw_copy_to_user_allowed(d,s,l),e)
+
 #endif	/* _ARCH_POWERPC_UACCESS_H */
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
       [not found] ` <CAHk-=wgNQ-rWoLg0OCJYYYbKBnRAUK4NPU-OD+vv-6fWnd=8kA@mail.gmail.com>
@ 2020-01-22 20:00   ` Linus Torvalds
  2020-01-22 20:15     ` Linus Torvalds
  2020-01-22 20:37     ` Linus Torvalds
  0 siblings, 2 replies; 19+ messages in thread
From: Linus Torvalds @ 2020-01-22 20:00 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM

[-- Attachment #1: Type: text/plain, Size: 1630 bytes --]

On Wed, Jan 22, 2020 at 10:24 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Patch looks better, but those names are horrid.

Hmm.

A bit more re-organization also allows us to do the unsafe_put_user()
unconditionally.

In particular, if we remove 'previous' as a pointer from the filldir
data structure, and replace it with 'prev_reclen', then we can do

        prev_reclen = buf->prev_reclen;
        dirent = buf->current_dir;
        prev = (void __user *) dirent - prev_reclen;
        if (!user_access_begin(prev, reclen + prev_reclen))
                goto efault;

and instead of checking that 'previous' pointer for NULL, we just
check prev_reclen for not being zero.

Yes, it replaces a few other

        lastdirent = buf.previous;

with the slightly more complicated

        lastdirent = (void __user *)buf.current_dir - buf.prev_reclen;

but on the whole it makes the _important_ code more streamlined, and
avoids having to have those if-else cases.

Something like the attached.

COMPLETELY UNTESTED! It compiles for me. The generated assembly looks
ok from a quick look.

Christophe, does this work for you on your ppc test-case?

Side note: I think verify_dirent_name() should check that 'len' is in
the appropriate range too, because right now a corrupted filesystem is
only noticed for a zero length. But a negative one, or one where the
reclen calculations would overflow, is not noticed.

Most filesystems have the source of 'len' being something like an
'unsigned char' so that it's pretty bounded anyway, which is likely
why nobody cared when we added that check, but..

               Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 5989 bytes --]

 fs/readdir.c | 78 +++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 40 insertions(+), 38 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index d26d5ea4de7b..ebdb07dd45fe 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -206,7 +206,7 @@ struct linux_dirent {
 struct getdents_callback {
 	struct dir_context ctx;
 	struct linux_dirent __user * current_dir;
-	struct linux_dirent __user * previous;
+	int prev_reclen;
 	int count;
 	int error;
 };
@@ -214,12 +214,13 @@ struct getdents_callback {
 static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		   loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent __user * dirent;
+	struct linux_dirent __user *dirent, *prev;
 	struct getdents_callback *buf =
 		container_of(ctx, struct getdents_callback, ctx);
 	unsigned long d_ino;
 	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
 		sizeof(long));
+	int prev_reclen;
 
 	buf->error = verify_dirent_name(name, namlen);
 	if (unlikely(buf->error))
@@ -232,28 +233,26 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		buf->error = -EOVERFLOW;
 		return -EOVERFLOW;
 	}
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
-		return -EINTR;
-
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
+	prev_reclen = buf->prev_reclen;
 	dirent = buf->current_dir;
+	prev = (void __user *) dirent - prev_reclen;
+	if (!user_access_begin(prev, reclen + prev_reclen))
+		goto efault;
+	if (prev_reclen) {
+		if (unlikely(signal_pending(current))) {
+			user_access_end();
+			return -EINTR;
+		}
+		unsafe_put_user(offset, &prev->d_off, efault_end);
+	}
 	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
 	user_access_end();
 
-	buf->previous = dirent;
-	dirent = (void __user *)dirent + reclen;
-	buf->current_dir = dirent;
+	buf->current_dir = (void __user *)dirent + reclen;
+	buf->prev_reclen = reclen;
 	buf->count -= reclen;
 	return 0;
 efault_end:
@@ -267,7 +266,6 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 		struct linux_dirent __user *, dirent, unsigned int, count)
 {
 	struct fd f;
-	struct linux_dirent __user * lastdirent;
 	struct getdents_callback buf = {
 		.ctx.actor = filldir,
 		.count = count,
@@ -285,8 +283,10 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 	error = iterate_dir(f.file, &buf.ctx);
 	if (error >= 0)
 		error = buf.error;
-	lastdirent = buf.previous;
-	if (lastdirent) {
+	if (buf.prev_reclen) {
+		struct linux_dirent __user * lastdirent;
+		lastdirent = (void __user *)buf.current_dir - buf.prev_reclen;
+
 		if (put_user(buf.ctx.pos, &lastdirent->d_off))
 			error = -EFAULT;
 		else
@@ -299,7 +299,7 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 struct getdents_callback64 {
 	struct dir_context ctx;
 	struct linux_dirent64 __user * current_dir;
-	struct linux_dirent64 __user * previous;
+	int prev_reclen;
 	int count;
 	int error;
 };
@@ -307,11 +307,12 @@ struct getdents_callback64 {
 static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 		     loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent64 __user *dirent;
+	struct linux_dirent64 __user *dirent, *prev;
 	struct getdents_callback64 *buf =
 		container_of(ctx, struct getdents_callback64, ctx);
 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
 		sizeof(u64));
+	int prev_reclen;
 
 	buf->error = verify_dirent_name(name, namlen);
 	if (unlikely(buf->error))
@@ -319,30 +320,30 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
 		return -EINVAL;
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
-		return -EINTR;
-
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
+	prev_reclen = buf->prev_reclen;
 	dirent = buf->current_dir;
+	prev = (void __user *)dirent - prev_reclen;
+	if (!user_access_begin(prev, reclen + prev_reclen))
+		goto efault;
+	if (prev_reclen) {
+		if (unlikely(signal_pending(current))) {
+			user_access_end();
+			return -EINTR;
+		}
+		unsafe_put_user(offset, &prev->d_off, efault_end);
+	}
 	unsafe_put_user(ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, &dirent->d_type, efault_end);
 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
 	user_access_end();
 
-	buf->previous = dirent;
+	buf->prev_reclen = reclen;
 	dirent = (void __user *)dirent + reclen;
 	buf->current_dir = dirent;
 	buf->count -= reclen;
 	return 0;
+
 efault_end:
 	user_access_end();
 efault:
@@ -354,7 +355,6 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 		    unsigned int count)
 {
 	struct fd f;
-	struct linux_dirent64 __user * lastdirent;
 	struct getdents_callback64 buf = {
 		.ctx.actor = filldir64,
 		.count = count,
@@ -372,9 +372,11 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 	error = iterate_dir(f.file, &buf.ctx);
 	if (error >= 0)
 		error = buf.error;
-	lastdirent = buf.previous;
-	if (lastdirent) {
+	if (buf.prev_reclen) {
+		struct linux_dirent64 __user * lastdirent;
 		typeof(lastdirent->d_off) d_off = buf.ctx.pos;
+
+		lastdirent = (void __user *) buf.current_dir - buf.prev_reclen;
 		if (__put_user(d_off, &lastdirent->d_off))
 			error = -EFAULT;
 		else

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-22 20:00   ` [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Linus Torvalds
@ 2020-01-22 20:15     ` Linus Torvalds
  2020-01-22 20:37     ` Linus Torvalds
  1 sibling, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2020-01-22 20:15 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM

On Wed, Jan 22, 2020 at 12:00 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> A bit more re-organization also allows us to do the unsafe_put_user()
> unconditionally.

I meant the "user_access_begin()", of course.

Code was right, explanation was wrong.

That said, with this model, we _could_ make the

        unsafe_put_user(offset, &prev->d_off, efault_end);

be unconditional too, since now 'prev' will actually be a valid
pointer - it will match 'dirent' if there was no prev.

But since we want to test whether we had a previous entry anyway (for
the signal handling latency issue), making the write to the previous
d_reclen unconditional (and then overwriting it the next iteration)
doesn't actually buy us anything.

It was the user_access_begin() I'd rather have unconditional, since
otherwise it gets duplicated in two (very slightly) different versions
and we have unnecessary code bloat.

           Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-22 20:00   ` [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Linus Torvalds
  2020-01-22 20:15     ` Linus Torvalds
@ 2020-01-22 20:37     ` Linus Torvalds
  2020-01-23  6:27       ` Christophe Leroy
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2020-01-22 20:37 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM

[-- Attachment #1: Type: text/plain, Size: 1169 bytes --]

[ Talking to myself ]

On Wed, Jan 22, 2020 at 12:00 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> COMPLETELY UNTESTED! It compiles for me. The generated assembly looks
> ok from a quick look.

Some more testing shows that objtool is unhappy about how we do that
signal_pending(current) inside the user access region.

I didn't notice because my test builds were with sane kernel
configurations so that I could look at the generated code.

But with KASAN enabled, the signal check causes accesses that KASAN
wants to check, and I get

  objtool: filldir()+0x395: call to __kasan_check_read() with UACCESS enabled

warnings.

So that patch of mine isn't acceptable for silly reasons, and the
signal check itself would need to be done outside of the user access
area.

That actually makes the whole "let's do the &prev->d_off setting
unconditionally" much more interesting.

So here's a slightly updated patch that does exactly that, and avoids
the objtool warning.

It actually generates better code than the last one too, because now
we don't duplicate the user_access_end() for the EINTR case.

So test this one instead, please.

                 Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 5997 bytes --]

 fs/readdir.c | 70 +++++++++++++++++++++++++++++-------------------------------
 1 file changed, 34 insertions(+), 36 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index d26d5ea4de7b..35be4ca6b354 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -206,7 +206,7 @@ struct linux_dirent {
 struct getdents_callback {
 	struct dir_context ctx;
 	struct linux_dirent __user * current_dir;
-	struct linux_dirent __user * previous;
+	int prev_reclen;
 	int count;
 	int error;
 };
@@ -214,12 +214,13 @@ struct getdents_callback {
 static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		   loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent __user * dirent;
+	struct linux_dirent __user *dirent, *prev;
 	struct getdents_callback *buf =
 		container_of(ctx, struct getdents_callback, ctx);
 	unsigned long d_ino;
 	int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
 		sizeof(long));
+	int prev_reclen;
 
 	buf->error = verify_dirent_name(name, namlen);
 	if (unlikely(buf->error))
@@ -232,28 +233,24 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
 		buf->error = -EOVERFLOW;
 		return -EOVERFLOW;
 	}
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
+	prev_reclen = buf->prev_reclen;
+	if (prev_reclen && signal_pending(current))
 		return -EINTR;
-
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
 	dirent = buf->current_dir;
+	prev = (void __user *) dirent - prev_reclen;
+	if (!user_access_begin(prev, reclen + prev_reclen))
+		goto efault;
+
+	/* This might be 'dirent->d_off', but if so it will get overwritten */
+	unsafe_put_user(offset, &prev->d_off, efault_end);
 	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
 	user_access_end();
 
-	buf->previous = dirent;
-	dirent = (void __user *)dirent + reclen;
-	buf->current_dir = dirent;
+	buf->current_dir = (void __user *)dirent + reclen;
+	buf->prev_reclen = reclen;
 	buf->count -= reclen;
 	return 0;
 efault_end:
@@ -267,7 +264,6 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 		struct linux_dirent __user *, dirent, unsigned int, count)
 {
 	struct fd f;
-	struct linux_dirent __user * lastdirent;
 	struct getdents_callback buf = {
 		.ctx.actor = filldir,
 		.count = count,
@@ -285,8 +281,10 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 	error = iterate_dir(f.file, &buf.ctx);
 	if (error >= 0)
 		error = buf.error;
-	lastdirent = buf.previous;
-	if (lastdirent) {
+	if (buf.prev_reclen) {
+		struct linux_dirent __user * lastdirent;
+		lastdirent = (void __user *)buf.current_dir - buf.prev_reclen;
+
 		if (put_user(buf.ctx.pos, &lastdirent->d_off))
 			error = -EFAULT;
 		else
@@ -299,7 +297,7 @@ SYSCALL_DEFINE3(getdents, unsigned int, fd,
 struct getdents_callback64 {
 	struct dir_context ctx;
 	struct linux_dirent64 __user * current_dir;
-	struct linux_dirent64 __user * previous;
+	int prev_reclen;
 	int count;
 	int error;
 };
@@ -307,11 +305,12 @@ struct getdents_callback64 {
 static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 		     loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct linux_dirent64 __user *dirent;
+	struct linux_dirent64 __user *dirent, *prev;
 	struct getdents_callback64 *buf =
 		container_of(ctx, struct getdents_callback64, ctx);
 	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
 		sizeof(u64));
+	int prev_reclen;
 
 	buf->error = verify_dirent_name(name, namlen);
 	if (unlikely(buf->error))
@@ -319,30 +318,28 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 	buf->error = -EINVAL;	/* only used if we fail.. */
 	if (reclen > buf->count)
 		return -EINVAL;
-	dirent = buf->previous;
-	if (dirent && signal_pending(current))
+	prev_reclen = buf->prev_reclen;
+	if (prev_reclen && signal_pending(current))
 		return -EINTR;
-
-	/*
-	 * Note! This range-checks 'previous' (which may be NULL).
-	 * The real range was checked in getdents
-	 */
-	if (!user_access_begin(dirent, sizeof(*dirent)))
-		goto efault;
-	if (dirent)
-		unsafe_put_user(offset, &dirent->d_off, efault_end);
 	dirent = buf->current_dir;
+	prev = (void __user *)dirent - prev_reclen;
+	if (!user_access_begin(prev, reclen + prev_reclen))
+		goto efault;
+
+	/* This might be 'dirent->d_off', but if so it will get overwritten */
+	unsafe_put_user(offset, &prev->d_off, efault_end);
 	unsafe_put_user(ino, &dirent->d_ino, efault_end);
 	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
 	unsafe_put_user(d_type, &dirent->d_type, efault_end);
 	unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
 	user_access_end();
 
-	buf->previous = dirent;
+	buf->prev_reclen = reclen;
 	dirent = (void __user *)dirent + reclen;
 	buf->current_dir = dirent;
 	buf->count -= reclen;
 	return 0;
+
 efault_end:
 	user_access_end();
 efault:
@@ -354,7 +351,6 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 		    unsigned int count)
 {
 	struct fd f;
-	struct linux_dirent64 __user * lastdirent;
 	struct getdents_callback64 buf = {
 		.ctx.actor = filldir64,
 		.count = count,
@@ -372,9 +368,11 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 	error = iterate_dir(f.file, &buf.ctx);
 	if (error >= 0)
 		error = buf.error;
-	lastdirent = buf.previous;
-	if (lastdirent) {
+	if (buf.prev_reclen) {
+		struct linux_dirent64 __user * lastdirent;
 		typeof(lastdirent->d_off) d_off = buf.ctx.pos;
+
+		lastdirent = (void __user *) buf.current_dir - buf.prev_reclen;
 		if (__put_user(d_off, &lastdirent->d_off))
 			error = -EFAULT;
 		else

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-22 20:37     ` Linus Torvalds
@ 2020-01-23  6:27       ` Christophe Leroy
  0 siblings, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-23  6:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM



Le 22/01/2020 à 21:37, Linus Torvalds a écrit :
> [ Talking to myself ]
> 
> On Wed, Jan 22, 2020 at 12:00 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> COMPLETELY UNTESTED! It compiles for me. The generated assembly looks
>> ok from a quick look.
> 
> 
> So here's a slightly updated patch that does exactly that, and avoids
> the objtool warning.
> 
> It actually generates better code than the last one too, because now
> we don't duplicate the user_access_end() for the EINTR case.
> 
> So test this one instead, please.

This patch works on my ppc board, thanks

Christophe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end()
  2020-01-22 17:52 ` [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end() Christophe Leroy
@ 2020-01-23 10:59   ` Michael Ellerman
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2020-01-23 10:59 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> In preparation of implementing user_access_begin and friends
> on powerpc, the book3s/32 version of prevent_user_access() need
> to be prepared for user_access_end().
>
> user_access_end() doesn't provide the address and size which
> were passed to user_access_begin(), required by prevent_user_access()
> to know which segment to modify.
>
> The list of segments which where unprotected by allow_user_access()
> are available in current->kuap. But we don't want prevent_user_access()
> to read this all the time, especially everytime it is 0 (for instance
> because the access was not a write access).
>
> Implement a special direction case named KUAP_SELF. In this case only,
> the addr and end are retrieved from current->kuap.

Can we call it KUAP_CURRENT?

ie. "use the KUAP state in current"

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
                   ` (5 preceding siblings ...)
       [not found] ` <CAHk-=wgNQ-rWoLg0OCJYYYbKBnRAUK4NPU-OD+vv-6fWnd=8kA@mail.gmail.com>
@ 2020-01-23 11:56 ` Michael Ellerman
  2020-01-23 12:00   ` Michael Ellerman
  6 siblings, 1 reply; 19+ messages in thread
From: Michael Ellerman @ 2020-01-23 11:56 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Linus Torvalds, Alexander Viro, Andrew Morton
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Hi Christophe,

This patch is independent of the rest of the series AFAICS, and it looks
like Linus has modified it quite a bit down thread.

So I'll take patches 2-6 via powerpc and assume this patch will go via
Linus or Al or elsewhere.

Also a couple of minor spelling fixes below.

cheers

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Some architectures grand full access to userspace regardless of the
                     ^
                     grant
> address/len passed to user_access_begin(), but other architectures
> only grand access to the requested area.
       ^
       grant
>
> For exemple, on 32 bits powerpc (book3s/32), access is granted by
      ^
      example
> segments of 256 Mbytes.
>
> Modify filldir() and filldir64() to request the real area they need
> to get access to, i.e. the area covering the parent dirent (if any)
> and the contiguous current dirent.
>
> Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> v2: have user_access_begin() cover both parent dirent (if any) and current dirent
> ---
>  fs/readdir.c | 50 ++++++++++++++++++++++++++++----------------------
>  1 file changed, 28 insertions(+), 22 deletions(-)
>
> diff --git a/fs/readdir.c b/fs/readdir.c
> index d26d5ea4de7b..3f9b4488d9b7 100644
> --- a/fs/readdir.c
> +++ b/fs/readdir.c
> @@ -214,7 +214,7 @@ struct getdents_callback {
>  static int filldir(struct dir_context *ctx, const char *name, int namlen,
>  		   loff_t offset, u64 ino, unsigned int d_type)
>  {
> -	struct linux_dirent __user * dirent;
> +	struct linux_dirent __user * dirent, *dirent0;
>  	struct getdents_callback *buf =
>  		container_of(ctx, struct getdents_callback, ctx);
>  	unsigned long d_ino;
> @@ -232,19 +232,22 @@ static int filldir(struct dir_context *ctx, const char *name, int namlen,
>  		buf->error = -EOVERFLOW;
>  		return -EOVERFLOW;
>  	}
> -	dirent = buf->previous;
> -	if (dirent && signal_pending(current))
> +	dirent0 = buf->previous;
> +	if (dirent0 && signal_pending(current))
>  		return -EINTR;
>  
> -	/*
> -	 * Note! This range-checks 'previous' (which may be NULL).
> -	 * The real range was checked in getdents
> -	 */
> -	if (!user_access_begin(dirent, sizeof(*dirent)))
> -		goto efault;
> -	if (dirent)
> -		unsafe_put_user(offset, &dirent->d_off, efault_end);
>  	dirent = buf->current_dir;
> +	if (dirent0) {
> +		int sz = (void __user *)dirent + reclen -
> +			 (void __user *)dirent0;
> +
> +		if (!user_access_begin(dirent0, sz))
> +			goto efault;
> +		unsafe_put_user(offset, &dirent0->d_off, efault_end);
> +	} else {
> +		if (!user_access_begin(dirent, reclen))
> +			goto efault;
> +	}
>  	unsafe_put_user(d_ino, &dirent->d_ino, efault_end);
>  	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
>  	unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end);
> @@ -307,7 +310,7 @@ struct getdents_callback64 {
>  static int filldir64(struct dir_context *ctx, const char *name, int namlen,
>  		     loff_t offset, u64 ino, unsigned int d_type)
>  {
> -	struct linux_dirent64 __user *dirent;
> +	struct linux_dirent64 __user *dirent, *dirent0;
>  	struct getdents_callback64 *buf =
>  		container_of(ctx, struct getdents_callback64, ctx);
>  	int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
> @@ -319,19 +322,22 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
>  	buf->error = -EINVAL;	/* only used if we fail.. */
>  	if (reclen > buf->count)
>  		return -EINVAL;
> -	dirent = buf->previous;
> -	if (dirent && signal_pending(current))
> +	dirent0 = buf->previous;
> +	if (dirent0 && signal_pending(current))
>  		return -EINTR;
>  
> -	/*
> -	 * Note! This range-checks 'previous' (which may be NULL).
> -	 * The real range was checked in getdents
> -	 */
> -	if (!user_access_begin(dirent, sizeof(*dirent)))
> -		goto efault;
> -	if (dirent)
> -		unsafe_put_user(offset, &dirent->d_off, efault_end);
>  	dirent = buf->current_dir;
> +	if (dirent0) {
> +		int sz = (void __user *)dirent + reclen -
> +			 (void __user *)dirent0;
> +
> +		if (!user_access_begin(dirent0, sz))
> +			goto efault;
> +		unsafe_put_user(offset, &dirent0->d_off, efault_end);
> +	} else {
> +		if (!user_access_begin(dirent, reclen))
> +			goto efault;
> +	}
>  	unsafe_put_user(ino, &dirent->d_ino, efault_end);
>  	unsafe_put_user(reclen, &dirent->d_reclen, efault_end);
>  	unsafe_put_user(d_type, &dirent->d_type, efault_end);
> -- 
> 2.25.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-23 11:56 ` Michael Ellerman
@ 2020-01-23 12:00   ` Michael Ellerman
  2020-01-23 12:43     ` Christophe Leroy
  2020-01-23 18:38     ` Linus Torvalds
  0 siblings, 2 replies; 19+ messages in thread
From: Michael Ellerman @ 2020-01-23 12:00 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Linus Torvalds, Alexander Viro, Andrew Morton
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Michael Ellerman <mpe@ellerman.id.au> writes:
> Hi Christophe,
>
> This patch is independent of the rest of the series AFAICS

And of course having hit send I immediately realise that's not true.

> So I'll take patches 2-6 via powerpc and assume this patch will go via
> Linus or Al or elsewhere.

So I guess I'll wait and see what happens with patch 1.

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/6] powerpc: Implement user_access_begin and friends
  2020-01-22 17:52 ` [PATCH v2 6/6] powerpc: Implement user_access_begin and friends Christophe Leroy
@ 2020-01-23 12:05   ` Michael Ellerman
  2020-01-23 12:31     ` Michael Ellerman
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Ellerman @ 2020-01-23 12:05 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Today, when a function like strncpy_from_user() is called,
> the userspace access protection is de-activated and re-activated
> for every word read.
>
> By implementing user_access_begin and friends, the protection
> is de-activated at the beginning of the copy and re-activated at the
> end.
>
> Implement user_access_begin(), user_access_end() and
> unsafe_get_user(), unsafe_put_user() and unsafe_copy_to_user()
>
> For the time being, we keep user_access_save() and
> user_access_restore() as nops.

That means we will run with user access enabled in a few more places, but
it's only used sparingly AFAICS:

  kernel/trace/trace_branch.c:    unsigned long flags = user_access_save();
  lib/ubsan.c:    unsigned long flags = user_access_save();
  lib/ubsan.c:    unsigned long ua_flags = user_access_save();
  mm/kasan/common.c:      unsigned long flags = user_access_save();

And we don't have objtool checking that user access enablement isn't
leaking in the first place, so I guess it's OK for us not to implement
these to begin with?

cheers


> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index cafad1960e76..ea67bbd56bd4 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -91,9 +91,14 @@ static inline int __access_ok(unsigned long addr, unsigned long size,
>  	__put_user_check((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
>  
>  #define __get_user(x, ptr) \
> -	__get_user_nocheck((x), (ptr), sizeof(*(ptr)))
> +	__get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)
>  #define __put_user(x, ptr) \
> -	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
> +	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), true)
> +
> +#define __get_user_allowed(x, ptr) \
> +	__get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
> +#define __put_user_allowed(x, ptr) \
> +	__put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), false)
>  
>  #define __get_user_inatomic(x, ptr) \
>  	__get_user_nosleep((x), (ptr), sizeof(*(ptr)))
> @@ -138,10 +143,9 @@ extern long __put_user_bad(void);
>  		: "r" (x), "b" (addr), "i" (-EFAULT), "0" (err))
>  #endif /* __powerpc64__ */
>  
> -#define __put_user_size(x, ptr, size, retval)			\
> +#define __put_user_size_allowed(x, ptr, size, retval)		\
>  do {								\
>  	retval = 0;						\
> -	allow_write_to_user(ptr, size);				\
>  	switch (size) {						\
>  	  case 1: __put_user_asm(x, ptr, retval, "stb"); break;	\
>  	  case 2: __put_user_asm(x, ptr, retval, "sth"); break;	\
> @@ -149,17 +153,26 @@ do {								\
>  	  case 8: __put_user_asm2(x, ptr, retval); break;	\
>  	  default: __put_user_bad();				\
>  	}							\
> +} while (0)
> +
> +#define __put_user_size(x, ptr, size, retval)			\
> +do {								\
> +	allow_write_to_user(ptr, size);				\
> +	__put_user_size_allowed(x, ptr, size, retval);		\
>  	prevent_write_to_user(ptr, size);			\
>  } while (0)
>  
> -#define __put_user_nocheck(x, ptr, size)			\
> +#define __put_user_nocheck(x, ptr, size, allow)			\
>  ({								\
>  	long __pu_err;						\
>  	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
>  	if (!is_kernel_addr((unsigned long)__pu_addr))		\
>  		might_fault();					\
>  	__chk_user_ptr(ptr);					\
> -	__put_user_size((x), __pu_addr, (size), __pu_err);	\
> +	if (allow)								\
> +		__put_user_size((x), __pu_addr, (size), __pu_err);		\
> +	else									\
> +		__put_user_size_allowed((x), __pu_addr, (size), __pu_err);	\
>  	__pu_err;						\
>  })
>  
> @@ -236,13 +249,12 @@ extern long __get_user_bad(void);
>  		: "b" (addr), "i" (-EFAULT), "0" (err))
>  #endif /* __powerpc64__ */
>  
> -#define __get_user_size(x, ptr, size, retval)			\
> +#define __get_user_size_allowed(x, ptr, size, retval)		\
>  do {								\
>  	retval = 0;						\
>  	__chk_user_ptr(ptr);					\
>  	if (size > sizeof(x))					\
>  		(x) = __get_user_bad();				\
> -	allow_read_from_user(ptr, size);			\
>  	switch (size) {						\
>  	case 1: __get_user_asm(x, ptr, retval, "lbz"); break;	\
>  	case 2: __get_user_asm(x, ptr, retval, "lhz"); break;	\
> @@ -250,6 +262,12 @@ do {								\
>  	case 8: __get_user_asm2(x, ptr, retval);  break;	\
>  	default: (x) = __get_user_bad();			\
>  	}							\
> +} while (0)
> +
> +#define __get_user_size(x, ptr, size, retval)			\
> +do {								\
> +	allow_read_from_user(ptr, size);			\
> +	__get_user_size_allowed(x, ptr, size, retval);		\
>  	prevent_read_from_user(ptr, size);			\
>  } while (0)
>  
> @@ -260,7 +278,7 @@ do {								\
>  #define __long_type(x) \
>  	__typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
>  
> -#define __get_user_nocheck(x, ptr, size)			\
> +#define __get_user_nocheck(x, ptr, size, allow)			\
>  ({								\
>  	long __gu_err;						\
>  	__long_type(*(ptr)) __gu_val;				\
> @@ -269,7 +287,10 @@ do {								\
>  	if (!is_kernel_addr((unsigned long)__gu_addr))		\
>  		might_fault();					\
>  	barrier_nospec();					\
> -	__get_user_size(__gu_val, __gu_addr, (size), __gu_err);	\
> +	if (allow)								\
> +		__get_user_size(__gu_val, __gu_addr, (size), __gu_err);		\
> +	else									\
> +		__get_user_size_allowed(__gu_val, __gu_addr, (size), __gu_err);	\
>  	(x) = (__typeof__(*(ptr)))__gu_val;			\
>  	__gu_err;						\
>  })
> @@ -387,6 +408,34 @@ static inline unsigned long raw_copy_to_user(void __user *to,
>  	return ret;
>  }
>  
> +static inline unsigned long
> +raw_copy_to_user_allowed(void __user *to, const void *from, unsigned long n)
> +{
> +	unsigned long ret;
> +	if (__builtin_constant_p(n) && (n) <= 8) {
> +		ret = 1;
> +
> +		switch (n) {
> +		case 1:
> +			__put_user_size_allowed(*(u8 *)from, (u8 __user *)to, 1, ret);
> +			break;
> +		case 2:
> +			__put_user_size_allowed(*(u16 *)from, (u16 __user *)to, 2, ret);
> +			break;
> +		case 4:
> +			__put_user_size_allowed(*(u32 *)from, (u32 __user *)to, 4, ret);
> +			break;
> +		case 8:
> +			__put_user_size_allowed(*(u64 *)from, (u64 __user *)to, 8, ret);
> +			break;
> +		}
> +		if (ret == 0)
> +			return 0;
> +	}
> +
> +	return __copy_tofrom_user(to, (__force const void __user *)from, n);
> +}
> +
>  static __always_inline unsigned long __must_check
>  copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
>  {
> @@ -428,4 +477,27 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>  			   size_t len);
>  
> +static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
> +{
> +	if (unlikely(!access_ok(ptr, len)))
> +		return false;
> +	allow_read_write_user((void __user *)ptr, ptr, len);
> +	return true;
> +}
> +#define user_access_begin	user_access_begin
> +
> +static inline void user_access_end(void)
> +{
> +	prevent_user_access(NULL, NULL, ~0UL, KUAP_SELF);
> +}
> +#define user_access_end		user_access_end
> +
> +static inline unsigned long user_access_save(void) { return 0UL; }
> +static inline void user_access_restore(unsigned long flags) { }
> +
> +#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
> +#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user_allowed(x,p),e)
> +#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user_allowed(x,p),e)
> +#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(raw_copy_to_user_allowed(d,s,l),e)
> +
>  #endif	/* _ARCH_POWERPC_UACCESS_H */
> -- 
> 2.25.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/6] powerpc: Implement user_access_begin and friends
  2020-01-23 12:05   ` Michael Ellerman
@ 2020-01-23 12:31     ` Michael Ellerman
  2020-01-24 11:40       ` Christophe Leroy
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Ellerman @ 2020-01-23 12:31 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm

Michael Ellerman <mpe@ellerman.id.au> writes:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> Today, when a function like strncpy_from_user() is called,
>> the userspace access protection is de-activated and re-activated
>> for every word read.
>>
>> By implementing user_access_begin and friends, the protection
>> is de-activated at the beginning of the copy and re-activated at the
>> end.
>>
>> Implement user_access_begin(), user_access_end() and
>> unsafe_get_user(), unsafe_put_user() and unsafe_copy_to_user()
>>
>> For the time being, we keep user_access_save() and
>> user_access_restore() as nops.
>
> That means we will run with user access enabled in a few more places, but
> it's only used sparingly AFAICS:
>
>   kernel/trace/trace_branch.c:    unsigned long flags = user_access_save();
>   lib/ubsan.c:    unsigned long flags = user_access_save();
>   lib/ubsan.c:    unsigned long ua_flags = user_access_save();
>   mm/kasan/common.c:      unsigned long flags = user_access_save();
>
> And we don't have objtool checking that user access enablement isn't
> leaking in the first place, so I guess it's OK for us not to implement
> these to begin with?

It looks like we can implement them on on all three KUAP
implementations.

For radix and 8xx we just return/set the relevant SPR.

For book3s/32/kup.h I think we'd just need to add a KUAP_CURRENT case to
allow_user_access()?

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-23 12:00   ` Michael Ellerman
@ 2020-01-23 12:43     ` Christophe Leroy
  2020-01-23 18:38     ` Linus Torvalds
  1 sibling, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-23 12:43 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Linus Torvalds, Alexander Viro, Andrew Morton
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm



Le 23/01/2020 à 13:00, Michael Ellerman a écrit :
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Hi Christophe,
>>
>> This patch is independent of the rest of the series AFAICS
> 
> And of course having hit send I immediately realise that's not true.

Without this, book3s/32 fails booting. (And without patch 2, it even 
hangs, looping forever in do_page_fault()).


> 
>> So I'll take patches 2-6 via powerpc and assume this patch will go via
>> Linus or Al or elsewhere.
> 
> So I guess I'll wait and see what happens with patch 1.

We could eventually opt out user_access_begin() for 
CONFIG_PPC_BOOK3S_32, then you could take patches 3 and 6. That's enough 
to have user_access_begin() and stuff for 8xx and RADIX.

Patch 2 should be taken as well as a fix, and can be kept independant of 
the series (once we have patch 1, we normally don't hit the problem 
fixed by patch 2).

Won't don't need patch 4 until we want user_access_begin() supported by 
book3s/32


However, I'm about to send out a v3 with a different approach. It 
modifies the core part where user_access_begin() is returning an opaque 
value used by user_access_end(). And it also tells user_access_begin() 
whether it's a read or a write, so that we can limit unlocking to write 
acccesses on book3s/32, and fine grain rights on book3s/64.

Maybe you would prefer this change on top of first step, in which case 
I'll be able to make a v4 rebasing all this on top of patch 3 and 6 of 
v3 series. Tell me what you prefer.

Christophe

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-23 12:00   ` Michael Ellerman
  2020-01-23 12:43     ` Christophe Leroy
@ 2020-01-23 18:38     ` Linus Torvalds
  2020-01-24 10:42       ` Michael Ellerman
  1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2020-01-23 18:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM

On Thu, Jan 23, 2020 at 4:00 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> So I guess I'll wait and see what happens with patch 1.

I've committed my fixes to filldir[64]() directly - they really were
fixing me being lazy about the range, and the name length checking
really is a theoretical "access wrong user space pointer" issue with
corrupted filesystems regardless (even though I suspect it's entirely
theoretical - even a corrupt filesystem hopefully won't be passing in
negative directory entry lengths or something like that).

The "pass in read/write" part I'm not entirely convinced about.
Honestly, if this is just for ppc32 and nobody else really needs it,
make the ppc32s thing always just enable both user space reads and
writes. That's the semantics for x86 and arm as is, I'm not convinced
that we should complicate this for a legacy platform.

                Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin()
  2020-01-23 18:38     ` Linus Torvalds
@ 2020-01-24 10:42       ` Michael Ellerman
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2020-01-24 10:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Alexander Viro, Andrew Morton, Linux Kernel Mailing List,
	linuxppc-dev, linux-fsdevel, Linux-MM

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Thu, Jan 23, 2020 at 4:00 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> So I guess I'll wait and see what happens with patch 1.
>
> I've committed my fixes to filldir[64]() directly - they really were
> fixing me being lazy about the range, and the name length checking
> really is a theoretical "access wrong user space pointer" issue with
> corrupted filesystems regardless (even though I suspect it's entirely
> theoretical - even a corrupt filesystem hopefully won't be passing in
> negative directory entry lengths or something like that).

Great, thanks.

> The "pass in read/write" part I'm not entirely convinced about.
> Honestly, if this is just for ppc32 and nobody else really needs it,
> make the ppc32s thing always just enable both user space reads and
> writes. That's the semantics for x86 and arm as is, I'm not convinced
> that we should complicate this for a legacy platform.

We can use the read/write info on Power9 too. That's a niche platform
but hopefully not legacy status yet :P

But it's entirely optional, as you say we can just enable read/write if
we aren't passed the read/write info from the upper-level API.

I think our priority should be getting objtool going on powerpc to check
our user access regions are well contained. Once we have that working
maybe then we can look at plumbing the direction through
user_access_begin() etc.

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/6] powerpc: Implement user_access_begin and friends
  2020-01-23 12:31     ` Michael Ellerman
@ 2020-01-24 11:40       ` Christophe Leroy
  0 siblings, 0 replies; 19+ messages in thread
From: Christophe Leroy @ 2020-01-24 11:40 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-kernel, linuxppc-dev, linux-fsdevel, linux-mm



Le 23/01/2020 à 13:31, Michael Ellerman a écrit :
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>> Today, when a function like strncpy_from_user() is called,
>>> the userspace access protection is de-activated and re-activated
>>> for every word read.
>>>
>>> By implementing user_access_begin and friends, the protection
>>> is de-activated at the beginning of the copy and re-activated at the
>>> end.
>>>
>>> Implement user_access_begin(), user_access_end() and
>>> unsafe_get_user(), unsafe_put_user() and unsafe_copy_to_user()
>>>
>>> For the time being, we keep user_access_save() and
>>> user_access_restore() as nops.
>>
>> That means we will run with user access enabled in a few more places, but
>> it's only used sparingly AFAICS:
>>
>>    kernel/trace/trace_branch.c:    unsigned long flags = user_access_save();
>>    lib/ubsan.c:    unsigned long flags = user_access_save();
>>    lib/ubsan.c:    unsigned long ua_flags = user_access_save();
>>    mm/kasan/common.c:      unsigned long flags = user_access_save();
>>
>> And we don't have objtool checking that user access enablement isn't
>> leaking in the first place, so I guess it's OK for us not to implement
>> these to begin with?
> 
> It looks like we can implement them on on all three KUAP
> implementations.
> 
> For radix and 8xx we just return/set the relevant SPR.
> 
> For book3s/32/kup.h I think we'd just need to add a KUAP_CURRENT case to
> allow_user_access()?

Can't do that, we don't want to keep the info in current->thread.kuap 
after user_access_save(), otherwise we might unexpectedly re-open access 
through an interrupt.

And if we use KUAP_CURRENT case of prevent_user_access(), it means we'll 
read current->thread.kuap twice.

So, just regenerate addr and end from the flags, and use 
allow_user_access() and prevent_user_access() as usual.

I'll have it in v4

Christophe

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-01-24 11:41 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-22 17:52 [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Christophe Leroy
2020-01-22 17:52 ` [PATCH v2 2/6] powerpc/32s: Fix bad_kuap_fault() Christophe Leroy
2020-01-22 17:52 ` [PATCH v2 3/6] powerpc/kuap: Fix set direction in allow/prevent_user_access() Christophe Leroy
2020-01-22 17:52 ` [PATCH v2 4/6] powerpc/32s: Drop NULL addr verification Christophe Leroy
2020-01-22 17:52 ` [PATCH v2 5/6] powerpc/32s: prepare prevent_user_access() for user_access_end() Christophe Leroy
2020-01-23 10:59   ` Michael Ellerman
2020-01-22 17:52 ` [PATCH v2 6/6] powerpc: Implement user_access_begin and friends Christophe Leroy
2020-01-23 12:05   ` Michael Ellerman
2020-01-23 12:31     ` Michael Ellerman
2020-01-24 11:40       ` Christophe Leroy
     [not found] ` <CAHk-=wgNQ-rWoLg0OCJYYYbKBnRAUK4NPU-OD+vv-6fWnd=8kA@mail.gmail.com>
2020-01-22 20:00   ` [PATCH v2 1/6] fs/readdir: Fix filldir() and filldir64() use of user_access_begin() Linus Torvalds
2020-01-22 20:15     ` Linus Torvalds
2020-01-22 20:37     ` Linus Torvalds
2020-01-23  6:27       ` Christophe Leroy
2020-01-23 11:56 ` Michael Ellerman
2020-01-23 12:00   ` Michael Ellerman
2020-01-23 12:43     ` Christophe Leroy
2020-01-23 18:38     ` Linus Torvalds
2020-01-24 10:42       ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).