sparclinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCHSET] VM_FAULT_RETRY fixes
@ 2023-01-31 20:02 Al Viro
  2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
                   ` (11 more replies)
  0 siblings, 12 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:02 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

	Page fault handlers generally react to VM_FAULT_RETRY from
handle_mm_fault() by repeating the whole thing (starting with locking
mmap) with FAULT_FLAG_TRIED added to flags.

	However, there are two cases when that's not the right thing
to do:

1) fault has happened in userland and we have a pending signal.  In that
case we'd better return from fault handler immediately.

2) fault has happened in kernel (e.g. in something like copy_from_user())
and we have a pending *fatal* signal.  Solution is to handle that as if
handle_mm_fault() had failed; we have come from kernel mode, so we'd
better have an exception table entry for the fauling insn.  Find it
and deal with it; from the copy_from_user() (or whatever it was that
triggered our fault) caller's POV it's indistinguishable from running
into an unmapped area, so it will fail.  The process is not going to
survive anyway.

Quietly returning from #PF handler in the second case is asking for
livelock - one common case when handle_mm_fault() returns VM_FAULT_RETRY
is when it needs to wait for page lock and gets hit by a fatal signal.
Running into that in any uaccess primitive will end up repeating the
faulting insn again and again, as long as we hit that case in
handle_mm_fault().  Eventually it might get out (e.g. trylock
manages to get page lock without hitting the "wait for it" codepath),
but it's obviously not a good situation.

On x86 it had been noticed and fixed back in 2014, in 26178ec11ef3 "x86:
mm: consolidate VM_FAULT_RETRY handling".  Some of the other architectures
had it dealt with later - e.g. arm in 2017, the fix is 746a272e44141
"ARM: 8692/1: mm: abort uaccess retries upon fatal signal"; xtensa -
in 2021, the fix is 7b9acbb6aad4f "xtensa: fix uaccess-related livelock
in do_page_fault", etc.

However, it never had been done on a bunch of architectures - the
current mainline still has that bug on alpha, hexagon, itanic, m68k,
microblaze, nios2, openrisc, parisc, riscv and sparc (both sparc32 and
sparc64).  Fixes are trivial, but I've no way to test them for most
of those architectures.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 01/10] alpha: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
@ 2023-01-31 20:03 ` Al Viro
  2023-03-07  0:48   ` patchwork-bot+linux-riscv
  2023-01-31 20:03 ` [PATCH 02/10] hexagon: " Al Viro
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:03 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

alpha equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index ef427a6bdd1a..7b01ae4f3bc6 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -152,8 +152,11 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
 	   the fault.  */
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 02/10] hexagon: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
  2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
@ 2023-01-31 20:03 ` Al Viro
  2023-02-10  2:59   ` Brian Cain
  2023-01-31 20:04 ` [PATCH 03/10] ia64: " Al Viro
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:03 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

hexagon equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/hexagon/mm/vm_fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
index f73c7cbfe326..4b578d02fd01 100644
--- a/arch/hexagon/mm/vm_fault.c
+++ b/arch/hexagon/mm/vm_fault.c
@@ -93,8 +93,11 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
 
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 03/10] ia64: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
  2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
  2023-01-31 20:03 ` [PATCH 02/10] hexagon: " Al Viro
@ 2023-01-31 20:04 ` Al Viro
  2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:04 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

ia64 equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/ia64/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index ef78c2d66cdd..85c4d9ac8686 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -136,8 +136,11 @@ ia64_do_page_fault (unsigned long address, unsigned long isr, struct pt_regs *re
 	 */
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 04/10] m68k: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (2 preceding siblings ...)
  2023-01-31 20:04 ` [PATCH 03/10] ia64: " Al Viro
@ 2023-01-31 20:04 ` Al Viro
  2023-02-05  6:18   ` Finn Thain
  2023-02-06 12:08   ` Geert Uytterhoeven
  2023-01-31 20:05 ` [PATCH 05/10] microblaze: " Al Viro
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:04 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

m68k equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 4d2837eb3e2a..228128e45c67 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -138,8 +138,11 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	fault = handle_mm_fault(vma, address, flags, regs);
 	pr_debug("handle_mm_fault returns %x\n", fault);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return 0;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 05/10] microblaze: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (3 preceding siblings ...)
  2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
@ 2023-01-31 20:05 ` Al Viro
  2023-01-31 20:05 ` [PATCH 06/10] nios2: " Al Viro
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:05 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

microblaze equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/microblaze/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
index 5c40c3ebe52f..32d9717039ba 100644
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -219,8 +219,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long address,
 	 */
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 06/10] nios2: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (4 preceding siblings ...)
  2023-01-31 20:05 ` [PATCH 05/10] microblaze: " Al Viro
@ 2023-01-31 20:05 ` Al Viro
  2023-01-31 20:06 ` [PATCH 07/10] openrisc: " Al Viro
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:05 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

nios2 equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/nios2/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index edaca0a6c1c1..ca64eccea551 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -136,8 +136,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 	 */
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 07/10] openrisc: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (5 preceding siblings ...)
  2023-01-31 20:05 ` [PATCH 06/10] nios2: " Al Viro
@ 2023-01-31 20:06 ` Al Viro
  2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:06 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

openrisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/openrisc/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index b4762d66e9ef..6734fee3134f 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -162,8 +162,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 08/10] parisc: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (6 preceding siblings ...)
  2023-01-31 20:06 ` [PATCH 07/10] openrisc: " Al Viro
@ 2023-01-31 20:06 ` Al Viro
  2023-02-06 16:58   ` Helge Deller
  2023-02-28 15:22   ` Guenter Roeck
  2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:06 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/parisc/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
index 869204e97ec9..bb30ff6a3e19 100644
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -308,8 +308,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
 
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 09/10] riscv: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (7 preceding siblings ...)
  2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
@ 2023-01-31 20:06 ` Al Viro
  2023-02-06 20:06   ` Björn Töpel
  2023-02-07 16:11   ` Geert Uytterhoeven
  2023-01-31 20:07 ` [PATCH 10/10] sparc: " Al Viro
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:06 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

riscv equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/riscv/mm/fault.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index d86f7cebd4a7..c91d85349d39 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -324,8 +324,11 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 	 * signal first. We do not need to release the mmap_lock because it
 	 * would already be released in __lock_page_or_retry in mm/filemap.c.
 	 */
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			no_context(regs, addr);
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 10/10] sparc: fix livelock in uaccess
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (8 preceding siblings ...)
  2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
@ 2023-01-31 20:07 ` Al Viro
  2023-01-31 20:24 ` [RFC][PATCHSET] VM_FAULT_RETRY fixes Linus Torvalds
  2023-02-01 10:50 ` Mark Rutland
  11 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-01-31 20:07 UTC (permalink / raw)
  To: linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

sparc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
to page tables.  In such case we must *not* return to the faulting insn -
that would repeat the entire thing without making any progress; what we need
instead is to treat that as failed (user) memory access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sparc/mm/fault_32.c | 5 ++++-
 arch/sparc/mm/fault_64.c | 7 ++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index 91259f291c54..179295b14664 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -187,8 +187,11 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 	 */
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!from_user)
+			goto no_context;
 		return;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index 4acc12eafbf5..d91305de694c 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -424,8 +424,13 @@ asmlinkage void __kprobes do_sparc64_fault(struct pt_regs *regs)
 
 	fault = handle_mm_fault(vma, address, flags, regs);
 
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (regs->tstate & TSTATE_PRIV) {
+			insn = get_fault_insn(regs, insn);
+			goto handle_kernel_fault;
+		}
 		goto exit_exception;
+	}
 
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (9 preceding siblings ...)
  2023-01-31 20:07 ` [PATCH 10/10] sparc: " Al Viro
@ 2023-01-31 20:24 ` Linus Torvalds
  2023-01-31 21:10   ` Al Viro
  2023-02-01 10:50 ` Mark Rutland
  11 siblings, 1 reply; 40+ messages in thread
From: Linus Torvalds @ 2023-01-31 20:24 UTC (permalink / raw)
  To: Al Viro, Peter Xu
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux

[ Adding Peter Xu to the participants, he did a lot of "add generic
helpers" code a few years ago. Peter, see

    https://lore.kernel.org/linux-arch/Y9lz6yk113LmC9SI@ZenIV/

  for context ]

On Tue, Jan 31, 2023 at 12:03 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> However, it never had been done on a bunch of architectures - the
> current mainline still has that bug on alpha, hexagon, itanic, m68k,
> microblaze, nios2, openrisc, parisc, riscv and sparc (both sparc32 and
> sparc64).  Fixes are trivial, but I've no way to test them for most
> of those architectures.

Gaah. The patches look simple and "trivially correct", but I hate how
we duplicate basically the same (complicated) logic across
architectures.

Peter improved a bit on things, but I think we could do better.

Yes, all architectures basically have their own special code for
special error registers etc, and for various errata and/or special
handling for vmalloc addresses of vdso's etc.

And they aren't always translated to the generic flags, looking at
alpha, for example, we have code like this:

        si_code = SEGV_ACCERR;
        if (cause < 0) {
                if (!(vma->vm_flags & VM_EXEC))
                        goto bad_area;
        } else if (!cause) {
                /* Allow reads even for write-only mappings */
                if (!(vma->vm_flags & (VM_READ | VM_WRITE)))
                        goto bad_area;
        } else {
                if (!(vma->vm_flags & VM_WRITE))
                        goto bad_area;
                flags |= FAULT_FLAG_WRITE;
        }

because it uses the alpha internal "cause < 0" logic, instead of
having translated it into FAULT_FLAG_INSTRUCTION.

But *if* the alpha code were to just translate it into the
FAULT_FLAG_xyz namespace, apretty much *all* of the alpha
do_page_fault() could have been then done by a completely generic
"generic_page_fault()" that has all of the retry logic, all of the
si_code logic, etc etc.

So in a perfect world, we'd only have the special errata handling (*)
and the "translate to FAULT_FLAG_xyz" code in the architecture code,
and then just call that generic_page_fault() function for what really
is pretty much generic.

And then we would never again have these kinds of "architecture got
the retry wrong" issues.

Would anybody be interested in trying that? Just converting one or two
architectures to a new world order?

But if not, Al's patches all look "obviously fine" to me, but that's
because they basically all do the same thing except for that odd
special TSTATE_PRIV thing for sparc-64 - why can't that use
'!user_mode(regs)' like everybody else?)

                   Linus

(*) we really have a *lot* of architectures that have gotten
'prefetch' wrong and have errata for prefetch: alpha, arm and x86 all
have magic "bogus fault on prefetch" cases.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 20:24 ` [RFC][PATCHSET] VM_FAULT_RETRY fixes Linus Torvalds
@ 2023-01-31 21:10   ` Al Viro
  2023-01-31 21:19     ` Linus Torvalds
  0 siblings, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-01-31 21:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Xu, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Tue, Jan 31, 2023 at 12:24:54PM -0800, Linus Torvalds wrote:

> But *if* the alpha code were to just translate it into the
> FAULT_FLAG_xyz namespace, apretty much *all* of the alpha
> do_page_fault() could have been then done by a completely generic
> "generic_page_fault()" that has all of the retry logic, all of the
> si_code logic, etc etc.

Umm...  What about the semantics of get_user() of unmapped address?
Some architectures do quiet EFAULT; some (including alpha) hit
the sucker with SIGBUS, no matter what.

Examples:
arm
        if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
		return 0;

	/*
	 * If we are in kernel mode at this point, we
	 * have no context to handle this fault with.
	 */
	if (!user_mode(regs))
		goto no_context;
	...
alpha
 do_sigbus:
        mmap_read_unlock(mm);
        /* Send a sigbus, regardless of whether we were in kernel
           or user mode.  */
        force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *) address);
        if (!user_mode(regs))
                goto no_context;
        return;

Are we free to modify that behaviour, or is that part of arch-specific
ABI?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 21:10   ` Al Viro
@ 2023-01-31 21:19     ` Linus Torvalds
  2023-01-31 21:49       ` Al Viro
  2023-02-01  8:21       ` Helge Deller
  0 siblings, 2 replies; 40+ messages in thread
From: Linus Torvalds @ 2023-01-31 21:19 UTC (permalink / raw)
  To: Al Viro
  Cc: Peter Xu, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Tue, Jan 31, 2023 at 1:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Umm...  What about the semantics of get_user() of unmapped address?
> Some architectures do quiet EFAULT; some (including alpha) hit
> the sucker with SIGBUS, no matter what.

I think we should strive to just make this all common.

The reason alpha is different is almost certainly not intentional, but
a combination of "pure accident" and "nobody actually cares".

> Are we free to modify that behaviour, or is that part of arch-specific
> ABI?

I'd just unify this all, probably with a preference for existing
semantics on x86 (because of "biggest and most varied user base").

That whole "send SIGBUS even for kernel faults" is certainly bogus and
against the usual rules. And I may well be to blame for it (I have
this memory of disliking how EFAULT as a return code didn't actually
return the faulting address). And realistically, it's also just not
something that any normal application will ever hit.  Giving invalid
addresses to system calls is basically always a bug, although there
are always special software that do all the crazy corner cases (ie
things like emulators tend to do odd things).

I doubt such special software exists on Linux/alpha, though.

So I wouldn't worry about those kinds of oddities overmuch.

*If* somebody then finds a load that cares, we can always fix it
later, and I'll go "mea culpa, I didn't think it would matter, and I
was wrong".

         Linus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 21:19     ` Linus Torvalds
@ 2023-01-31 21:49       ` Al Viro
  2023-02-01  0:00         ` Linus Torvalds
  2023-02-01  8:21       ` Helge Deller
  1 sibling, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-01-31 21:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Xu, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Tue, Jan 31, 2023 at 01:19:59PM -0800, Linus Torvalds wrote:
> On Tue, Jan 31, 2023 at 1:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Umm...  What about the semantics of get_user() of unmapped address?
> > Some architectures do quiet EFAULT; some (including alpha) hit
> > the sucker with SIGBUS, no matter what.
> 
> I think we should strive to just make this all common.
> 
> The reason alpha is different is almost certainly not intentional, but
> a combination of "pure accident" and "nobody actually cares".
> 
> > Are we free to modify that behaviour, or is that part of arch-specific
> > ABI?
> 
> I'd just unify this all, probably with a preference for existing
> semantics on x86 (because of "biggest and most varied user base").
> 
> That whole "send SIGBUS even for kernel faults" is certainly bogus and
> against the usual rules. And I may well be to blame for it (I have
> this memory of disliking how EFAULT as a return code didn't actually
> return the faulting address). And realistically, it's also just not
> something that any normal application will ever hit.  Giving invalid
> addresses to system calls is basically always a bug, although there
> are always special software that do all the crazy corner cases (ie
> things like emulators tend to do odd things).
> 
> I doubt such special software exists on Linux/alpha, though.
> 
> So I wouldn't worry about those kinds of oddities overmuch.
> 
> *If* somebody then finds a load that cares, we can always fix it
> later, and I'll go "mea culpa, I didn't think it would matter, and I
> was wrong".

FWIW, from digging through the current tree:

alpha, openrisc, sparc and xtensa send SIGBUS.
m68k: not sure, do_page_fault() callers there are delicate.
mips: really interesting -
        /* Kernel mode? Handle exceptions or die */
        if (!user_mode(regs))
                goto no_context;

        /*
         * Send a sigbus, regardless of whether we were in kernel
         * or user mode.
... which is obviously a rudiment of SIGBUS variant, but nowadays
it's EFAULT.

Everything else seems to be going with EFAULT.

PS: mips used to be SIGBUS, until this
commit 1d50e5e7a6e0325b1a652c4be296a71dc54a6e96
Author: Andrew Morton <akpm@osdl.org>
Date:   Fri Feb 20 01:33:18 2004 -0800

    [PATCH] MIPS mega-patch
    
    From: Ralf Baechle <ralf@linux-mips.org>
    
    Below following 125547 lines of patches, all to arch/mips and
    include/asm-mips.  I'm going to send the remaining stuff of which the one
    or other bit may need to be discussed in smaller bits.

IOW, details are buried somewhere in historical mips tree, assuming
it survives...

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 21:49       ` Al Viro
@ 2023-02-01  0:00         ` Linus Torvalds
  2023-02-01 19:48           ` Peter Xu
  0 siblings, 1 reply; 40+ messages in thread
From: Linus Torvalds @ 2023-02-01  0:00 UTC (permalink / raw)
  To: Al Viro
  Cc: Peter Xu, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Tue, Jan 31, 2023 at 1:49 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Everything else seems to be going with EFAULT.

So I think fo kernel faults it's always basically up to the exception
handler, and sending a signal regardless of that is just wrong.

Of course, an exception handler might choose to send a signal, but it
just shouldn't be the do_page_fault() handler that does it.

For user faults, I think the rule ends up being that "if there's no
mapping, or if there is a protection fault, then we should do
SIGSEGV".

If there's an actual mapping in place, and the mapping has the right
permissions for the access, but fails for some *other* reason, then we
send SIGBUS.

So a shared mmap past the end of the file (but within the area that
was used for mmap) would SIGBUS, while a write to a read-only mapping
would SIGSEGV.

Things like unaligned accesses also might SIGBUS rather than SIGSEGV.

And I'm not surprised that there are exceptions to the rule, because
almost nothing really cares. In most situations, it's a fatal error.
And even when catching them, the end result is generally the same
(either "print out helpful message and die", or "longjump to some safe
code").

So most of the time it's probably not going to matter all that much
which signal gets sent in practice.

            Linus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 21:19     ` Linus Torvalds
  2023-01-31 21:49       ` Al Viro
@ 2023-02-01  8:21       ` Helge Deller
  2023-02-01 19:51         ` Linus Torvalds
  1 sibling, 1 reply; 40+ messages in thread
From: Helge Deller @ 2023-02-01  8:21 UTC (permalink / raw)
  To: Linus Torvalds, Al Viro
  Cc: Peter Xu, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On 1/31/23 22:19, Linus Torvalds wrote:
> On Tue, Jan 31, 2023 at 1:10 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>>
>> Umm...  What about the semantics of get_user() of unmapped address?
>> Some architectures do quiet EFAULT; some (including alpha) hit
>> the sucker with SIGBUS, no matter what.
>
> I think we should strive to just make this all common.
>
> The reason alpha is different is almost certainly not intentional, but
> a combination of "pure accident" and "nobody actually cares".
>
>> Are we free to modify that behaviour, or is that part of arch-specific
>> ABI?
>
> I'd just unify this all, probably with a preference for existing
> semantics on x86 (because of "biggest and most varied user base").
>
> That whole "send SIGBUS even for kernel faults" is certainly bogus and
> against the usual rules. And I may well be to blame for it (I have
> this memory of disliking how EFAULT as a return code didn't actually
> return the faulting address). And realistically, it's also just not
> something that any normal application will ever hit.  Giving invalid
> addresses to system calls is basically always a bug, although there
> are always special software that do all the crazy corner cases (ie
> things like emulators tend to do odd things).
>
> I doubt such special software exists on Linux/alpha, though.
>
> So I wouldn't worry about those kinds of oddities overmuch.
>
> *If* somebody then finds a load that cares, we can always fix it
> later, and I'll go "mea culpa, I didn't think it would matter, and I
> was wrong".

AFAICS, the only applications which really care about the return
code are
- testsuites like LTP (i.e. the fstat05 testcase)
- some (not just debian) packages execute tests at the end of the
   build process and verify the return codes, i.e. libsigsegv.

The differences between the architectures is currently often taken care
of via #ifdefs, so if the return code is harmonized across platforms
it would at least help to simplify such testcases.

Helge

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
                   ` (10 preceding siblings ...)
  2023-01-31 20:24 ` [RFC][PATCHSET] VM_FAULT_RETRY fixes Linus Torvalds
@ 2023-02-01 10:50 ` Mark Rutland
  2023-02-06 12:08   ` Geert Uytterhoeven
  11 siblings, 1 reply; 40+ messages in thread
From: Mark Rutland @ 2023-02-01 10:50 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Tue, Jan 31, 2023 at 08:02:51PM +0000, Al Viro wrote:
> On x86 it had been noticed and fixed back in 2014, in 26178ec11ef3 "x86:
> mm: consolidate VM_FAULT_RETRY handling".  Some of the other architectures
> had it dealt with later - e.g. arm in 2017, the fix is 746a272e44141
> "ARM: 8692/1: mm: abort uaccess retries upon fatal signal"; xtensa -
> in 2021, the fix is 7b9acbb6aad4f "xtensa: fix uaccess-related livelock
> in do_page_fault", etc.
> 
> However, it never had been done on a bunch of architectures - the
> current mainline still has that bug on alpha, hexagon, itanic, m68k,
> microblaze, nios2, openrisc, parisc, riscv and sparc (both sparc32 and
> sparc64).  Fixes are trivial, but I've no way to test them for most
> of those architectures.

FWIW, when I fixed arm and arm64 back in 2017, I did report the issue here with
a test case (and again in 2021, with maintainers all explciitly Cc'd):

  https://lore.kernel.org/lkml/20170822102527.GA14671@leverpostej/
  https://lore.kernel.org/linux-arch/20210121123140.GD48431@C02TD0UTHF1T.local/

... so if anyone has access to those architectures, that test might be useful
for verifying the fix.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01  0:00         ` Linus Torvalds
@ 2023-02-01 19:48           ` Peter Xu
  2023-02-01 22:18             ` Al Viro
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Xu @ 2023-02-01 19:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Tue, Jan 31, 2023 at 04:00:22PM -0800, Linus Torvalds wrote:
> So most of the time it's probably not going to matter all that much
> which signal gets sent in practice.

I do also see a common pattern of the possibility to have a generic fault
handler like generic_page_fault().

It probably should start with taking the mmap_sem until providing some
retval that is much easier to digest further by the arch-dependent code, so
it can directly do something rather than parsing the bitmask in a
duplicated way (hence the new retval should hopefully not a bitmask anymore
but a "what to do").

Maybe it can be something like:

/**
 * enum page_fault_retval - Higher level fault retval, generalized from
 * vm_fault_reason above that is only used by hardware page fault handlers.
 * It generalizes the bitmask-versioned retval into something that the arch
 * dependent code should react upon.
 *
 * @PF_RET_COMPLETED:		The page fault is completed successfully
 * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
 *				(e.g., vma not found, expand_stack() fails..)
 * @PF_RET_ACCESS_ERR:		The page fault has access errors
 *				(e.g., write fault on !VM_WRITE vmas)
 * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
 *				(e.g., during copy_to_user() but fault failed?)
 * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
 * @PF_RET_SIGNAL:		The page fault encountered poisoned pages
 * ...
 */
enum page_fault_retval {
	PF_RET_DONE = 0,
	PF_RET_BAD_AREA,
	PF_RET_ACCESS_ERR,
	PF_RET_KERN_FIXUP,
        PF_RET_HWPOISON,
        PF_RET_SIGNAL,
	...
};

As a start we may still want to return some more information (perhaps still
the vm_fault_t alongside?  Or another union that will provide different
information based on different PF_RET_*).  One major thing is I see how we
handle VM_FAULT_HWPOISON and also the fact that we encode something more
into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).

So the generic helper could, hopefully, hide the complexity of:

  - Taking and releasing of mmap lock
  - find_vma(), and also relevant checks on access or stack handling
  - handle_mm_fault() itself (of course...)
  - detect signals
  - handle page fault retries (so, in the new layer of retval there should
    have nothing telling it to retry; it should always be the ultimate result)
  - parse different errors into "what the arch code should do", and
    generalize the common ones, e.g.
    - OOM, do pagefault_out_of_memory() for user-mode
    - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
    - ...

It'll simplify things if we can unify some small details like whether the
-EFAULT above should contain a sigbus.

A trivial detail I found when I was looking at this is, x86_64 passes in
different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
there're three call sites and each of them pass over a differerent signal.
IIUC that will only make a difference if there's a nested page fault during
the vsyscall emulation (but I may be wrong too because I'm new to this
code), and I have no idea when it'll happen and whether that needs to be
strictly followed.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01  8:21       ` Helge Deller
@ 2023-02-01 19:51         ` Linus Torvalds
  0 siblings, 0 replies; 40+ messages in thread
From: Linus Torvalds @ 2023-02-01 19:51 UTC (permalink / raw)
  To: Helge Deller
  Cc: Al Viro, Peter Xu, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Wed, Feb 1, 2023 at 12:21 AM Helge Deller <deller@gmx.de> wrote:
>
> AFAICS, the only applications which really care about the return
> code are
> - testsuites like LTP (i.e. the fstat05 testcase)

Those have actually shown issues with various library implementations,
exactly because real system calls act very differently in this area
from library wrappers.

Things like the vdso implementation of gettimeofday() get a SIGSEGV if
the timeval or timezone pointer is invalid, while the "real system
call" version returns -1/EFAULT instead.

And very similar things happen when glibc ends up wrapping system
calls and converting buffers manually. At some point, glibc had a
special 'struct stat' and basically converted the native system call
to it, so you did 'stat()' on something, and it ended up actually
using a private on-stack buffer for the system call, followed by a
"convert that kernel 'struct stat' to the glibc 'struct stat'" phase.
So once again, instead of -1/EFAULT, you'd first have a successful
system call, and then get a SIGSEGV  in glibc.

And as you say, test suites would notice. But no actual normal app
would ever care.

Of course, there's always the abnormal apps. There _are_ the odd cases
that actually catch faults and fix them up, and can then be confused
by changes like that.

It's very very rare, but it happens - things like emulators do tend to
do some really strange things.

         Linus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01 19:48           ` Peter Xu
@ 2023-02-01 22:18             ` Al Viro
  2023-02-02  0:57               ` Al Viro
  2023-02-02 22:56               ` Peter Xu
  0 siblings, 2 replies; 40+ messages in thread
From: Al Viro @ 2023-02-01 22:18 UTC (permalink / raw)
  To: Peter Xu
  Cc: Linus Torvalds, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Wed, Feb 01, 2023 at 02:48:22PM -0500, Peter Xu wrote:

> I do also see a common pattern of the possibility to have a generic fault
> handler like generic_page_fault().
> 
> It probably should start with taking the mmap_sem until providing some
> retval that is much easier to digest further by the arch-dependent code, so
> it can directly do something rather than parsing the bitmask in a
> duplicated way (hence the new retval should hopefully not a bitmask anymore
> but a "what to do").
> 
> Maybe it can be something like:
> 
> /**
>  * enum page_fault_retval - Higher level fault retval, generalized from
>  * vm_fault_reason above that is only used by hardware page fault handlers.
>  * It generalizes the bitmask-versioned retval into something that the arch
>  * dependent code should react upon.
>  *
>  * @PF_RET_COMPLETED:		The page fault is completed successfully
>  * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
>  *				(e.g., vma not found, expand_stack() fails..)

FWIW, there's a fun discrepancy - VM_FAULT_SIGSEGV may yield SEGV_MAPERR
or SEGV_ACCERR; depends upon the architecture.  Not that there'd been
many places that return VM_FAULT_SIGSEGV these days...  Good thing, too,
since otherwise e.g. csky would oops...

>  * @PF_RET_ACCESS_ERR:		The page fault has access errors
>  *				(e.g., write fault on !VM_WRITE vmas)
>  * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
>  *				(e.g., during copy_to_user() but fault failed?)
>  * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
>  * @PF_RET_SIGNAL:		The page fault encountered poisoned pages

??

>  * ...
>  */
> enum page_fault_retval {
> 	PF_RET_DONE = 0,
> 	PF_RET_BAD_AREA,
> 	PF_RET_ACCESS_ERR,
> 	PF_RET_KERN_FIXUP,
>         PF_RET_HWPOISON,
>         PF_RET_SIGNAL,
> 	...
> };
> 
> As a start we may still want to return some more information (perhaps still
> the vm_fault_t alongside?  Or another union that will provide different
> information based on different PF_RET_*).  One major thing is I see how we
> handle VM_FAULT_HWPOISON and also the fact that we encode something more
> into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).
> 
> So the generic helper could, hopefully, hide the complexity of:
> 
>   - Taking and releasing of mmap lock
>   - find_vma(), and also relevant checks on access or stack handling

Umm...  arm is a bit special here:
                if (addr < FIRST_USER_ADDRESS)
			return VM_FAULT_BADMAP;
with no counterparts elsewhere.

>   - handle_mm_fault() itself (of course...)
>   - detect signals
>   - handle page fault retries (so, in the new layer of retval there should
>     have nothing telling it to retry; it should always be the ultimate result)

agreed.

    - unlock mmap; don't leave that to caller.

>   - parse different errors into "what the arch code should do", and
>     generalize the common ones, e.g.
>     - OOM, do pagefault_out_of_memory() for user-mode
>     - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
>     - ...

AFAICS, all errors in kernel mode => no_context.

> It'll simplify things if we can unify some small details like whether the
> -EFAULT above should contain a sigbus.
> 
> A trivial detail I found when I was looking at this is, x86_64 passes in
> different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
> there're three call sites and each of them pass over a differerent signal.
> IIUC that will only make a difference if there's a nested page fault during
> the vsyscall emulation (but I may be wrong too because I'm new to this
> code), and I have no idea when it'll happen and whether that needs to be
> strictly followed.

From my (very incomplete so far) dig through that pile:
	Q: do we still have the cases when handle_mm_fault() does
not return any of VM_FAULT_COMPLETED | VM_FAULT_RETRY | VM_FAULT_ERROR?
That gets treated as unlock + VM_FAULT_COMPLETED, but do we still need
that?
	Q: can VM_FAULT_RETRY be mixed with anything in VM_FAULT_ERROR?
What locking, if that happens?
	* details of storing the fault details (for ptrace, mostly)
vary a lot; no chance to unify, AFAICS.
	* requirements for vma flags also differ; e.g. read fault on
alpha is explicitly OK with absence of VM_READ if VM_WRITE is there.
Probably should go by way of arm and pass the mask that must
have non-empty intersection with vma->vm_flags?  Because *that*
is very likely to be a part of ABI - mmap(2) callers that rely
upon the flags being OK for given architecture are quite possible.
	* mmap lock is also quite variable in how it's taken;
x86 and arm have fun dance with trylock/search for exception handler/etc.
Other architectures do not; OTOH, there's a prefetch stuck in itanic
variant, with comment about mmap_sem being performance-critical...
	* logics for stack expansion includes this twist:
        if (!(vma->vm_flags & VM_GROWSDOWN))
                goto map_err;
        if (user_mode(regs)) {
                /* Accessing the stack below usp is always a bug.  The
                   "+ 256" is there due to some instructions doing
                   pre-decrement on the stack and that doesn't show up
                   until later.  */
                if (address + 256 < rdusp())
                        goto map_err;
        }
        if (expand_stack(vma, address))
                goto map_err;
That's m68k; ISTR similar considerations elsewhere, but I could be
wrong.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01 22:18             ` Al Viro
@ 2023-02-02  0:57               ` Al Viro
  2023-02-02 22:56               ` Peter Xu
  1 sibling, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-02-02  0:57 UTC (permalink / raw)
  To: Peter Xu
  Cc: Linus Torvalds, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Wed, Feb 01, 2023 at 10:18:11PM +0000, Al Viro wrote:
> 	* logics for stack expansion includes this twist:
>         if (!(vma->vm_flags & VM_GROWSDOWN))
>                 goto map_err;
>         if (user_mode(regs)) {
>                 /* Accessing the stack below usp is always a bug.  The
>                    "+ 256" is there due to some instructions doing
>                    pre-decrement on the stack and that doesn't show up
>                    until later.  */
>                 if (address + 256 < rdusp())
>                         goto map_err;
>         }
>         if (expand_stack(vma, address))
>                 goto map_err;
> That's m68k; ISTR similar considerations elsewhere, but I could be
> wrong.

Hell, yes - 
        if (!(vma->vm_flags & VM_GROWSDOWN))
                goto bad_area;
        if (!(fault_code & FAULT_CODE_WRITE)) {
                /* Non-faulting loads shouldn't expand stack. */
                insn = get_fault_insn(regs, insn);
                if ((insn & 0xc0800000) == 0xc0800000) {
                        unsigned char asi;

                        if (insn & 0x2000)
                                asi = (regs->tstate >> 24);
                        else
                                asi = (insn >> 5);
                        if ((asi & 0xf2) == 0x82)
                                goto bad_area;
                }
        }
        if (expand_stack(vma, address))
                goto bad_area;

Note that it's very much not a bug - it's a nonfaulting (== speculative)
load, and the place where we are heading from bad_area in this case is
this in do_kernel_fault():
        if (!(fault_code & (FAULT_CODE_WRITE|FAULT_CODE_ITLB)) &&
            (insn & 0xc0800000) == 0xc0800000) {
                if (insn & 0x2000)
                        asi = (regs->tstate >> 24);
                else  
                        asi = (insn >> 5);
                if ((asi & 0xf2) == 0x82) {
                        if (insn & 0x1000000) {
                                handle_ldf_stq(insn, regs);
                        } else {
                                /* This was a non-faulting load. Just clear the
                                 * destination register(s) and continue with the next
                                 * instruction. -jj
                                 */
                                handle_ld_nf(insn, regs);
                        }
                        return;

(the name is misguiding - it covers userland stuff as well; in this
particular case the triggering instruction is non-priveleged)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01 22:18             ` Al Viro
  2023-02-02  0:57               ` Al Viro
@ 2023-02-02 22:56               ` Peter Xu
  2023-02-04  0:26                 ` Al Viro
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Xu @ 2023-02-02 22:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Wed, Feb 01, 2023 at 10:18:11PM +0000, Al Viro wrote:
> On Wed, Feb 01, 2023 at 02:48:22PM -0500, Peter Xu wrote:
> 
> > I do also see a common pattern of the possibility to have a generic fault
> > handler like generic_page_fault().
> > 
> > It probably should start with taking the mmap_sem until providing some
> > retval that is much easier to digest further by the arch-dependent code, so
> > it can directly do something rather than parsing the bitmask in a
> > duplicated way (hence the new retval should hopefully not a bitmask anymore
> > but a "what to do").
> > 
> > Maybe it can be something like:
> > 
> > /**
> >  * enum page_fault_retval - Higher level fault retval, generalized from
> >  * vm_fault_reason above that is only used by hardware page fault handlers.
> >  * It generalizes the bitmask-versioned retval into something that the arch
> >  * dependent code should react upon.
> >  *
> >  * @PF_RET_COMPLETED:		The page fault is completed successfully
> >  * @PF_RET_BAD_AREA:		The page fault address falls in a bad area
> >  *				(e.g., vma not found, expand_stack() fails..)
> 
> FWIW, there's a fun discrepancy - VM_FAULT_SIGSEGV may yield SEGV_MAPERR
> or SEGV_ACCERR; depends upon the architecture.  Not that there'd been
> many places that return VM_FAULT_SIGSEGV these days...  Good thing, too,
> since otherwise e.g. csky would oops...
> 
> >  * @PF_RET_ACCESS_ERR:		The page fault has access errors
> >  *				(e.g., write fault on !VM_WRITE vmas)
> >  * @PF_RET_KERN_FIXUP:		The page fault requires kernel fixups
> >  *				(e.g., during copy_to_user() but fault failed?)
> >  * @PF_RET_HWPOISON:		The page fault encountered poisoned pages
> >  * @PF_RET_SIGNAL:		The page fault encountered poisoned pages
> 
> ??
> 
> >  * ...
> >  */
> > enum page_fault_retval {
> > 	PF_RET_DONE = 0,
> > 	PF_RET_BAD_AREA,
> > 	PF_RET_ACCESS_ERR,
> > 	PF_RET_KERN_FIXUP,
> >         PF_RET_HWPOISON,
> >         PF_RET_SIGNAL,
> > 	...
> > };
> > 
> > As a start we may still want to return some more information (perhaps still
> > the vm_fault_t alongside?  Or another union that will provide different
> > information based on different PF_RET_*).  One major thing is I see how we
> > handle VM_FAULT_HWPOISON and also the fact that we encode something more
> > into the bitmask on page sizes (VM_FAULT_HINDEX_MASK).
> > 
> > So the generic helper could, hopefully, hide the complexity of:
> > 
> >   - Taking and releasing of mmap lock
> >   - find_vma(), and also relevant checks on access or stack handling
> 
> Umm...  arm is a bit special here:
>                 if (addr < FIRST_USER_ADDRESS)
> 			return VM_FAULT_BADMAP;
> with no counterparts elsewhere.

For this specific case IIUC it's the same as bad_area.  VM_FAULT_BADMAP is
further handled later in do_page_fault() there for either arm/arm64.

This reminded me this, on how arm defines the private retvals, while the
generic ones grows and probably no one noticed they can collapse already..

#define VM_FAULT_BADMAP		0x010000
#define VM_FAULT_BADACCESS	0x020000

enum vm_fault_reason {
        ...
	VM_FAULT_HINDEX_MASK    = (__force vm_fault_t)0x0f0000,
};

VM_FAULT_HINDEX_MASK is only used by VM_FAULT_HWPOISON_LARGE, so I think
arm[64] could expect some surprise when it hit hugetlb hwpoison pages...
maybe I should prepare a patch for arm.

> 
> >   - handle_mm_fault() itself (of course...)
> >   - detect signals
> >   - handle page fault retries (so, in the new layer of retval there should
> >     have nothing telling it to retry; it should always be the ultimate result)
> 
> agreed.
> 
>     - unlock mmap; don't leave that to caller.
> 
> >   - parse different errors into "what the arch code should do", and
> >     generalize the common ones, e.g.
> >     - OOM, do pagefault_out_of_memory() for user-mode
> >     - VM_FAULT_SIGSEGV, which should be able to merge into PF_RET_BAD_AREA?
> >     - ...
> 
> AFAICS, all errors in kernel mode => no_context.
> 
> > It'll simplify things if we can unify some small details like whether the
> > -EFAULT above should contain a sigbus.
> > 
> > A trivial detail I found when I was looking at this is, x86_64 passes in
> > different signals to kernelmode_fixup_or_oops() - in do_user_addr_fault()
> > there're three call sites and each of them pass over a differerent signal.
> > IIUC that will only make a difference if there's a nested page fault during
> > the vsyscall emulation (but I may be wrong too because I'm new to this
> > code), and I have no idea when it'll happen and whether that needs to be
> > strictly followed.
> 
> From my (very incomplete so far) dig through that pile:
> 	Q: do we still have the cases when handle_mm_fault() does
> not return any of VM_FAULT_COMPLETED | VM_FAULT_RETRY | VM_FAULT_ERROR?
> That gets treated as unlock + VM_FAULT_COMPLETED, but do we still need
> that?
> 	Q: can VM_FAULT_RETRY be mixed with anything in VM_FAULT_ERROR?
> What locking, if that happens?

For this one, I don't think they can be mixed.  IMHO RETRY only binds with
a wait, so if we didn't wait and found issue, we return ERROR; if we
decided to wait, we will try nothing more besides return after wait with
the RETRY.  We should just never check any error at all if the wait
happened.  Otherwise there's a bug of potential deadlock.

I'll skip some details in this email either above or below; I agree
there're so many trivial details to take care of to not break a thing.

IMHO it'll be merely impossible to merge things across most (if not to say,
all) archs.  It will need to be start from one or at least a few that still
shares a major common base - I would still rely on x86 as a start - then we
try to use the helper in as much archs as possible.

Even on x86, I do also see challenges so I'm not sure whether a common
enough routine can be abstracted indeed.  But I believe there's a way to do
this because obviously we still see tons of duplicated logics falling
around.  It may definitely need time to think out where's the best spot to
start, and how to gradually move towards covering more archs starting from
one.

Thanks,

> 	* details of storing the fault details (for ptrace, mostly)
> vary a lot; no chance to unify, AFAICS.
> 	* requirements for vma flags also differ; e.g. read fault on
> alpha is explicitly OK with absence of VM_READ if VM_WRITE is there.
> Probably should go by way of arm and pass the mask that must
> have non-empty intersection with vma->vm_flags?  Because *that*
> is very likely to be a part of ABI - mmap(2) callers that rely
> upon the flags being OK for given architecture are quite possible.
> 	* mmap lock is also quite variable in how it's taken;
> x86 and arm have fun dance with trylock/search for exception handler/etc.
> Other architectures do not; OTOH, there's a prefetch stuck in itanic
> variant, with comment about mmap_sem being performance-critical...
> 	* logics for stack expansion includes this twist:
>         if (!(vma->vm_flags & VM_GROWSDOWN))
>                 goto map_err;
>         if (user_mode(regs)) {
>                 /* Accessing the stack below usp is always a bug.  The
>                    "+ 256" is there due to some instructions doing
>                    pre-decrement on the stack and that doesn't show up
>                    until later.  */
>                 if (address + 256 < rdusp())
>                         goto map_err;
>         }
>         if (expand_stack(vma, address))
>                 goto map_err;
> That's m68k; ISTR similar considerations elsewhere, but I could be
> wrong.
> 

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-02 22:56               ` Peter Xu
@ 2023-02-04  0:26                 ` Al Viro
  2023-02-05  5:10                   ` Al Viro
  0 siblings, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-02-04  0:26 UTC (permalink / raw)
  To: Peter Xu
  Cc: Linus Torvalds, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Thu, Feb 02, 2023 at 05:56:22PM -0500, Peter Xu wrote:

> IMHO it'll be merely impossible to merge things across most (if not to say,
> all) archs.  It will need to be start from one or at least a few that still
> shares a major common base - I would still rely on x86 as a start - then we
> try to use the helper in as much archs as possible.
> 
> Even on x86, I do also see challenges so I'm not sure whether a common
> enough routine can be abstracted indeed.  But I believe there's a way to do
> this because obviously we still see tons of duplicated logics falling
> around.  It may definitely need time to think out where's the best spot to
> start, and how to gradually move towards covering more archs starting from
> one.

FWIW, after going through everything from alpha to loongarch (in alphabetic
order, skipping the itanic) the following seems to be suitable for all of
them:

generic_fault(address, flags, vm_flags, regs)
{
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	vm_fault_t fault;

	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);

	if (unlikely(!mmap_read_trylock(mm))) {
		if (!(flags & FAULT_FLAG_USER) &&
		    !search_exception_tables(instruction_pointer(regs))) {
			/*
			 * Fault from code in kernel from
			 * which we do not expect faults.
			 */
			return KERN;
		}
retry:
		mmap_read_lock(mm);
	} else {
		might_sleep();
#ifdef CONFIG_DEBUG_VM
		if (!(flags & FAULT_FLAG_USER) &&
		    !search_exception_tables(instruction_pointer(regs)))
			return KERN;
#endif
	}
	vma = find_vma(mm, address);
	if (!vma)
		goto Eunmapped;
	if (unlikely(vma->vm_start > address)) {
		if (!(vma->vm_flags & VM_GROWSDOWN))
			goto Eunmapped;
		if (addr < FIRST_USER_ADDRESS)
			goto Eunmapped;
		if (expand_stack(vma, address))
			goto Eunmapped;
	}

	/* Ok, we have a good vm_area for this memory access, so
	   we can handle it.  */
	if (!(vma->vm_flags & vm_flags))
		goto Eaccess;

	/* If for any reason at all we couldn't handle the fault,
	   make sure we exit gracefully rather than endlessly redo
	   the fault.  */
	fault = handle_mm_fault(vma, address, flags, regs);

	if (unlikely(fault & VM_FAULT_RETRY)) {
		if (!(flags & FAULT_FLAG_USER)) {
			if (fatal_signal_pending(current))
				return KERN;
		} else {
			if (signal_pending(current))
				return FOAD;
		}
		flags |= FAULT_FLAG_TRIED;
		goto retry;
	}

	if (fault & VM_FAULT_COMPLETED)
		return DONE;

	mmap_read_unlock(mm);

	if (likely(!(fault & VM_FAULT_ERROR)))
		return DONE;

	if (!(flags & FAULT_FLAG_USER))
		return KERN;

	if (fault & VM_FAULT_OOM) {
		pagefault_out_of_memory();
		return FOAD;
	}

	if (fault & VM_FAULT_SIGSEGV)
		return SIGSEGV;

	if (fault & VM_FAULT_SIGBUS)
		return SIGBUS;

	if (fault & VM_FAULT_HWPOISON)
		return POISON + PAGE_SHIFT;	// POISON == 256

	if (fault & VM_FAULT_HWPOISON_LARGE)
		return POISON + hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault));

	BUG();

Eunmapped:
	mmap_read_unlock(mm);
	return flags & FAULT_FLAG_USER ? MAPERR : KERN;
Eaccess:
	mmap_read_unlock(mm);
	return flags & FAULT_FLAG_USER ? ACCERR : KERN;
}

possible return values (and that's obviously not the identifiers to be
used for real; for now I'm just looking for feasibility of it all):
	DONE		success, nothing else to be done
	FOAD		OOM/fatal signal with VM_FAULT_RETRY/
			signal with VM_FAULT_RETRY from userland - nothing
			to be done here.
	KERN		kernel mode failed fault, fixup or oops
	MAPERR		unmapped address, SIGSEGV/SEGV_MAPERR for you
	ACCERR		nothing in vm_flags present in ->vm_flags of vma;
			SIGSEGV/SEGV_ACCERR
	SIGSEGV		VM_FAULT_SIGSEGV; some architectures treat that
			as SEGV_MAPERR, some as SEGV_ACCERR.
	SIGBUS		VM_FAULT_SIGBUS; SIGBUS/BUS_ADRERR
	POISON + shift	VM_FAULT_HWPOISON and VM_FAULT_HWPOISON_LARGE, with
			log2(affected page size) encoded into return value.

This is obviously not even close to final helper, but... alpha, arc, arm, arm64,
csky, hexagon, loongarch convert to that cleanly.

Itanic very much does not (due to weird dual stacks, awful address space layout,
etc.), but then git rm arch/ia64 is long overdue.

Fairly typical look after conversion:

arc: 
{
	struct task_struct *tsk = current;
	struct mm_struct *mm = tsk->mm;
	unsigned int mask;
	unsigned int flags;
	unsigned int res;

	/*
	 * NOTE! We MUST NOT take any locks for this case. We may
	 * be in an interrupt or a critical region, and should
	 * only copy the information from the master page table,
	 * nothing more.
	 */
	if (address >= VMALLOC_START && !user_mode(regs)) {
		if (unlikely(handle_kernel_vaddr_fault(address)))
			goto no_context;
		else
			return;
	}

	/*
	 * If we're in an interrupt or have no user
	 * context, we must not take the fault..
	 */
	if (faulthandler_disabled() || !mm)
		goto no_context;

	flags = FAULT_FLAG_DEFAULT;
	if (user_mode(regs))
		flags |= FAULT_FLAG_USER;
	mask = VM_READ;
	if (regs->ecr_cause & ECR_C_PROTV_STORE) {	/* ST/EX */
		flags |= FAULT_FLAG_WRITE;
		mask = VM_WRITE;
	} else if ((regs->ecr_vec == ECR_V_PROTV) &&
	         (regs->ecr_cause == ECR_C_PROTV_INST_FETCH)) {
		mask = VM_EXEC;
	}

	res = generic_fault(address, flags, mask, regs);
	if (likely(res == DONE))
		return;
	if (res == FOAD)
		return;
	if (res == KERN) {
no_context:
		if (fixup_exception(regs))
			return;
		die("Oops", regs, address);
	}

	tsk->thread.fault_address = address;
	if (res == SIGBUS)
		force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *) address);
	else
		force_sig_fault(SIGSEGV, res == ACCERR ? SEGV_ACCERR : SEGV_MAPERR,
				(void __user *) address);
}

Or this arm64:

{
	const struct fault_info *inf;
	struct mm_struct *mm = current->mm;
	unsigned long vm_flags;
	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
	unsigned long addr = untagged_addr(far);
	unsigned int res;

	if (kprobe_page_fault(regs, esr))
		return 0;

	/*
	 * If we're in an interrupt or have no user context, we must not take
	 * the fault.
	 */
	if (faulthandler_disabled() || !mm)
		goto no_context;

	if (user_mode(regs))
		mm_flags |= FAULT_FLAG_USER;

	/*
	 * vm_flags tells us what bits we must have in vma->vm_flags
	 * for the fault to be benign, __do_page_fault() would check
	 * vma->vm_flags & vm_flags and returns an error if the
	 * intersection is empty
	 */
	if (is_el0_instruction_abort(esr)) {
		/* It was exec fault */
		vm_flags = VM_EXEC;
		mm_flags |= FAULT_FLAG_INSTRUCTION;
	} else if (is_write_abort(esr)) {
		/* It was write fault */
		vm_flags = VM_WRITE;
		mm_flags |= FAULT_FLAG_WRITE;
	} else {
		/* It was read fault */
		vm_flags = VM_READ;
		/* Write implies read */
		vm_flags |= VM_WRITE;
		/* If EPAN is absent then exec implies read */
		if (!cpus_have_const_cap(ARM64_HAS_EPAN))
			vm_flags |= VM_EXEC;
	}

	if (is_ttbr0_addr(addr) && is_el1_permission_fault(addr, esr, regs)) {
		if (is_el1_instruction_abort(esr))
			die_kernel_fault("execution of user memory",
					 addr, esr, regs);

		if (!search_exception_tables(regs->pc))
			die_kernel_fault("access to user memory outside uaccess routines",
					 addr, esr, regs);
	}

	res = generic_fault(addr, mm_flags, vm_flags, regs);
	if (likely(res == DONE))
		return 0;
	if (res == FOAD)
		return 0;
	if (res == KERN) {
no_context:
		__do_kernel_fault(addr, esr, regs);
		return 0;
	}
	inf = esr_to_fault_info(esr);
	set_thread_esr(addr, esr);
	if (res == SIGBUS) {
		/*
		 * We had some memory, but were unable to successfully fix up
		 * this page fault.
		 */
		arm64_force_sig_fault(SIGBUS, BUS_ADRERR, far, inf->name);
	} else if (res > POISON) {
		arm64_force_sig_mceerr(BUS_MCEERR_AR, far, res - POISON, inf->name);
	} else {
		/*
		 * Something tried to access memory that isn't in our memory
		 * map.
		 */
		arm64_force_sig_fault(SIGSEGV,
				      res == ACCERR ? SEGV_ACCERR : SEGV_MAPERR,
				      far, inf->name);
	}

	return 0;
}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-04  0:26                 ` Al Viro
@ 2023-02-05  5:10                   ` Al Viro
  0 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-02-05  5:10 UTC (permalink / raw)
  To: Peter Xu
  Cc: Linus Torvalds, linux-arch, linux-alpha, linux-ia64,
	linux-hexagon, linux-m68k, Michal Simek, Dinh Nguyen, openrisc,
	linux-parisc, linux-riscv, sparclinux

On Sat, Feb 04, 2023 at 12:26:39AM +0000, Al Viro wrote:
> On Thu, Feb 02, 2023 at 05:56:22PM -0500, Peter Xu wrote:
> 
> > IMHO it'll be merely impossible to merge things across most (if not to say,
> > all) archs.  It will need to be start from one or at least a few that still
> > shares a major common base - I would still rely on x86 as a start - then we
> > try to use the helper in as much archs as possible.
> > 
> > Even on x86, I do also see challenges so I'm not sure whether a common
> > enough routine can be abstracted indeed.  But I believe there's a way to do
> > this because obviously we still see tons of duplicated logics falling
> > around.  It may definitely need time to think out where's the best spot to
> > start, and how to gradually move towards covering more archs starting from
> > one.
> 
> FWIW, after going through everything from alpha to loongarch (in alphabetic
> order, skipping the itanic) the following seems to be suitable for all of
> them:
> 
> generic_fault(address, flags, vm_flags, regs)
[snip]

After looking through other architectures: that needs changes.

AFAICS, the right approach would be to add a pointer to (uninitialized)
kernel_siginfo.  And let it pass the signal number, etc. through that.
That way all "we want to raise a signal" return values fold together.
*IF* the page fault wants to do something extra on SIGSEGV, but not on
SIGBUS (we have several such), it can key that upon info.si_signo.

Speaking of SIGSEGV, there's a fun situation with VM_FAULT_SIGSEGV:

1) Only x86 and ppc generate VM_FAULT_SIGSEGV in handle_mm_fault()
without hitting WARN_ON_ONCE().  And neither actually returns it
to page fault handler - the conditions that would've led to that
return value (pkey stuff) are checked in the page fault handler
and handle_mm_fault() is not called in such cases.

2) on alpha, hexagon, itanic, loongarch, microblaze, mips, nios2,
openrisc, sparc, um and xtensa #PF handler would end up with SEGV_ACCERR
if it saw VM_FAULT_SIGNAL.

3) on arm, arm64, arc, m68k, powerpc, s390, sh and x86 - SEGV_MAPERR
(except that neither x86 nor powerpc #PF ever see it)

4) on csky and riscv it would end up with BUG()

5) on parisc it's complicated^Wbroken - it tries to decide which
one to use after having unlocked mmap and looking at vma in process.

As it is, VM_FAULT_SIGSEGV had ended up as "we need to report some
error to get_user_pages() and similar callers in cases when
page fault handler would've detected pkey problem and refused
to call handle_mm_fault()", with several places where it's
"we had been called with bogus arguments; printk and return
something to indicate the things had gone wrong".  It used to
have other uses, but this is what it had become now - grep and
you'll see.

AFAICS, all checks for it in page fault handlers are pretty much
dead code...

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
@ 2023-02-05  6:18   ` Finn Thain
  2023-02-05 18:51     ` Linus Torvalds
  2023-02-05 20:39     ` Al Viro
  2023-02-06 12:08   ` Geert Uytterhoeven
  1 sibling, 2 replies; 40+ messages in thread
From: Finn Thain @ 2023-02-05  6:18 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

Hello Al,

On Tue, 31 Jan 2023, Al Viro wrote:

> m68k equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY 
> handling" If e.g. get_user() triggers a page fault and a fatal signal is 
> caught, we might end up with handle_mm_fault() returning VM_FAULT_RETRY 
> and not doing anything to page tables.  In such case we must *not* 
> return to the faulting insn - that would repeat the entire thing without 
> making any progress; what we need instead is to treat that as failed 
> (user) memory access.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

That could be a bug I was chasing back in 2021 but never found. The mmap 
stressors in stress-ng were triggering a crash on a Mac Quadras, though 
only rarely. Sometimes it would run all day without a failure.

Last year when I started using GCC 12 to build the kernel, I saw the same 
workload fail again but the failure mode had become a silent hang/livelock 
instead of the oopses I got with GCC 6.

When I press the NMI button after the livelock I always see 
do_page_fault() in the backtrace. So I've been testing your patch. I've 
been running the same stress-ng reproducer for about 12 hours now with no 
failures which looks promising.

In case that stress-ng testing is of use:
Tested-by: Finn Thain <fthain@linux-m68k.org>

BTW, how did you identify that bug in do_page_fault()? If its the same bug 
I was chasing, it could be an old one. The stress-ng logs I collected last 
year include a crash from a v4.14 build.

> ---
>  arch/m68k/mm/fault.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
> index 4d2837eb3e2a..228128e45c67 100644
> --- a/arch/m68k/mm/fault.c
> +++ b/arch/m68k/mm/fault.c
> @@ -138,8 +138,11 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  	fault = handle_mm_fault(vma, address, flags, regs);
>  	pr_debug("handle_mm_fault returns %x\n", fault);
>  
> -	if (fault_signal_pending(fault, regs))
> +	if (fault_signal_pending(fault, regs)) {
> +		if (!user_mode(regs))
> +			goto no_context;
>  		return 0;
> +	}
>  
>  	/* The fault is fully completed (including releasing mmap lock) */
>  	if (fault & VM_FAULT_COMPLETED)
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-02-05  6:18   ` Finn Thain
@ 2023-02-05 18:51     ` Linus Torvalds
  2023-02-07  3:07       ` Finn Thain
  2023-02-05 20:39     ` Al Viro
  1 sibling, 1 reply; 40+ messages in thread
From: Linus Torvalds @ 2023-02-05 18:51 UTC (permalink / raw)
  To: Finn Thain
  Cc: Al Viro, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Sat, Feb 4, 2023 at 10:16 PM Finn Thain <fthain@linux-m68k.org> wrote:
>
> That could be a bug I was chasing back in 2021 but never found. The mmap
> stressors in stress-ng were triggering a crash on a Mac Quadras, though
> only rarely. Sometimes it would run all day without a failure.
>
> Last year when I started using GCC 12 to build the kernel, I saw the same
> workload fail again but the failure mode had become a silent hang/livelock
> instead of the oopses I got with GCC 6.
>
> When I press the NMI button after the livelock I always see
> do_page_fault() in the backtrace. So I've been testing your patch. I've
> been running the same stress-ng reproducer for about 12 hours now with no
> failures which looks promising.
>
> In case that stress-ng testing is of use:
> Tested-by: Finn Thain <fthain@linux-m68k.org>

Could you test the thing that Mark Rutland pointed to? He had an
actual test-case for this for the arm64 fixes some years ago.

See

   https://lore.kernel.org/all/Y9pD+TMP+%2FSyfeJm@FVFF77S0Q05N/

for his email with links to his old test-case?

                Linus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-02-05  6:18   ` Finn Thain
  2023-02-05 18:51     ` Linus Torvalds
@ 2023-02-05 20:39     ` Al Viro
  2023-02-05 20:41       ` Linus Torvalds
  1 sibling, 1 reply; 40+ messages in thread
From: Al Viro @ 2023-02-05 20:39 UTC (permalink / raw)
  To: Finn Thain
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Sun, Feb 05, 2023 at 05:18:08PM +1100, Finn Thain wrote:

> That could be a bug I was chasing back in 2021 but never found. The mmap 
> stressors in stress-ng were triggering a crash on a Mac Quadras, though 
> only rarely. Sometimes it would run all day without a failure.
> 
> Last year when I started using GCC 12 to build the kernel, I saw the same 
> workload fail again but the failure mode had become a silent hang/livelock 
> instead of the oopses I got with GCC 6.
> 
> When I press the NMI button after the livelock I always see 
> do_page_fault() in the backtrace. So I've been testing your patch. I've 
> been running the same stress-ng reproducer for about 12 hours now with no 
> failures which looks promising.
> 
> In case that stress-ng testing is of use:
> Tested-by: Finn Thain <fthain@linux-m68k.org>
> 
> BTW, how did you identify that bug in do_page_fault()? If its the same bug 
> I was chasing, it could be an old one. The stress-ng logs I collected last 
> year include a crash from a v4.14 build.

Went to reread the current state of mm/gup.c, decided to reread handle_mm_fault()
and its callers, noticed fault_signal_pending() which hadn't been there back
when I last crawled through that area, realized what it had replaced, went
to check if everything had been converted (arch/um got missed, BTW).  Noticed
the difference between the architectures (the first hit was on alpha, without
the "sod off to no_context if it's a user fault" logics, the last - xtensa, with
it).  Checked the log for xtensa, found the commit from 2021 adding that part;
looked on arm and arm64, found commits from 2017 doing the same thing, then,
on x86, Linus' commit from 2014 adding the x86 counterpart...  Figuring out
what all of those had been for wasn't particularly hard, and it was easy
to check which architectures still needed the same thing...

BTW, since these patches would be much easier to backport than any unification
work, I think the right thing to do would be to have further unification done on
top of them.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-02-05 20:39     ` Al Viro
@ 2023-02-05 20:41       ` Linus Torvalds
  0 siblings, 0 replies; 40+ messages in thread
From: Linus Torvalds @ 2023-02-05 20:41 UTC (permalink / raw)
  To: Al Viro
  Cc: Finn Thain, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, linux-parisc, linux-riscv,
	sparclinux

On Sun, Feb 5, 2023 at 12:39 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> BTW, since these patches would be much easier to backport than any unification
> work, I think the right thing to do would be to have further unification done on
> top of them.

Ack. I'm not NAKing the patches, I was just hoping that we also have
some way forward.

So "fix the issues, then unify" sounds like the right thing to do to me.

               Linus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC][PATCHSET] VM_FAULT_RETRY fixes
  2023-02-01 10:50 ` Mark Rutland
@ 2023-02-06 12:08   ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-02-06 12:08 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Al Viro, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux, Linus Torvalds

Hi Mark,

On Wed, Feb 1, 2023 at 11:52 AM Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Jan 31, 2023 at 08:02:51PM +0000, Al Viro wrote:
> > On x86 it had been noticed and fixed back in 2014, in 26178ec11ef3 "x86:
> > mm: consolidate VM_FAULT_RETRY handling".  Some of the other architectures
> > had it dealt with later - e.g. arm in 2017, the fix is 746a272e44141
> > "ARM: 8692/1: mm: abort uaccess retries upon fatal signal"; xtensa -
> > in 2021, the fix is 7b9acbb6aad4f "xtensa: fix uaccess-related livelock
> > in do_page_fault", etc.
> >
> > However, it never had been done on a bunch of architectures - the
> > current mainline still has that bug on alpha, hexagon, itanic, m68k,
> > microblaze, nios2, openrisc, parisc, riscv and sparc (both sparc32 and
> > sparc64).  Fixes are trivial, but I've no way to test them for most
> > of those architectures.
>
> FWIW, when I fixed arm and arm64 back in 2017, I did report the issue here with
> a test case (and again in 2021, with maintainers all explciitly Cc'd):
>
>   https://lore.kernel.org/lkml/20170822102527.GA14671@leverpostej/
>   https://lore.kernel.org/linux-arch/20210121123140.GD48431@C02TD0UTHF1T.local/
>
> ... so if anyone has access to those architectures, that test might be useful
> for verifying the fix.

Thanks a lot! This showed the problem on m68k, and confirmed Al's fix.

I'll give it a try on a few more systems later...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
  2023-02-05  6:18   ` Finn Thain
@ 2023-02-06 12:08   ` Geert Uytterhoeven
  1 sibling, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-02-06 12:08 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Tue, Jan 31, 2023 at 9:06 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> m68k equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/10] parisc: fix livelock in uaccess
  2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
@ 2023-02-06 16:58   ` Helge Deller
  2023-02-28 17:34     ` Al Viro
  2023-02-28 15:22   ` Guenter Roeck
  1 sibling, 1 reply; 40+ messages in thread
From: Helge Deller @ 2023-02-06 16:58 UTC (permalink / raw)
  To: Al Viro, linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

Hi Al,

On 1/31/23 21:06, Al Viro wrote:
> parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>   arch/parisc/mm/fault.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
> index 869204e97ec9..bb30ff6a3e19 100644
> --- a/arch/parisc/mm/fault.c
> +++ b/arch/parisc/mm/fault.c
> @@ -308,8 +308,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
>
>   	fault = handle_mm_fault(vma, address, flags, regs);
>
> -	if (fault_signal_pending(fault, regs))
> +	if (fault_signal_pending(fault, regs)) {
> +		if (!user_mode(regs))
> +			goto no_context;
>   		return;
> +	}

The testcase in
   https://lore.kernel.org/lkml/20170822102527.GA14671@leverpostej/
   https://lore.kernel.org/linux-arch/20210121123140.GD48431@C02TD0UTHF1T.local/
does hang with and without above patch on parisc.
It does not consume CPU in that state and can be killed with ^C.

Any idea?

Helge

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/10] riscv: fix livelock in uaccess
  2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
@ 2023-02-06 20:06   ` Björn Töpel
  2023-02-07 16:11   ` Geert Uytterhoeven
  1 sibling, 0 replies; 40+ messages in thread
From: Björn Töpel @ 2023-02-06 20:06 UTC (permalink / raw)
  To: Al Viro, linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds, Mark Rutland

Al Viro <viro@zeniv.linux.org.uk> writes:

> riscv equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Reproduced with Mark's userland program -- thanks!

Tested-by: Björn Töpel <bjorn@kernel.org>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/10] m68k: fix livelock in uaccess
  2023-02-05 18:51     ` Linus Torvalds
@ 2023-02-07  3:07       ` Finn Thain
  0 siblings, 0 replies; 40+ messages in thread
From: Finn Thain @ 2023-02-07  3:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, Michal Simek, Dinh Nguyen, openrisc, linux-parisc,
	linux-riscv, sparclinux

On Sun, 5 Feb 2023, Linus Torvalds wrote:

> On Sat, Feb 4, 2023 at 10:16 PM Finn Thain <fthain@linux-m68k.org> wrote:
> >
> > That could be a bug I was chasing back in 2021 but never found. The mmap
> > stressors in stress-ng were triggering a crash on a Mac Quadras, though
> > only rarely. Sometimes it would run all day without a failure.
> >
> > Last year when I started using GCC 12 to build the kernel, I saw the same
> > workload fail again but the failure mode had become a silent hang/livelock
> > instead of the oopses I got with GCC 6.
> >
> > When I press the NMI button after the livelock I always see
> > do_page_fault() in the backtrace. So I've been testing your patch. I've
> > been running the same stress-ng reproducer for about 12 hours now with no
> > failures which looks promising.
> >
> > In case that stress-ng testing is of use:
> > Tested-by: Finn Thain <fthain@linux-m68k.org>
> 
> Could you test the thing that Mark Rutland pointed to? He had an
> actual test-case for this for the arm64 fixes some years ago.
> 
> See
> 
>    https://lore.kernel.org/all/Y9pD+TMP+%2FSyfeJm@FVFF77S0Q05N/
> 
> for his email with links to his old test-case?
> 

With qemu-system-m68k v7.2 runing Linux v6.1, killing Mark's program did 
not trigger any failure.

With my Quadra 650 running the same binaries, killing Mark's program 
produces the BUG splat below. I can confirm that Al's patch fixes it.

[  199.060000] BUG: scheduling while atomic: test/48/0x0003e2e6
[  199.060000] Modules linked in:
[  199.060000] CPU: 0 PID: 48 Comm: test Tainted: G        W          6.1.0-mac #3
[  199.060000] Stack from 00a6204c:
[  199.060000]         00a6204c 004c1979 004c1979 00000000 00a6215e 00a6206c 00408d62 004c1979
[  199.060000]         00a62074 0003ad4a 00a620a8 004099e2 00983480 00000000 00a6215e 00000000
[  199.060000]         00000002 00983480 00a62000 0040970a 00a6209c 00a6209c 00a620c0 00a620c0
[  199.060000]         00409b94 00000000 00bcf0c0 00a62198 00a357fc 00a6216c 00120e5c 00bcf0d0
[  199.060000]         00000003 00000001 00000001 0123f039 00000000 00a357fc 00aead38 005870dc
[  199.060000]         005870dc 00a31200 00000000 00000000 00000000 00010000 0000c000 00000000
[  199.060000] Call Trace: [<00408d62>] dump_stack+0x10/0x16
[  199.060000]  [<0003ad4a>] __schedule_bug+0x5e/0x70
[  199.060000]  [<004099e2>] __schedule+0x2d8/0x446
[  199.060000]  [<0040970a>] __schedule+0x0/0x446
[  199.060000]  [<00409b94>] schedule+0x44/0x8e
[  199.060000]  [<00120e5c>] handle_userfault+0x298/0x3de
[  199.060000]  [<00010000>] zer_rp2+0x14/0x18
[  199.060000]  [<0000c000>] cu_dmrs+0x0/0x16
[  199.060000]  [<00001200>] kernel_pg_dir+0x200/0x1000
[  199.060000]  [<00010000>] zer_rp2+0x14/0x18
[  199.060000]  [<0000c000>] cu_dmrs+0x0/0x16
[  199.060000]  [<00001200>] kernel_pg_dir+0x200/0x1000
[  199.060000]  [<00010000>] zer_rp2+0x14/0x18
[  199.060000]  [<0000c000>] cu_dmrs+0x0/0x16
[  199.060000]  [<000ab0a0>] handle_mm_fault+0xa34/0xa56
[  199.060000]  [<000c0000>] __alloc_pages_bulk+0x26/0x3f8
[  199.060000]  [<00007056>] do_page_fault+0xd8/0x28a
[  199.060000]  [<00006098>] buserr_c+0x1a6/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<000056fc>] do_040writeback1+0x0/0x1d8
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<000062cc>] buserr_c+0x3da/0x6b0
[  199.060000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[  199.060000]  [<00002ac0>] buserr+0x20/0x28
[  199.060000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[  199.060000]  [<00008001>] mac_irq_disable+0x3b/0x98
[  199.060000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[  199.060000]  [<00008001>] mac_irq_disable+0x3b/0x98
[  199.060000] 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/10] riscv: fix livelock in uaccess
  2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
  2023-02-06 20:06   ` Björn Töpel
@ 2023-02-07 16:11   ` Geert Uytterhoeven
  1 sibling, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2023-02-07 16:11 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Tue, Jan 31, 2023 at 9:07 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> riscv equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH 02/10] hexagon: fix livelock in uaccess
  2023-01-31 20:03 ` [PATCH 02/10] hexagon: " Al Viro
@ 2023-02-10  2:59   ` Brian Cain
  0 siblings, 0 replies; 40+ messages in thread
From: Brian Cain @ 2023-02-10  2:59 UTC (permalink / raw)
  To: Al Viro, linux-arch
  Cc: linux-alpha, linux-ia64, linux-hexagon, linux-m68k, Michal Simek,
	Dinh Nguyen, openrisc, linux-parisc, linux-riscv, sparclinux,
	Linus Torvalds

> -----Original Message-----
> From: Al Viro <viro@ftp.linux.org.uk> On Behalf Of Al Viro
> Sent: Tuesday, January 31, 2023 2:04 PM
> To: linux-arch@vger.kernel.org
> Cc: linux-alpha@vger.kernel.org; linux-ia64@vger.kernel.org; linux-
> hexagon@vger.kernel.org; linux-m68k@lists.linux-m68k.org; Michal Simek
> <monstr@monstr.eu>; Dinh Nguyen <dinguyen@kernel.org>;
> openrisc@lists.librecores.org; linux-parisc@vger.kernel.org; linux-
> riscv@lists.infradead.org; sparclinux@vger.kernel.org; Linus Torvalds
> <torvalds@linux-foundation.org>
> Subject: [PATCH 02/10] hexagon: fix livelock in uaccess
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary of
> any links or attachments, and do not enable macros.
> 
> hexagon equivalent of 26178ec11ef3 "x86: mm: consolidate
> VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing
> anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we
> need
> instead is to treat that as failed (user) memory access.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  arch/hexagon/mm/vm_fault.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
> index f73c7cbfe326..4b578d02fd01 100644
> --- a/arch/hexagon/mm/vm_fault.c
> +++ b/arch/hexagon/mm/vm_fault.c
> @@ -93,8 +93,11 @@ void do_page_fault(unsigned long address, long cause,
> struct pt_regs *regs)
> 
>         fault = handle_mm_fault(vma, address, flags, regs);
> 
> -       if (fault_signal_pending(fault, regs))
> +       if (fault_signal_pending(fault, regs)) {
> +               if (!user_mode(regs))
> +                       goto no_context;
>                 return;
> +       }
> 
>         /* The fault is fully completed (including releasing mmap lock) */
>         if (fault & VM_FAULT_COMPLETED)
> --
> 2.30.2

Acked-by: Brian Cain <bcain@quicinc.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/10] parisc: fix livelock in uaccess
  2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
  2023-02-06 16:58   ` Helge Deller
@ 2023-02-28 15:22   ` Guenter Roeck
  2023-02-28 19:18     ` Michael Schmitz
  1 sibling, 1 reply; 40+ messages in thread
From: Guenter Roeck @ 2023-02-28 15:22 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Tue, Jan 31, 2023 at 08:06:27PM +0000, Al Viro wrote:
> parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  arch/parisc/mm/fault.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
> index 869204e97ec9..bb30ff6a3e19 100644
> --- a/arch/parisc/mm/fault.c
> +++ b/arch/parisc/mm/fault.c
> @@ -308,8 +308,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
>  
>  	fault = handle_mm_fault(vma, address, flags, regs);
>  
> -	if (fault_signal_pending(fault, regs))
> +	if (fault_signal_pending(fault, regs)) {
> +		if (!user_mode(regs))
> +			goto no_context;

0-day rightfully complains that this leaves 'msg' uninitialized.

arch/parisc/mm/fault.c:427 do_page_fault() error: uninitialized symbol 'msg'

Guenter

>  		return;
> +	}
>  
>  	/* The fault is fully completed (including releasing mmap lock) */
>  	if (fault & VM_FAULT_COMPLETED)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/10] parisc: fix livelock in uaccess
  2023-02-06 16:58   ` Helge Deller
@ 2023-02-28 17:34     ` Al Viro
  0 siblings, 0 replies; 40+ messages in thread
From: Al Viro @ 2023-02-28 17:34 UTC (permalink / raw)
  To: Helge Deller
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

On Mon, Feb 06, 2023 at 05:58:02PM +0100, Helge Deller wrote:
> Hi Al,
> 
> On 1/31/23 21:06, Al Viro wrote:
> > parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> > If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> > end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> > to page tables.  In such case we must *not* return to the faulting insn -
> > that would repeat the entire thing without making any progress; what we need
> > instead is to treat that as failed (user) memory access.
> > 
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > ---
> >   arch/parisc/mm/fault.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
> > index 869204e97ec9..bb30ff6a3e19 100644
> > --- a/arch/parisc/mm/fault.c
> > +++ b/arch/parisc/mm/fault.c
> > @@ -308,8 +308,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
> > 
> >   	fault = handle_mm_fault(vma, address, flags, regs);
> > 
> > -	if (fault_signal_pending(fault, regs))
> > +	if (fault_signal_pending(fault, regs)) {
> > +		if (!user_mode(regs))
> > +			goto no_context;
> >   		return;
> > +	}
> 
> The testcase in
>   https://lore.kernel.org/lkml/20170822102527.GA14671@leverpostej/
>   https://lore.kernel.org/linux-arch/20210121123140.GD48431@C02TD0UTHF1T.local/
> does hang with and without above patch on parisc.
> It does not consume CPU in that state and can be killed with ^C.
> 
> Any idea?

	Still trying to resurrect the parisc box to test on it...
FWIW, right now I've locally confirmed that mainline has the bug
in question and that patch fixes it for alpha, sparc32 and sparc64;
hexagon, m68k and riscv got acks from other folks; microblaze,
nios2 and openrisc I can't test at all (no hardware, no qemu setup);
same for parisc64.  Itanic and parisc32 I might be able to test,
if I manage to resurrect the hardware.

	Just to confirm: your "can be killed with ^C" had been on the
mainline parisc kernel (with userfaultfd enable, of course, or it wouldn't
hang up at all), right?  Was it 32bit or 64bit kernel?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/10] parisc: fix livelock in uaccess
  2023-02-28 15:22   ` Guenter Roeck
@ 2023-02-28 19:18     ` Michael Schmitz
  0 siblings, 0 replies; 40+ messages in thread
From: Michael Schmitz @ 2023-02-28 19:18 UTC (permalink / raw)
  To: Guenter Roeck, Al Viro
  Cc: linux-arch, linux-alpha, linux-ia64, linux-hexagon, linux-m68k,
	Michal Simek, Dinh Nguyen, openrisc, linux-parisc, linux-riscv,
	sparclinux, Linus Torvalds

Guenter,

On 1/03/23 04:22, Guenter Roeck wrote:
> On Tue, Jan 31, 2023 at 08:06:27PM +0000, Al Viro wrote:
>> parisc equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
>> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
>> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
>> to page tables.  In such case we must *not* return to the faulting insn -
>> that would repeat the entire thing without making any progress; what we need
>> instead is to treat that as failed (user) memory access.
>>
>> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
>> ---
>>   arch/parisc/mm/fault.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
>> index 869204e97ec9..bb30ff6a3e19 100644
>> --- a/arch/parisc/mm/fault.c
>> +++ b/arch/parisc/mm/fault.c
>> @@ -308,8 +308,11 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
>>   
>>   	fault = handle_mm_fault(vma, address, flags, regs);
>>   
>> -	if (fault_signal_pending(fault, regs))
>> +	if (fault_signal_pending(fault, regs)) {
>> +		if (!user_mode(regs))
>> +			goto no_context;
> 0-day rightfully complains that this leaves 'msg' uninitialized.
>
> arch/parisc/mm/fault.c:427 do_page_fault() error: uninitialized symbol 'msg'
>
> Guenter

What happens if you initialize msg to "Page fault: no context" right at 
the start of do_page_fault (and drop the assignment a few lines down as 
that's now redundant)?

(Wondering if the zero page access on parisc could cause a trip right 
back into do_page_fault, ad infinitum...)

Cheers,

     Michael


>>   		return;
>> +	}
>>   
>>   	/* The fault is fully completed (including releasing mmap lock) */
>>   	if (fault & VM_FAULT_COMPLETED)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/10] alpha: fix livelock in uaccess
  2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
@ 2023-03-07  0:48   ` patchwork-bot+linux-riscv
  0 siblings, 0 replies; 40+ messages in thread
From: patchwork-bot+linux-riscv @ 2023-03-07  0:48 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-riscv, linux-arch, linux-alpha, linux-ia64, linux-hexagon,
	linux-m68k, monstr, dinguyen, openrisc, linux-parisc, sparclinux,
	torvalds

Hello:

This series was applied to riscv/linux.git (fixes)
by Al Viro <viro@zeniv.linux.org.uk>:

On Tue, 31 Jan 2023 20:03:28 +0000 you wrote:
> alpha equivalent of 26178ec11ef3 "x86: mm: consolidate VM_FAULT_RETRY handling"
> If e.g. get_user() triggers a page fault and a fatal signal is caught, we might
> end up with handle_mm_fault() returning VM_FAULT_RETRY and not doing anything
> to page tables.  In such case we must *not* return to the faulting insn -
> that would repeat the entire thing without making any progress; what we need
> instead is to treat that as failed (user) memory access.
> 
> [...]

Here is the summary with links:
  - [01/10] alpha: fix livelock in uaccess
    (no matching commit)
  - [02/10] hexagon: fix livelock in uaccess
    (no matching commit)
  - [03/10] ia64: fix livelock in uaccess
    (no matching commit)
  - [04/10] m68k: fix livelock in uaccess
    (no matching commit)
  - [05/10] microblaze: fix livelock in uaccess
    (no matching commit)
  - [06/10] nios2: fix livelock in uaccess
    (no matching commit)
  - [07/10] openrisc: fix livelock in uaccess
    (no matching commit)
  - [08/10] parisc: fix livelock in uaccess
    (no matching commit)
  - [09/10] riscv: fix livelock in uaccess
    https://git.kernel.org/riscv/c/d835eb3a57de
  - [10/10] sparc: fix livelock in uaccess
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2023-03-07  0:49 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-31 20:02 [RFC][PATCHSET] VM_FAULT_RETRY fixes Al Viro
2023-01-31 20:03 ` [PATCH 01/10] alpha: fix livelock in uaccess Al Viro
2023-03-07  0:48   ` patchwork-bot+linux-riscv
2023-01-31 20:03 ` [PATCH 02/10] hexagon: " Al Viro
2023-02-10  2:59   ` Brian Cain
2023-01-31 20:04 ` [PATCH 03/10] ia64: " Al Viro
2023-01-31 20:04 ` [PATCH 04/10] m68k: " Al Viro
2023-02-05  6:18   ` Finn Thain
2023-02-05 18:51     ` Linus Torvalds
2023-02-07  3:07       ` Finn Thain
2023-02-05 20:39     ` Al Viro
2023-02-05 20:41       ` Linus Torvalds
2023-02-06 12:08   ` Geert Uytterhoeven
2023-01-31 20:05 ` [PATCH 05/10] microblaze: " Al Viro
2023-01-31 20:05 ` [PATCH 06/10] nios2: " Al Viro
2023-01-31 20:06 ` [PATCH 07/10] openrisc: " Al Viro
2023-01-31 20:06 ` [PATCH 08/10] parisc: " Al Viro
2023-02-06 16:58   ` Helge Deller
2023-02-28 17:34     ` Al Viro
2023-02-28 15:22   ` Guenter Roeck
2023-02-28 19:18     ` Michael Schmitz
2023-01-31 20:06 ` [PATCH 09/10] riscv: " Al Viro
2023-02-06 20:06   ` Björn Töpel
2023-02-07 16:11   ` Geert Uytterhoeven
2023-01-31 20:07 ` [PATCH 10/10] sparc: " Al Viro
2023-01-31 20:24 ` [RFC][PATCHSET] VM_FAULT_RETRY fixes Linus Torvalds
2023-01-31 21:10   ` Al Viro
2023-01-31 21:19     ` Linus Torvalds
2023-01-31 21:49       ` Al Viro
2023-02-01  0:00         ` Linus Torvalds
2023-02-01 19:48           ` Peter Xu
2023-02-01 22:18             ` Al Viro
2023-02-02  0:57               ` Al Viro
2023-02-02 22:56               ` Peter Xu
2023-02-04  0:26                 ` Al Viro
2023-02-05  5:10                   ` Al Viro
2023-02-01  8:21       ` Helge Deller
2023-02-01 19:51         ` Linus Torvalds
2023-02-01 10:50 ` Mark Rutland
2023-02-06 12:08   ` Geert Uytterhoeven

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).