linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier
@ 2021-05-04 13:42 Michael Ellerman
  2021-05-04 13:42 ` [PATCH 2/2] powerpc/64s: Fix crashes when toggling entry flush barrier Michael Ellerman
  2021-05-04 22:44 ` [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier Nathan Lynch
  0 siblings, 2 replies; 5+ messages in thread
From: Michael Ellerman @ 2021-05-04 13:42 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: nathanl, anton, npiggin, dja

The STF (store-to-load forwarding) barrier mitigation can be
enabled/disabled at runtime via a debugfs file (stf_barrier), which
causes the kernel to patch itself to enable/disable the relevant
mitigations.

However, depending on which mitigation we're using, it may not be safe to
do that patching while other CPUs are active. For example, the following
crash:

  User access of kernel address (c00000003fff5af0) - exploit attempt? (uid: 0)
  segfault (11) at c00000003fff5af0 nip 7fff8ad12198 lr 7fff8ad121f8 code 1
  code: 40820128 e93c00d0 e9290058 7c292840 40810058 38600000 4bfd9a81 e8410018
  code: 2c030006 41810154 3860ffb6 e9210098 <e94d8ff0> 7d295279 39400000 40820a3c

shows that we returned to userspace without restoring the user r13
value, due to executing the partially patched STF exit code.

Fix it by doing the patching under stop machine. The CPUs that aren't
doing the patching will be spinning in the core of the stop machine
logic. That is currently sufficient for our purposes, because none of
the patching we do is to that code or anywhere in the vicinity.

Fixes: a048a07d7f45 ("powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/lib/feature-fixups.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 1fd31b4b0e13..8f8c8c98a6ac 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -14,6 +14,7 @@
 #include <linux/string.h>
 #include <linux/init.h>
 #include <linux/sched/mm.h>
+#include <linux/stop_machine.h>
 #include <asm/cputable.h>
 #include <asm/code-patching.h>
 #include <asm/page.h>
@@ -227,11 +228,25 @@ static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
 		                                           : "unknown");
 }
 
-
-void do_stf_barrier_fixups(enum stf_barrier_type types)
+static int __do_stf_barrier_fixups(void *data)
 {
+	enum stf_barrier_type types = (enum stf_barrier_type)data;
+
 	do_stf_entry_barrier_fixups(types);
 	do_stf_exit_barrier_fixups(types);
+
+	return 0;
+}
+
+void do_stf_barrier_fixups(enum stf_barrier_type types)
+{
+	/*
+	 * The call to the fallback entry flush, and the fallback/sync-ori exit
+	 * flush can not be safely patched in/out while other CPUs are executing
+	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
+	 * spin in the stop machine core with interrupts hard disabled.
+	 */
+	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);
 }
 
 void do_uaccess_flush_fixups(enum l1d_flush_type types)
-- 
2.25.1
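
The failure mode described in the commit message above can be sketched
with a toy userspace model. This is only an illustration under stated
assumptions: the "instructions" (TOY_SAVE_R13, TOY_RESTORE_R13, TOY_NOP),
the toy_* names and the TOY_KERNEL_R13 value are all hypothetical and are
not real powerpc opcodes or kernel symbols. The model has one CPU walk a
two-slot sequence while the sequence is rewritten at a chosen point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "instructions"; hypothetical, not real powerpc opcodes. */
enum toy_insn { TOY_NOP, TOY_SAVE_R13, TOY_RESTORE_R13 };

/* Hypothetical kernel-side value that the sequence loads into r13. */
#define TOY_KERNEL_R13 UINT64_C(0xc000000000001000)

struct toy_cpu {
	uint64_t r13;		/* live register */
	uint64_t saved_r13;	/* scratch slot used by the sequence */
};

/* Execute a single toy instruction on the given CPU state. */
static void toy_exec(struct toy_cpu *cpu, enum toy_insn insn)
{
	switch (insn) {
	case TOY_SAVE_R13:
		cpu->saved_r13 = cpu->r13;
		cpu->r13 = TOY_KERNEL_R13;	/* r13 now holds a kernel value */
		break;
	case TOY_RESTORE_R13:
		cpu->r13 = cpu->saved_r13;
		break;
	case TOY_NOP:
		break;
	}
}

/*
 * Run the n-slot sequence in code[] on a CPU whose r13 starts as user_r13.
 * Just before slot patch_before is executed, the whole sequence is
 * overwritten with new_code[]. patch_before == n means the rewrite happens
 * only after the CPU has left the sequence. Returns the final r13.
 */
static uint64_t toy_run(uint64_t user_r13, enum toy_insn *code,
			const enum toy_insn *new_code, size_t n,
			size_t patch_before)
{
	struct toy_cpu cpu = { .r13 = user_r13, .saved_r13 = 0 };
	size_t pc;

	assert(patch_before <= n);

	for (pc = 0; pc <= n; pc++) {
		if (pc == patch_before)
			memcpy(code, new_code, n * sizeof(*code));
		if (pc < n)
			toy_exec(&cpu, code[pc]);
	}

	return cpu.r13;
}
```

In this model, rewriting { TOY_SAVE_R13, TOY_RESTORE_R13 } to two
TOY_NOPs with patch_before == 1 leaves r13 holding TOY_KERNEL_R13 on
exit, loosely mirroring the crash above. With patch_before == 0 or
patch_before == 2, which is roughly what stop machine guarantees by
parking the other CPUs outside the patched code, the user r13 survives.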



* [PATCH 2/2] powerpc/64s: Fix crashes when toggling entry flush barrier
  2021-05-04 13:42 [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier Michael Ellerman
@ 2021-05-04 13:42 ` Michael Ellerman
  2021-05-04 22:44 ` [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier Nathan Lynch
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2021-05-04 13:42 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: nathanl, anton, npiggin, dja

The entry flush mitigation can be enabled/disabled at runtime via a
debugfs file (entry_flush), which causes the kernel to patch itself to
enable/disable the relevant mitigations.

However, depending on which mitigation we're using, it may not be safe to
do that patching while other CPUs are active. For example, the following
crash:

  sleeper[15639]: segfault (11) at c000000000004c20 nip c000000000004c20 lr c000000000004c20

shows that we returned to userspace with a corrupted LR that points into
the kernel, due to executing the partially patched call to the fallback
entry flush (i.e. we missed the LR restore).

Fix it by doing the patching under stop machine. The CPUs that aren't
doing the patching will be spinning in the core of the stop machine
logic. That is currently sufficient for our purposes, because none of
the patching we do is to that code or anywhere in the vicinity.

Fixes: f79643787e0a ("powerpc/64s: flush L1D on kernel entry")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/lib/feature-fixups.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 8f8c8c98a6ac..679833564e19 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -299,8 +299,9 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
 						: "unknown");
 }
 
-void do_entry_flush_fixups(enum l1d_flush_type types)
+static int __do_entry_flush_fixups(void *data)
 {
+	enum l1d_flush_type types = (enum l1d_flush_type)data;
 	unsigned int instrs[3], *dest;
 	long *start, *end;
 	int i;
@@ -369,6 +370,19 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
 							: "ori type" :
 		(types &  L1D_FLUSH_MTTRIG)     ? "mttrig type"
 						: "unknown");
+
+	return 0;
+}
+
+void do_entry_flush_fixups(enum l1d_flush_type types)
+{
+	/*
+	 * The call to the fallback flush can not be safely patched in/out while
+	 * other CPUs are executing it. So call __do_entry_flush_fixups() on one
+	 * CPU while all other CPUs spin in the stop machine core with interrupts
+	 * hard disabled.
+	 */
+	stop_machine_cpuslocked(__do_entry_flush_fixups, (void *)types, NULL);
 }
 
 void do_rfi_flush_fixups(enum l1d_flush_type types)
-- 
2.25.1



* Re: [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier
  2021-05-04 13:42 [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier Michael Ellerman
  2021-05-04 13:42 ` [PATCH 2/2] powerpc/64s: Fix crashes when toggling entry flush barrier Michael Ellerman
@ 2021-05-04 22:44 ` Nathan Lynch
  2021-05-05  2:48   ` Michael Ellerman
  1 sibling, 1 reply; 5+ messages in thread
From: Nathan Lynch @ 2021-05-04 22:44 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, anton, npiggin, dja

Michael Ellerman <mpe@ellerman.id.au> writes:
> -void do_stf_barrier_fixups(enum stf_barrier_type types)
> +static int __do_stf_barrier_fixups(void *data)
>  {
> +	enum stf_barrier_type types = (enum stf_barrier_type)data;
> +
>  	do_stf_entry_barrier_fixups(types);
>  	do_stf_exit_barrier_fixups(types);
> +
> +	return 0;
> +}
> +
> +void do_stf_barrier_fixups(enum stf_barrier_type types)
> +{
> +	/*
> +	 * The call to the fallback entry flush, and the fallback/sync-ori exit
> +	 * flush can not be safely patched in/out while other CPUs are executing
> +	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
> +	 * spin in the stop machine core with interrupts hard disabled.
> +	 */
> +	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);

Would it be preferable to avoid the explicit casts:

	stop_machine_cpuslocked(__do_stf_barrier_fixups, &types, NULL);

...

static int __do_stf_barrier_fixups(void *data)
{
	enum stf_barrier_type *types = data;

 	do_stf_entry_barrier_fixups(*types);
 	do_stf_exit_barrier_fixups(*types);

?

post_mobility_fixup() does cpus_read_unlock() before calling
pseries_setup_security_mitigations(), I think that will need to be
changed?


* Re: [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier
  2021-05-04 22:44 ` [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier Nathan Lynch
@ 2021-05-05  2:48   ` Michael Ellerman
  2021-05-05  2:55     ` Nathan Lynch
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2021-05-05  2:48 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: linuxppc-dev, anton, npiggin, dja

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> -void do_stf_barrier_fixups(enum stf_barrier_type types)
>> +static int __do_stf_barrier_fixups(void *data)
>>  {
>> +	enum stf_barrier_type types = (enum stf_barrier_type)data;
>> +
>>  	do_stf_entry_barrier_fixups(types);
>>  	do_stf_exit_barrier_fixups(types);
>> +
>> +	return 0;
>> +}
>> +
>> +void do_stf_barrier_fixups(enum stf_barrier_type types)
>> +{
>> +	/*
>> +	 * The call to the fallback entry flush, and the fallback/sync-ori exit
>> +	 * flush can not be safely patched in/out while other CPUs are executing
>> +	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
>> +	 * spin in the stop machine core with interrupts hard disabled.
>> +	 */
>> +	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);
>
> Would it be preferable to avoid the explicit casts:
>
> 	stop_machine_cpuslocked(__do_stf_barrier_fixups, &types, NULL);
>
> ...
>
> static int __do_stf_barrier_fixups(void *data)
> {
> 	enum stf_barrier_type *types = data;
>
>  	do_stf_entry_barrier_fixups(*types);
>  	do_stf_exit_barrier_fixups(*types);
>
> ?

Yes.

That will also avoid the pesky issue of undefined behaviour :facepalm:

> post_mobility_fixup() does cpus_read_unlock() before calling
> pseries_setup_security_mitigations(), I think that will need to be
> changed?

I don't think so.

I'm using stop_machine_cpuslocked() but that's because I'm a goose and
forgot to switch to stop_machine() after I reworked the code to not take
cpus_read_lock() by hand. I really shouldn't send patches after 11pm.

I don't think it's important to keep the cpus lock held from where we
take it in post_mobility_fixup(). If some CPUs come or go between there
and here that's fine.

I'll send a v2.

cheers
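
The pointer-passing form agreed on in this exchange can be sketched in
userspace as follows. This is a minimal sketch, not the v2 patch:
toy_stop_machine() is a hypothetical stand-in that simply calls the
callback, and enum toy_barrier_type with its values is an assumed
placeholder for the kernel's enum stf_barrier_type. It shows only the
shape of the data flow: the caller passes the address of its enum, and
the callback converts the void * back to a typed pointer without a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder for the kernel's enum stf_barrier_type. */
enum toy_barrier_type {
	TOY_BARRIER_NONE	= 0x1,
	TOY_BARRIER_FALLBACK	= 0x2,
	TOY_BARRIER_EIEIO	= 0x4,
	TOY_BARRIER_SYNC_ORI	= 0x8,
};

/* Records what the callback observed, so the result can be inspected. */
static enum toy_barrier_type toy_last_applied;
static int toy_calls;

/*
 * Userspace stand-in for stop_machine(): it only invokes fn(data) and
 * returns its result. The property relied on here is that the call is
 * synchronous, so data may point at the caller's stack.
 */
static int toy_stop_machine(int (*fn)(void *), void *data)
{
	assert(fn != NULL);
	return fn(data);
}

/* Callback: void * converts implicitly to a typed pointer, no cast needed. */
static int toy_do_fixups(void *data)
{
	const enum toy_barrier_type *types = data;

	toy_last_applied = *types;
	toy_calls++;

	return 0;
}

/* Caller: pass the address of the parameter instead of casting its value. */
static int toy_apply(enum toy_barrier_type types)
{
	/* &types stays valid because toy_stop_machine() returns before we do. */
	return toy_stop_machine(toy_do_fixups, &types);
}
```

Compared with the (void *)types form in the posted patch, no integer
value is ever reinterpreted as a pointer; the cost is that the callback
must not outlive the caller's stack frame, which holds here because the
call returns only after the callback has run.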


* Re: [PATCH 1/2] powerpc/64s: Fix crashes when toggling stf barrier
  2021-05-05  2:48   ` Michael Ellerman
@ 2021-05-05  2:55     ` Nathan Lynch
  0 siblings, 0 replies; 5+ messages in thread
From: Nathan Lynch @ 2021-05-05  2:55 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, anton, npiggin, dja

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> post_mobility_fixup() does cpus_read_unlock() before calling
>> pseries_setup_security_mitigations(), I think that will need to be
>> changed?
>
> I don't think so.
>
> I'm using stop_machine_cpuslocked() but that's because I'm a goose and
> forgot to switch to stop_machine() after I reworked the code to not take
> cpus_read_lock() by hand. I really shouldn't send patches after 11pm.
>
> I don't think it's important to keep the cpus lock held from where we
> take it in post_mobility_fixup(). If some CPUs come or go between there
> and here that's fine.

Yes, agreed.

