* [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim
@ 2017-06-30  0:44 Gustavo Romero
  2017-06-30  0:44 ` [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception Gustavo Romero
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Gustavo Romero @ 2017-06-30  0:44 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leitao, Gustavo Romero

Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.

Later, we recheckpoint trusting that the live state of FP and VEC are ok
depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
means the FP registers checkpointed before entering the kernel are correct, so
after a treclaim we can trust the FP live state, and similarly for the VEC regs.
However, if tm_reclaim() does not return a sane state then tm_recheckpoint()
will recheckpoint a corrupted state back to the checkpoint area.
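
To make the assumed sequence explicit, here is an informal sketch of the
flow described above (not the literal code):

  treclaim.          live FPRs/VRs <- checkpointed (pre-transaction) values
  tm_reclaim (asm)   ckfp_state/ckvr_state <- live FPRs/VRs
                     fr0 <- FPSCR, v0 <- VSCR   (live fr0/v0 now corrupted)
  trechkpt.          checkpointed state <- live FPRs/VRs

If nothing reloads fr0/v0 between the reclaim and the recheckpoint, the
corrupted scratch values end up checkpointed and later become visible to
userspace.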

This commit fixes the corruption by restoring vs0 and vs32 from the
ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
respectively.

The effect of the issue described above is observed, for instance, once a
VSX unavailable exception is caught in the middle of a transaction with
MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.

The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state and
ckvr_state are both copied from fp_state and vr_state, respectively, and
on recheckpointing both states will be restored from these thread
structures and not from the live state.

The issue also does not occur if MSR.FP = 1 and MSR.VEC = 1 because it
implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
happen in the middle of the transactional block.

Finally, this commit also fixes the MSR used to check if the FP or VEC bits are
enabled once we are in tm_reclaim_thread(). Checking ckpt_regs.msr is not
correct because giveup_all(), which copies regs->msr to ckpt_regs.msr, has
not been called yet, so ckpt_regs.msr at that point is not valid, i.e.
it does not reflect the MSR state in userspace.

No regression was observed on powerpc/tm selftests after this fix.

Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c | 15 ++++++++-------
 arch/powerpc/kernel/tm.S      | 16 ++++++++++++++++
 2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 2ad725e..df8e348 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -870,21 +870,22 @@ static void tm_reclaim_thread(struct thread_struct *thr,
 	 * state is the same as the live state. We need to copy the
 	 * live state to the checkpointed state so that when the
 	 * transaction is restored, the checkpointed state is correct
-	 * and the aborted transaction sees the correct state. We use
-	 * ckpt_regs.msr here as that's what tm_reclaim will use to
-	 * determine if it's going to write the checkpointed state or
-	 * not. So either this will write the checkpointed registers,
-	 * or reclaim will. Similarly for VMX.
+	 * and the aborted transaction sees the correct state. So
+	 * either this will write the checkpointed registers, or
+	 * reclaim will. Similarly for VMX.
 	 */
-	if ((thr->ckpt_regs.msr & MSR_FP) == 0)
+	if ((thr->regs->msr & MSR_FP) == 0)
 		memcpy(&thr->ckfp_state, &thr->fp_state,
 		       sizeof(struct thread_fp_state));
-	if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
+	if ((thr->regs->msr & MSR_VEC) == 0)
 		memcpy(&thr->ckvr_state, &thr->vr_state,
 		       sizeof(struct thread_vr_state));
 
 	giveup_all(container_of(thr, struct task_struct, thread));
 
+	/* After giveup_all(), ckpt_regs.msr contains the same value
+	 * of regs->msr when we entered in kernel space.
+	 */
 	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
 }
 
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index 3a2d041..67b56af 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -259,9 +259,18 @@ _GLOBAL(tm_reclaim)
 
 	addi	r7, r3, THREAD_CKVRSTATE
 	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
+
+	/* Corrupting v0 (vs32). Should restore it later. */
 	mfvscr	v0
 	li	r6, VRSTATE_VSCR
 	stvx	v0, r7, r6
+
+	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
+	 * recheckpoint the wrong live value.
+	 */
+	lxvx	vs32, 0, r7
+	xxswapd	vs32, vs32
+
 dont_backup_vec:
 	mfspr	r0, SPRN_VRSAVE
 	std	r0, THREAD_CKVRSAVE(r3)
@@ -272,9 +281,16 @@ dont_backup_vec:
 	addi	r7, r3, THREAD_CKFPSTATE
 	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
 
+	/* Corrupting fr0 (vs0). Should restore it later. */
 	mffs    fr0
 	stfd    fr0,FPSTATE_FPSCR(r7)
 
+	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
+	 * recheckpoint the wrong live value.
+	 */
+	lxvx	vs0, 0, r7
+	xxswapd vs0, vs0
+
 dont_backup_fp:
 
 	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception
  2017-06-30  0:44 [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Gustavo Romero
@ 2017-06-30  0:44 ` Gustavo Romero
  2017-07-04  0:49   ` Cyril Bur
  2017-06-30 16:41 ` [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Breno Leitao
  2017-07-04  0:19 ` Cyril Bur
  2 siblings, 1 reply; 12+ messages in thread
From: Gustavo Romero @ 2017-06-30  0:44 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leitao, Gustavo Romero

Add a test to check if FP/VSX registers are sane (restored correctly) after
a VSX unavailable exception is caught in the middle of a transaction.

Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/testing/selftests/powerpc/tm/Makefile        |   3 +-
 .../testing/selftests/powerpc/tm/tm-vsx-unavail.c  | 144 +++++++++++++++++++++
 2 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c

diff --git a/tools/testing/selftests/powerpc/tm/Makefile b/tools/testing/selftests/powerpc/tm/Makefile
index 958c11c..d0ffbb8 100644
--- a/tools/testing/selftests/powerpc/tm/Makefile
+++ b/tools/testing/selftests/powerpc/tm/Makefile
@@ -2,7 +2,7 @@ SIGNAL_CONTEXT_CHK_TESTS := tm-signal-context-chk-gpr tm-signal-context-chk-fpu
 	tm-signal-context-chk-vmx tm-signal-context-chk-vsx
 
 TEST_GEN_PROGS := tm-resched-dscr tm-syscall tm-signal-msr-resv tm-signal-stack \
-	tm-vmxcopy tm-fork tm-tar tm-tmspr tm-vmx-unavail \
+	tm-vmxcopy tm-fork tm-tar tm-tmspr tm-vmx-unavail tm-vsx-unavail \
 	$(SIGNAL_CONTEXT_CHK_TESTS)
 
 include ../../lib.mk
@@ -15,6 +15,7 @@ $(OUTPUT)/tm-syscall: tm-syscall-asm.S
 $(OUTPUT)/tm-syscall: CFLAGS += -I../../../../../usr/include
 $(OUTPUT)/tm-tmspr: CFLAGS += -pthread
 $(OUTPUT)/tm-vmx-unavail: CFLAGS += -pthread -m64
+$(OUTPUT)/tm-vsx-unavail: CFLAGS += -pthread -m64
 
 SIGNAL_CONTEXT_CHK_TESTS := $(patsubst %,$(OUTPUT)/%,$(SIGNAL_CONTEXT_CHK_TESTS))
 $(SIGNAL_CONTEXT_CHK_TESTS): tm-signal.S
diff --git a/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c b/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c
new file mode 100644
index 0000000..7ff933a
--- /dev/null
+++ b/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c
@@ -0,0 +1,144 @@
+/*
+ * Copyright 2017, Gustavo Romero and Breno Leitao, IBM Corp.
+ * Licensed under GPLv2.
+ *
+ * Force VSX unavailable exception during a transaction and see
+ * if it corrupts the checkpointed FP registers state after the abort.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <pthread.h>
+#include <sched.h>
+
+#include "tm.h"
+#include "utils.h"
+
+int passed;
+
+void *ping(void *not_used)
+{
+	asm goto(
+		// r3 = 0x5555555555555555
+		"lis  3, 0x5555    ;"
+		"ori  3, 3, 0x5555 ;"
+		"sldi 3, 3, 32     ;"
+		"oris 3, 3, 0x5555 ;"
+		"ori  3, 3, 0x5555 ;"
+
+		//r4 = 0xFFFFFFFFFFFFFFFF
+		"lis  4, 0xFFFF    ;"
+		"ori  4, 4, 0xFFFF ;"
+		"sldi 4, 4, 32     ;"
+		"oris 4, 4, 0xFFFF ;"
+		"ori  4, 4, 0xFFFF ;"
+
+		// vs33 and vs34 will just be used to construct vs0 from r3 and
+		// r4. Both won't be used in any other place after that.
+		"mtvsrd 33, 3      ;"
+		"mtvsrd 34, 4      ;"
+
+		// vs0 = (r3 || r4) = 0x5555555555555555FFFFFFFFFFFFFFFF
+		"xxmrghd 0, 33, 34 ;"
+
+
+		// Wait ~8s so we have a sufficient amount of context
+		// switches so load_fp and load_vec overflow and MSR.FP, MSR.VEC
+		// and MSR.VSX are disabled.
+		"       lis	 7, 0x1       ;"
+		"       ori      7, 7, 0xBFFE ;"
+		"       sldi     7, 7, 15     ;"
+		"1:	addi     7, 7, -1     ;"
+		"       cmpdi    7, 0         ;"
+		"       bne      1b           ;"
+
+		// Any floating-point instruction in here.
+		// N.B. 'fmr' is *not touching* any previously set register,
+		// i.e. it's not touching vs0.
+		"fmr    10, 10     ;"
+
+		// vs0 is *still* 0x5555555555555555FFFFFFFFFFFFFFFF, right?
+		// Get in a transaction and cause a VSX unavailable exception.
+		"2:	tbegin.               ;" // Begin HTM
+		"       beq      3f           ;" // Failure handler
+		"       xxmrghd  10, 10, 10   ;" // VSX unavailable in TM
+		"       tend.                 ;" // End HTM
+		"3:     nop                   ;" // Fall through to code below
+
+		// Immediately after a transaction failure we save vs0 to two
+		// general purpose registers to check its value. We need to have
+		// the same value as before we entered in transactional state.
+
+		// vs0 should be *still* 0x5555555555555555FFFFFFFFFFFFFFFF
+
+		// Save high half - MSB (64bit).
+		"mfvsrd  5, 0                 ;"
+
+		// Save low half - LSB (64bit).
+		// We mess with vs3, but it's not important.
+		"xxsldwi 3, 0, 0, 2           ;"
+		"mfvsrd  6, 3                 ;"
+
+		// N.B. r3 and r4 never changed since they were used to
+		// construct the initial vs0 value, hence we can use them to do
+		// the comparison. r3 and r4 will be destroyed but it's ok.
+		"cmpd  3, 5                ;" // compare r3 to r5
+		"bne   %[value_mismatch]      ;"
+		"cmpd  4, 6                ;" // compare r4 to r6
+		"bne   %[value_mismatch]      ;"
+		"b     %[value_ok]            ;"
+		:
+		:
+		: "r3", "r4", "vs33", "vs34", "vs0",
+		  "vs10", "fr10", "r7", "r5", "r6", "vs3"
+		: value_mismatch, value_ok
+		);
+value_mismatch:
+		passed = 0;
+		return NULL;
+value_ok:
+		passed = 1;
+		return NULL;
+}
+
+void *pong(void *not_used)
+{
+	while (1)
+		sched_yield(); // will be classed as interactive-like thread
+}
+
+int tm_vsx_unavail_test(void)
+{
+	pthread_t t0, t1;
+	pthread_attr_t attr;
+	cpu_set_t cpuset;
+
+	// Set only CPU 0 in the mask. Both threads will be bound to cpu 0
+	CPU_ZERO(&cpuset);
+	CPU_SET(0, &cpuset);
+
+	// Init pthread attribute
+	pthread_attr_init(&attr);
+
+	// Set CPU 0 mask into the pthread attribute
+	pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpuset);
+
+	// 'pong' thread used to induce context switches on 'ping' thread
+	pthread_create(&t1, &attr /* bound to cpu 0 */, pong, NULL);
+
+	printf("Checking if FP/VSX is sane after a VSX exception in TM...\n");
+
+	pthread_create(&t0, &attr /* bound to cpu 0 as well */, ping, NULL);
+	pthread_join(t0, NULL);
+
+	return passed ? EXIT_SUCCESS : EXIT_FAILURE;
+}
+
+int main(int argc, char **argv)
+{
+	return test_harness(tm_vsx_unavail_test, "tm_vsx_unavail_test");
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-06-30  0:44 [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Gustavo Romero
  2017-06-30  0:44 ` [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception Gustavo Romero
@ 2017-06-30 16:41 ` Breno Leitao
  2017-07-04  0:37   ` Cyril Bur
  2017-07-04  0:19 ` Cyril Bur
  2 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2017-06-30 16:41 UTC (permalink / raw)
  To: Gustavo Romero; +Cc: linuxppc-dev

Thanks Gustavo for the patch.

On Thu, Jun 29, 2017 at 08:39:23PM -0400, Gustavo Romero wrote:
> Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
> Later, we recheckpoint trusting that the live state of FP and VEC are ok
> depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> means the FP registers checkpointed before entering are correct and so
> after a treclaim. we can trust the FP live state, and similarly on VEC regs.
> However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> will recheckpoint a corrupted instate back to the checkpoint area.
> 
> That commit fixes the corruption by restoring vs0 and vs32 from the
> ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> respectively.
> 
> The effect of the issue described above is observed, for instance, once a
> VSX unavailable exception is caught in the middle of a transaction with
> MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> 
> The issue does occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state and
> ckvr_state are both copied from fp_state and vr_state, respectively, and
> on recheckpointing both states will be restored from these thread
> structures and not from the live state.

Just complementing what Gustavo said, currently the tm_recheckpoint()
function grabs the values to be re-checkpointed from two places, depending
on the MSR configuration.

If VEC or FP are enabled in the MSR argument when tm_recheckpoint() is
called, then it restores the values from the thread ckvr/ckfp area and
recheckpoints these values into the processor transactional area.

On the other side, if VEC or FP are disabled, it does not restore the
vector and fp registers from the thread ckvr/ckfp area, and it will
recheckpoint whatever is currently in the live registers. If the registers
changed after the reclaim, these new values get recheckpointed and are
later seen by userspace when the transaction fails.
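
In pseudocode, the selection I am describing is roughly the following
(simplified, not the literal tm.S code; load_fp_state()/load_vr_state()
stand in for the REST_* macros):

	/* rough sketch of __tm_recheckpoint(thr, msr) */
	if (msr & MSR_VEC)
		load_vr_state(&thr->ckvr_state);  /* reload live VRs from the thread struct */
	/* else: keep whatever is live in the VRs */

	if (msr & MSR_FP)
		load_fp_state(&thr->ckfp_state);  /* reload live FPRs/VSRs from the thread struct */
	/* else: keep whatever is live in the FPRs */

	trechkpt();  /* checkpoint the current live state */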

This second scenario is what is causing the error reported on this
email thread. In fact, I am wondering if we can hit another hidden bug that
changes the fp and vec values between the tm_reclaim() and
tm_recheckpoint() invocations, for example an interrupt that calls
memcpy() in that small window.

That said, I am wondering if we shouldn't always rely on thread ckfp and
ckvec during tm_recheckpoint() (and never rely on the live registers). This
should simplify the reclaim/recheckpoint mechanism, and make it less
error prone.
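
In other words, always call it as something like

	tm_recheckpoint(&current->thread, MSR_FP | MSR_VEC);

so that both ckfp_state and ckvr_state are reloaded before the trechkpt.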

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-06-30  0:44 [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Gustavo Romero
  2017-06-30  0:44 ` [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception Gustavo Romero
  2017-06-30 16:41 ` [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Breno Leitao
@ 2017-07-04  0:19 ` Cyril Bur
  2017-07-04 20:45   ` [PATCH] " Gustavo Romero
  2 siblings, 1 reply; 12+ messages in thread
From: Cyril Bur @ 2017-07-04  0:19 UTC (permalink / raw)
  To: Gustavo Romero, linuxppc-dev; +Cc: leitao

On Thu, 2017-06-29 at 20:44 -0400, Gustavo Romero wrote:
> Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
> Later, we recheckpoint trusting that the live state of FP and VEC are ok
> depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> means the FP registers checkpointed before entering are correct and so
> after a treclaim. we can trust the FP live state, and similarly on VEC regs.
> However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> will recheckpoint a corrupted instate back to the checkpoint area.
> 
> That commit fixes the corruption by restoring vs0 and vs32 from the
> ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> respectively.
> 
> The effect of the issue described above is observed, for instance, once a
> VSX unavailable exception is caught in the middle of a transaction with
> MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> 
> The issue does occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state and
> ckvr_state are both copied from fp_state and vr_state, respectively, and
> on recheckpointing both states will be restored from these thread
> structures and not from the live state.
> 
> The issue does no occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> happen in the middle of the transactional block.
> 
> Finally, that commit also fixes the MSR used to check if FP or VEC bit are
> enabled once we are in tm_reclaim_thread(). Checking ckpt_regs.msr is not
> correct because giveup_all(), which copies regs->msr to ckpt_regs.msr, was
> not called before and so the ckpt_regs.msr at that point is not valid, i.e.
> it does not reflect the MSR state in userspace.
> 
> No regression was observed on powerpc/tm selftests after this fix.
> 
> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/process.c | 15 ++++++++-------
>  arch/powerpc/kernel/tm.S      | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 2ad725e..df8e348 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -870,21 +870,22 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  	 * state is the same as the live state. We need to copy the
>  	 * live state to the checkpointed state so that when the
>  	 * transaction is restored, the checkpointed state is correct
> -	 * and the aborted transaction sees the correct state. We use
> -	 * ckpt_regs.msr here as that's what tm_reclaim will use to
> -	 * determine if it's going to write the checkpointed state or
> -	 * not. So either this will write the checkpointed registers,
> -	 * or reclaim will. Similarly for VMX.
> +	 * and the aborted transaction sees the correct state. So
> +	 * either this will write the checkpointed registers, or
> +	 * reclaim will. Similarly for VMX.
>  	 */
> -	if ((thr->ckpt_regs.msr & MSR_FP) == 0)
> +	if ((thr->regs->msr & MSR_FP) == 0)
>  		memcpy(&thr->ckfp_state, &thr->fp_state,
>  		       sizeof(struct thread_fp_state));
> -	if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
> +	if ((thr->regs->msr & MSR_VEC) == 0)
>  		memcpy(&thr->ckvr_state, &thr->vr_state,
>  		       sizeof(struct thread_vr_state));
>  

So the name of that variable is horrifying - I don't know why it is
called that, I expect to save 'space', but it's not helping anyone. Poor
variable names notwithstanding, I believe the original code is correct.

What ckpt_regs.msr means here is the MSR that the thread had when
it came off the processor. The reason for this is that
giveup_fpu() (and friends) will modify thread.regs->msr when doing the
giveup, so when reclaiming we can't trust it to know what the live state
really was. check_if_tm_restore_required() copies thread.regs->msr
into the checkpointed structure so we know whether the thread was really
using FP/VMX/VSX or not. check_if_tm_restore_required() is called
before any giveup operation.
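
For reference, the helper is roughly this (paraphrased from memory of
process.c, so treat it as a sketch rather than the exact code):

	static inline void check_if_tm_restore_required(struct task_struct *tsk)
	{
		/* Snapshot the user MSR into ckpt_regs.msr before the
		 * giveup_*() functions start clearing bits in regs->msr. */
		if (tsk == current && tsk->thread.regs &&
		    MSR_TM_ACTIVE(tsk->thread.regs->msr) &&
		    !test_thread_flag(TIF_RESTORE_TM)) {
			tsk->thread.ckpt_regs.msr = tsk->thread.regs->msr;
			set_thread_flag(TIF_RESTORE_TM);
		}
	}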

I do wonder if it would make more sense for the giveup_all() below to
be above. I don't think there are any code paths that can get here
without having already done a giveup_all(), but perhaps it is possible.
I don't think it will hurt to move it up; it feels more correct.

>  	giveup_all(container_of(thr, struct task_struct, thread));
>  
> +	/* After giveup_all(), ckpt_regs.msr contains the same value
> +	 * of regs->msr when we entered in kernel space.
> +	 */
>  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
>  }
>  
> diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> index 3a2d041..67b56af 100644
> --- a/arch/powerpc/kernel/tm.S
> +++ b/arch/powerpc/kernel/tm.S
> @@ -259,9 +259,18 @@ _GLOBAL(tm_reclaim)
>  
>  	addi	r7, r3, THREAD_CKVRSTATE
>  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> +
> +	/* Corrupting v0 (vs32). Should restore it later. */
>  	mfvscr	v0
>  	li	r6, VRSTATE_VSCR
>  	stvx	v0, r7, r6
> +
> +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	lxvx	vs32, 0, r7
> +	xxswapd	vs32, vs32

Yes good find!

I feel like the crux of the problem is that we don't always
tm_recheckpoint() with the same msr as we tm_reclaim()ed with - is THIS
the core of the problem? In the traps.c case it is an optimisation,
perhaps a pointless one; if I had spare time I would benchmark whether it
is worth it.

This code absolutely can't hurt and does solve a real problem.

Perhaps use the macro, I think what you want is:

/* r0 evaluates to literal zero pp 489 ISA 3.0 */
LXVD2X_ROT(32, r0, r7)
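/* If I'm remembering ppc_asm.h right, on little-endian this expands to
 * lxvd2x followed by xxswapd, and on big-endian to just the lxvd2x, so
 * the doubleword swap is handled for you. */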

> +
>  dont_backup_vec:
>  	mfspr	r0, SPRN_VRSAVE
>  	std	r0, THREAD_CKVRSAVE(r3)
> @@ -272,9 +281,16 @@ dont_backup_vec:
>  	addi	r7, r3, THREAD_CKFPSTATE
>  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
>  
> +	/* Corrupting fr0 (vs0). Should restore it later. */
>  	mffs    fr0
>  	stfd    fr0,FPSTATE_FPSCR(r7)
>  
> +	/* Restore fr0 (vs32) from ckfp_state.fpr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	lxvx	vs0, 0, r7
> +	xxswapd vs0, vs0

/* r0 evaluates to literal zero pp 489 ISA 3.0 */
LXVD2X_ROT(0, r0, r7)


> +
>  dont_backup_fp:
>  
>  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-06-30 16:41 ` [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Breno Leitao
@ 2017-07-04  0:37   ` Cyril Bur
  0 siblings, 0 replies; 12+ messages in thread
From: Cyril Bur @ 2017-07-04  0:37 UTC (permalink / raw)
  To: Breno Leitao, Gustavo Romero; +Cc: linuxppc-dev

On Fri, 2017-06-30 at 13:41 -0300, Breno Leitao wrote:
> Thanks Gustavo for the patch.
> 
> On Thu, Jun 29, 2017 at 08:39:23PM -0400, Gustavo Romero wrote:
> > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> > 
> > Later, we recheckpoint trusting that the live state of FP and VEC are ok
> > depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> > means the FP registers checkpointed before entering are correct and so
> > after a treclaim. we can trust the FP live state, and similarly on VEC regs.
> > However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> > will recheckpoint a corrupted instate back to the checkpoint area.
> > 
> > That commit fixes the corruption by restoring vs0 and vs32 from the
> > ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> > respectively.
> > 
> > The effect of the issue described above is observed, for instance, once a
> > VSX unavailable exception is caught in the middle of a transaction with
> > MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> > space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> > 
> > The issue does occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state and
> > ckvr_state are both copied from fp_state and vr_state, respectively, and
> > on recheckpointing both states will be restored from these thread
> > structures and not from the live state.
> 
> Just complementing what Gustavo said, currently the tm_recheckpoint()
> function grabs the values to be re-checkpoint from two places, depending
> on the MSR configuration.
> 
> If VEC or FP are disabled on MSR argument when tm_recheckpoint() is
> called, then it restorese the values from the thread ckvr/ckfp area and
> recheckpoint these values into processor transactional area.
> 
> On the other side, if VEC or FP are disabled, it does not restore the
> vectors and fp registers from the thread ckvec/ckfp area, and it will
> recheckpoint what is currently on the live registers. If the registers
> changed after the reclaim, we will send these new values recheckpointed,
> and later on userspace when the transaction fails.
> 
> This second scenario is what is causing the error reported on this
> email thread. In fact, I am wondering if we can hit another hidden bug that
> changes the fp and vec values between the tm_reclaim() and
> tm_recheckpoint() invokations, as for example, an interrupt that calls
> memcpy() in this small mean time.
> 
> That said, I am wondering if we shouldn't always rely on thread ckfp and
> ckvec during tm_recheckpoint() (and never rely on the live registers). This
> should simplify the reclaim/recheckpoint mechanism, and make it less
> error prone.

I think you're absolutely correct - it's a mess. Personally I think that
the assembly functions should do the bare minimum. So the two cases
that you describe which are handled in assembly could easily be handled
in C either before the call to the assembly or after.

Cyril

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception
  2017-06-30  0:44 ` [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception Gustavo Romero
@ 2017-07-04  0:49   ` Cyril Bur
  0 siblings, 0 replies; 12+ messages in thread
From: Cyril Bur @ 2017-07-04  0:49 UTC (permalink / raw)
  To: Gustavo Romero, linuxppc-dev; +Cc: leitao

On Thu, 2017-06-29 at 20:44 -0400, Gustavo Romero wrote:
> Add a test to check if FP/VSX registers are sane (restored correctly) after
> a VSX unavailable exception is caught in the middle of a transaction.
> 
> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Looks good, it's a nice test for a very difficult-to-hit bug (difficult
on purpose, but still possible by accident!).

Reviewed-by: Cyril Bur <cyrilbur@gmail.com>

> ---
>  tools/testing/selftests/powerpc/tm/Makefile        |   3 +-
>  .../testing/selftests/powerpc/tm/tm-vsx-unavail.c  | 144 +++++++++++++++++++++
>  2 files changed, 146 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c
> 
> diff --git a/tools/testing/selftests/powerpc/tm/Makefile b/tools/testing/selftests/powerpc/tm/Makefile
> index 958c11c..d0ffbb8 100644
> --- a/tools/testing/selftests/powerpc/tm/Makefile
> +++ b/tools/testing/selftests/powerpc/tm/Makefile
> @@ -2,7 +2,7 @@ SIGNAL_CONTEXT_CHK_TESTS := tm-signal-context-chk-gpr tm-signal-context-chk-fpu
>  	tm-signal-context-chk-vmx tm-signal-context-chk-vsx
>  
>  TEST_GEN_PROGS := tm-resched-dscr tm-syscall tm-signal-msr-resv tm-signal-stack \
> -	tm-vmxcopy tm-fork tm-tar tm-tmspr tm-vmx-unavail \
> +	tm-vmxcopy tm-fork tm-tar tm-tmspr tm-vmx-unavail tm-vsx-unavail \
>  	$(SIGNAL_CONTEXT_CHK_TESTS)
>  
>  include ../../lib.mk
> @@ -15,6 +15,7 @@ $(OUTPUT)/tm-syscall: tm-syscall-asm.S
>  $(OUTPUT)/tm-syscall: CFLAGS += -I../../../../../usr/include
>  $(OUTPUT)/tm-tmspr: CFLAGS += -pthread
>  $(OUTPUT)/tm-vmx-unavail: CFLAGS += -pthread -m64
> +$(OUTPUT)/tm-vsx-unavail: CFLAGS += -pthread -m64
>  
>  SIGNAL_CONTEXT_CHK_TESTS := $(patsubst %,$(OUTPUT)/%,$(SIGNAL_CONTEXT_CHK_TESTS))
>  $(SIGNAL_CONTEXT_CHK_TESTS): tm-signal.S
> diff --git a/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c b/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c
> new file mode 100644
> index 0000000..7ff933a
> --- /dev/null
> +++ b/tools/testing/selftests/powerpc/tm/tm-vsx-unavail.c
> @@ -0,0 +1,144 @@
> +/*
> + * Copyright 2017, Gustavo Romero and Breno Leitao, IBM Corp.
> + * Licensed under GPLv2.
> + *
> + * Force VSX unavailable exception during a transaction and see
> + * if it corrupts the checkpointed FP registers state after the abort.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <unistd.h>
> +#include <inttypes.h>
> +#include <pthread.h>
> +#include <sched.h>
> +
> +#include "tm.h"
> +#include "utils.h"
> +
> +int passed;
> +
> +void *ping(void *not_used)
> +{
> +	asm goto(
> +		// r3 = 0x5555555555555555
> +		"lis  3, 0x5555    ;"
> +		"ori  3, 3, 0x5555 ;"
> +		"sldi 3, 3, 32     ;"
> +		"oris 3, 3, 0x5555 ;"
> +		"ori  3, 3, 0x5555 ;"
> +
> +		//r4 = 0xFFFFFFFFFFFFFFFF
> +		"lis  4, 0xFFFF    ;"
> +		"ori  4, 4, 0xFFFF ;"
> +		"sldi 4, 4, 32     ;"
> +		"oris 4, 4, 0xFFFF ;"
> +		"ori  4, 4, 0xFFFF ;"
> +
> +		// vs33 and vs34 will just be used to construct vs0 from r3 and
> +		// r4. Both won't be used in any other place after that.
> +		"mtvsrd 33, 3      ;"
> +		"mtvsrd 34, 4      ;"
> +
> +		// vs0 = (r3 || r4) = 0x5555555555555555FFFFFFFFFFFFFFFF
> +		"xxmrghd 0, 33, 34 ;"
> +
> +
> +		// Wait ~8s so we have a sufficient amount of context
> +		// switches so load_fp and load_vec overflow and MSR.FP, MSR.VEC
> +		// and MSR.VSX are disabled.
> +		"       lis	 7, 0x1       ;"
> +		"       ori      7, 7, 0xBFFE ;"
> +		"       sldi     7, 7, 15     ;"
> +		"1:	addi     7, 7, -1     ;"
> +		"       cmpdi    7, 0         ;"
> +		"       bne      1b           ;"
> +
> +		// Any floating-point instruction in here.
> +		// N.B. 'fmr' is *not touching* any previously set register,
> +		// i.e. it's not touching vs0.
> +		"fmr    10, 10     ;"
> +
> +		// vs0 is *still* 0x5555555555555555FFFFFFFFFFFFFFFF, right?
> +		// Get in a transaction and cause a VSX unavailable exception.
> +		"2:	tbegin.               ;" // Begin HTM
> +		"       beq      3f           ;" // Failure handler
> +		"       xxmrghd  10, 10, 10   ;" // VSX unavailable in TM
> +		"       tend.                 ;" // End HTM
> +		"3:     nop                   ;" // Fall through to code below
> +
> +		// Immediately after a transaction failure we save vs0 to two
> +		// general purpose registers to check its value. We need to have
> +		// the same value as before we entered in transactional state.
> +
> +		// vs0 should be *still* 0x5555555555555555FFFFFFFFFFFFFFFF
> +
> +		// Save high half - MSB (64bit).
> +		"mfvsrd  5, 0                 ;"
> +
> +		// Save low half - LSB (64bit).
> +		// We mess with vs3, but it's not important.
> +		"xxsldwi 3, 0, 0, 2           ;"
> +		"mfvsrd  6, 3                 ;"
> +
> +		// N.B. r3 and r4 never changed since they were used to
> +		// construct the initial vs0 value, hence we can use them to do
> +		// the comparison. r3 and r4 will be destroy but it's ok.
> +		"cmpd  3, 5                ;" // compare r3 to r5
> +		"bne   %[value_mismatch]      ;"
> +		"cmpd  4, 6                ;" // compare r4 to r6
> +		"bne   %[value_mismatch]      ;"
> +		"b     %[value_ok]            ;"
> +		:
> +		:
> +		: "r3", "r4", "vs33", "vs34", "vs0",
> +		  "vs10", "fr10", "r7", "r5", "r6", "vs3"
> +		: value_mismatch, value_ok
> +		);
> +value_mismatch:
> +		passed = 0;
> +		return NULL;
> +value_ok:
> +		passed = 1;
> +		return NULL;
> +}
> +
> +void *pong(void *not_used)
> +{
> +	while (1)
> +		sched_yield(); // will be classed as interactive-like thread
> +}
> +
> +int tm_vsx_unavail_test(void)
> +{
> +	pthread_t t0, t1;
> +	pthread_attr_t attr;
> +	cpu_set_t cpuset;
> +
> +	// Set only CPU 0 in the mask. Both threads will be bound to cpu 0
> +	CPU_ZERO(&cpuset);
> +	CPU_SET(0, &cpuset);
> +
> +	// Init pthread attribute
> +	pthread_attr_init(&attr);
> +
> +	// Set CPU 0 mask into the pthread attribute
> +	pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpuset);
> +
> +	// 'pong' thread used to induce context switches on 'ping' thread
> +	pthread_create(&t1, &attr /* bound to cpu 0 */, pong, NULL);
> +
> +	printf("Checking if FP/VSX is sane after a VSX exception in TM...\n");
> +
> +	pthread_create(&t0, &attr /* bound to cpu 0 as well */, ping, NULL);
> +	pthread_join(t0, NULL);
> +
> +	return passed ? EXIT_SUCCESS : EXIT_FAILURE;
> +}
> +
> +int main(int argc, char **argv)
> +{
> +	return test_harness(tm_vsx_unavail_test, "tm_vsx_unavail_test");
> +}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-07-04  0:19 ` Cyril Bur
@ 2017-07-04 20:45   ` Gustavo Romero
  2017-07-05  1:02     ` Michael Neuling
  2022-03-11 16:27     ` Christophe Leroy
  0 siblings, 2 replies; 12+ messages in thread
From: Gustavo Romero @ 2017-07-04 20:45 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gustavo Romero, Breno Leitao

Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.

Later, we recheckpoint trusting that the live state of FP and VEC are ok
depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
means the FP registers checkpointed when we entered TM are correct and,
after a treclaim, we can trust the FP live state. Similarly for the VEC regs.
However, if tm_reclaim() does not return a sane state then tm_recheckpoint()
will recheckpoint a corrupted state from the live state back to the checkpoint
area.

This commit fixes the corruption by restoring vs0 and vs32 from the
ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
respectively.

The effect of the issue described above is observed, for instance, once a
VSX unavailable exception is caught in the middle of a transaction with
MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.

The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
and ckvr_state are both copied from fp_state and vr_state, respectively,
and on recheckpointing both states will be restored from these thread
structures and not from the live state.

The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
happen in the middle of the transactional block.

Finally, this commit also fixes the MSR used to check if the FP and VEC bits
are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
if giveup_all() is called *before* using ckpt_regs.msr for checks because
check_if_tm_restore_required() in giveup_all() will copy regs->msr to
ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
had when it came off the processor.
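
In short, the intended ordering is (illustrative sketch only; the actual
change is in the diff below):

	giveup_all(tsk);  /* check_if_tm_restore_required(): ckpt_regs.msr = regs->msr */
	/* only from here on is ckpt_regs.msr a faithful copy of the user MSR */
	if ((thr->ckpt_regs.msr & MSR_FP) == 0)
		memcpy(&thr->ckfp_state, &thr->fp_state,
		       sizeof(struct thread_fp_state));
	if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
		memcpy(&thr->ckvr_state, &thr->vr_state,
		       sizeof(struct thread_vr_state));
	tm_reclaim(thr, thr->ckpt_regs.msr, cause);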

No regression was observed on powerpc/tm selftests after this fix.

Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 arch/powerpc/kernel/process.c |  9 +++++++--
 arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 2ad725e..ac1fc51 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
 	if (!MSR_TM_SUSPENDED(mfmsr()))
 		return;
 
+	/* Copy regs->msr to ckpt_regs.msr making the last valid for
+	 * the checks below. check_if_tm_restore_required() in
+	 * giveup_all() will take care of it. Also update fp_state
+	 * and vr_state from live state if the live state is valid.
+	 */
+	giveup_all(container_of(thr, struct task_struct, thread));
+
 	/*
 	 * If we are in a transaction and FP is off then we can't have
 	 * used FP inside that transaction. Hence the checkpointed
@@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
 		memcpy(&thr->ckvr_state, &thr->vr_state,
 		       sizeof(struct thread_vr_state));
 
-	giveup_all(container_of(thr, struct task_struct, thread));
-
 	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
 }
 
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
index 3a2d041..5dfbddb 100644
--- a/arch/powerpc/kernel/tm.S
+++ b/arch/powerpc/kernel/tm.S
@@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
 
 	addi	r7, r3, THREAD_CKVRSTATE
 	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
+
+	/* Corrupting v0 (vs32). Should restore it later. */
 	mfvscr	v0
 	li	r6, VRSTATE_VSCR
 	stvx	v0, r7, r6
+
+	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
+	 * recheckpoint the wrong live value.
+	 */
+	LXVD2X_ROT(32, R0, R7)
+
 dont_backup_vec:
 	mfspr	r0, SPRN_VRSAVE
 	std	r0, THREAD_CKVRSAVE(r3)
@@ -272,9 +280,15 @@ dont_backup_vec:
 	addi	r7, r3, THREAD_CKFPSTATE
 	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
 
+	/* Corrupting fr0 (vs0). Should restore it later. */
 	mffs    fr0
 	stfd    fr0,FPSTATE_FPSCR(r7)
 
+	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
+	 * recheckpoint the wrong live value.
+	 */
+	LXVD2X_ROT(0, R0, R7)
+
 dont_backup_fp:
 
 	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-07-04 20:45   ` [PATCH] " Gustavo Romero
@ 2017-07-05  1:02     ` Michael Neuling
  2017-07-05 20:57       ` Breno Leitao
  2017-10-26  4:57       ` Cyril Bur
  2022-03-11 16:27     ` Christophe Leroy
  1 sibling, 2 replies; 12+ messages in thread
From: Michael Neuling @ 2017-07-05  1:02 UTC (permalink / raw)
  To: Gustavo Romero, linuxppc-dev; +Cc: Breno Leitao, Cyril Bur

On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.

tm_reclaim() should have no state live in the registers once it returns.  It
should all be saved in the thread struct. The above is not an issue in my book.

Having a quick look at the code, I think there's an issue but we need something
more like this (completely untested).

When we recheckpoint inside an fp unavail, we need to recheckpoint vec if it was
enabled.  Currently we only ever recheckpoint the FP which seems like a bug.
Vice versa for the other way around.

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index d4e545d27e..d1184264e2 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
         * If VMX is in use, the VRs now hold checkpointed values,
         * so we don't want to load the VRs from the thread_struct.
         */
-       tm_recheckpoint(&current->thread, MSR_FP);
+       tm_recheckpoint(&current->thread, regs->msr);
 
        /* If VMX is in use, get the transactional values back */
        if (regs->msr & MSR_VEC) {
@@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
                 regs->nip, regs->msr);
        tm_reclaim_current(TM_CAUSE_FAC_UNAV);
       regs->msr |= MSR_VEC;
-       tm_recheckpoint(&current->thread, MSR_VEC);
+       tm_recheckpoint(&current->thread, regs->msr);
       current->thread.used_vr = 1;
 
        if (regs->msr & MSR_FP) {


> Later, we recheckpoint trusting that the live state of FP and VEC are ok
> depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> means the FP registers checkpointed when we entered in TM are correct and
> after a treclaim. we can trust the FP live state. Similarly to VEC regs.
> However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> will recheckpoint a corrupted state from live state back to the checkpoint
> area.




> That commit fixes the corruption by restoring vs0 and vs32 from the
> ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> respectively.
> 
> The effect of the issue described above is observed, for instance, once a
> VSX unavailable exception is caught in the middle of a transaction with
> MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> 
> The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> and ckvr_state are both copied from fp_state and vr_state, respectively,
> and on recheckpointing both states will be restored from these thread
> structures and not from the live state.
> 
> The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> happen in the middle of the transactional block.
> 
> Finally, that commit also fixes the MSR used to check if FP and VEC bits
> are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
> if giveup_all() is called *before* using ckpt_regs.msr for checks because
> check_if_tm_restore_required() in giveup_all() will copy regs->msr to
> ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
> had when it came off the processor.
> 
> No regression was observed on powerpc/tm selftests after this fix.
> 
> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  arch/powerpc/kernel/process.c |  9 +++++++--
>  arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 2ad725e..ac1fc51 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  	if (!MSR_TM_SUSPENDED(mfmsr()))
>  		return;
>  
> +	/* Copy regs->msr to ckpt_regs.msr making the last valid for
> +	 * the checks below. check_if_tm_restore_required() in
> +	 * giveup_all() will take care of it. Also update fp_state
> +	 * and vr_state from live state if the live state is valid.
> +	 */
> +	giveup_all(container_of(thr, struct task_struct, thread));
> +
>  	/*
>  	 * If we are in a transaction and FP is off then we can't have
>  	 * used FP inside that transaction. Hence the checkpointed
> @@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  		memcpy(&thr->ckvr_state, &thr->vr_state,
>  		       sizeof(struct thread_vr_state));
>  
> -	giveup_all(container_of(thr, struct task_struct, thread));
> -
>  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
>  }
>  
> diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> index 3a2d041..5dfbddb 100644
> --- a/arch/powerpc/kernel/tm.S
> +++ b/arch/powerpc/kernel/tm.S
> @@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
>  
>  	addi	r7, r3, THREAD_CKVRSTATE
>  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> +
> +	/* Corrupting v0 (vs32). Should restore it later. */
>  	mfvscr	v0
>  	li	r6, VRSTATE_VSCR
>  	stvx	v0, r7, r6
> +
> +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	LXVD2X_ROT(32, R0, R7)
> +
>  dont_backup_vec:
>  	mfspr	r0, SPRN_VRSAVE
>  	std	r0, THREAD_CKVRSAVE(r3)
> @@ -272,9 +280,15 @@ dont_backup_vec:
>  	addi	r7, r3, THREAD_CKFPSTATE
>  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
>  
> +	/* Corrupting fr0 (vs0). Should restore it later. */
>  	mffs    fr0
>  	stfd    fr0,FPSTATE_FPSCR(r7)
>  
> +	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	LXVD2X_ROT(0, R0, R7)
> +
>  dont_backup_fp:
>  
>  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-07-05  1:02     ` Michael Neuling
@ 2017-07-05 20:57       ` Breno Leitao
  2017-10-26  4:57       ` Cyril Bur
  1 sibling, 0 replies; 12+ messages in thread
From: Breno Leitao @ 2017-07-05 20:57 UTC (permalink / raw)
  To: Michael Neuling; +Cc: Gustavo Romero, linuxppc-dev, Cyril Bur

Hi Michael,

On Wed, Jul 05, 2017 at 11:02:41AM +1000, Michael Neuling wrote:
> On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
> tm_reclaim() should have no state live in the registers once it returns.  It
> should all be saved in the thread struct. The above is not an issue in my book.

Right, but we will always recheckpoint from the live registers anyway, so if
we do not force MSR_VEC and/or MSR_FP in tm_recheckpoint(), then we will
inevitably put the live registers into the checkpoint area.

It might not be a problem for VEC/FP if they are disabled, since a later
VEC/FP touch will raise a fp/vec_unavailable() exception which will fill
out the registers properly, replacing the old state brought from the
checkpoint area.

> When we recheckpoint inside an fp unavail, we need to recheckpoint vec if it was
> enabled.  Currently we only ever recheckpoint the FP which seems like a bug. 
> Visa versa for the other way around.

This seems to be another problem that also exists in the code, but it is
essentially different from the one in this thread, which happens on the
VSX unavailable exception path. Although essentially different, the solution
might be similar. So, a fix that would resolve all the issues reported here
would look like the following. What do you think?

---
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index d4e545d..76a35ab 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
         * If VMX is in use, the VRs now hold checkpointed values,
         * so we don't want to load the VRs from the thread_struct.
         */
-       tm_recheckpoint(&current->thread, MSR_FP);
+       tm_recheckpoint(&current->thread, regs->msr);
 
        /* If VMX is in use, get the transactional values back */
        if (regs->msr & MSR_VEC) {
@@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
                 regs->nip, regs->msr);
        tm_reclaim_current(TM_CAUSE_FAC_UNAV);
        regs->msr |= MSR_VEC;
-       tm_recheckpoint(&current->thread, MSR_VEC);
+       tm_recheckpoint(&current->thread, regs->msr );
        current->thread.used_vr = 1;
 
        if (regs->msr & MSR_FP) {
@@ -1653,7 +1653,7 @@ void vsx_unavailable_tm(struct pt_regs *regs)
        /* This loads & recheckpoints FP and VRs; but we have
         * to be sure not to overwrite previously-valid state.
         */
-       tm_recheckpoint(&current->thread, regs->msr & ~orig_msr);
+       tm_recheckpoint(&current->thread, regs->msr);
 
        msr_check_and_set(orig_msr & (MSR_FP | MSR_VEC));

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c9..c6abad1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -880,10 +880,10 @@ static void tm_reclaim_thread(struct thread_struct *thr,
         * not. So either this will write the checkpointed registers,
         * or reclaim will. Similarly for VMX.
         */
-       if ((thr->ckpt_regs.msr & MSR_FP) == 0)
+       if ((thr->regs->msr & MSR_FP) == 0)
                memcpy(&thr->ckfp_state, &thr->fp_state,
                       sizeof(struct thread_fp_state));
-       if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
+       if ((thr->regs->msr & MSR_VEC) == 0)
                memcpy(&thr->ckvr_state, &thr->vr_state,
                       sizeof(struct thread_vr_state));

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-07-05  1:02     ` Michael Neuling
  2017-07-05 20:57       ` Breno Leitao
@ 2017-10-26  4:57       ` Cyril Bur
  2017-10-26 17:31         ` Breno Leitao
  1 sibling, 1 reply; 12+ messages in thread
From: Cyril Bur @ 2017-10-26  4:57 UTC (permalink / raw)
  To: Michael Neuling, Gustavo Romero, linuxppc-dev; +Cc: Breno Leitao

On Wed, 2017-07-05 at 11:02 +1000, Michael Neuling wrote:
> On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 

Hi Mikey,

This completely fell off my radar, we do need something merged!

For what it's worth, I like the original patch.

> tm_reclaim() should have no state live in the registers once it returns.  It
> should all be saved in the thread struct. The above is not an issue in my book.
> 

Yeah, this is something I agree with, however, if that is the case then
why have tm_recheckpoint() do partial reloads?

A partial reload only makes sense if we can be sure that reclaim will
have left the state at least (partially) correct - not with (as is the
case today) one corrupted fp or Altivec reg.

> Having a quick look at the code, I think there's and issue but we need something
> more like this (completely untested).
> 
> When we recheckpoint inside an fp unavail, we need to recheckpoint vec if it was
> enabled.  Currently we only ever recheckpoint the FP which seems like a bug. 
> Visa versa for the other way around.
> 

In your example, we don't need to reload VEC if we can trust that
reclaim left the checkpointed regs on the CPU correctly - this patch
achieves this.

Of course I'm more than happy to reduce complexity and not have this
optimisation at all, but then we should remove the entire parameter to
tm_recheckpoint(). Anything in between feels dangerous.
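
(i.e. end up with something like

	void tm_recheckpoint(struct thread_struct *thread);

that always reloads ckfp_state/ckvr_state before the trechkpt., rather than
keeping a half-trusted msr parameter.)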


Cyril


> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index d4e545d27e..d1184264e2 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
>          * If VMX is in use, the VRs now hold checkpointed values,
>          * so we don't want to load the VRs from the thread_struct.
>          */
> -       tm_recheckpoint(&current->thread, MSR_FP);
> +       tm_recheckpoint(&current->thread, regs->msr);
>  
>         /* If VMX is in use, get the transactional values back */
>         if (regs->msr & MSR_VEC) {
> @@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>                  regs->nip, regs->msr);
>         tm_reclaim_current(TM_CAUSE_FAC_UNAV);
>         regs->msr |= MSR_VEC;
> -       tm_recheckpoint(&current->thread, MSR_VEC);
> +       tm_recheckpoint(&current->thread, regs->msr);
>         current->thread.used_vr = 1;
>  
>         if (regs->msr & MSR_FP) {
> 
> 
> > Later, we recheckpoint trusting that the live state of FP and VEC are ok
> > depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> > means the FP registers checkpointed when we entered in TM are correct and
> > after a treclaim. we can trust the FP live state. Similarly to VEC regs.
> > However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> > will recheckpoint a corrupted state from live state back to the checkpoint
> > area.
> 
> 
> 
> 
> > That commit fixes the corruption by restoring vs0 and vs32 from the
> > ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> > respectively.
> > 
> > The effect of the issue described above is observed, for instance, once a
> > VSX unavailable exception is caught in the middle of a transaction with
> > MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> > space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> > 
> > The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> > and ckvr_state are both copied from fp_state and vr_state, respectively,
> > and on recheckpointing both states will be restored from these thread
> > structures and not from the live state.
> > 
> > The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> > implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> > happen in the middle of the transactional block.
> > 
> > Finally, that commit also fixes the MSR used to check if FP and VEC bits
> > are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
> > if giveup_all() is called *before* using ckpt_regs.msr for checks because
> > check_if_tm_restore_required() in giveup_all() will copy regs->msr to
> > ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
> > had when it came off the processor.
> > 
> > No regression was observed on powerpc/tm selftests after this fix.
> > 
> > Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> > ---
> >  arch/powerpc/kernel/process.c |  9 +++++++--
> >  arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 2ad725e..ac1fc51 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  	if (!MSR_TM_SUSPENDED(mfmsr()))
> >  		return;
> >  
> > +	/* Copy regs->msr to ckpt_regs.msr making the last valid for
> > +	 * the checks below. check_if_tm_restore_required() in
> > +	 * giveup_all() will take care of it. Also update fp_state
> > +	 * and vr_state from live state if the live state is valid.
> > +	 */
> > +	giveup_all(container_of(thr, struct task_struct, thread));
> > +
> >  	/*
> >  	 * If we are in a transaction and FP is off then we can't have
> >  	 * used FP inside that transaction. Hence the checkpointed
> > @@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  		memcpy(&thr->ckvr_state, &thr->vr_state,
> >  		       sizeof(struct thread_vr_state));
> >  
> > -	giveup_all(container_of(thr, struct task_struct, thread));
> > -
> >  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
> >  }
> >  
> > diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> > index 3a2d041..5dfbddb 100644
> > --- a/arch/powerpc/kernel/tm.S
> > +++ b/arch/powerpc/kernel/tm.S
> > @@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
> >  
> >  	addi	r7, r3, THREAD_CKVRSTATE
> >  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> > +
> > +	/* Corrupting v0 (vs32). Should restore it later. */
> >  	mfvscr	v0
> >  	li	r6, VRSTATE_VSCR
> >  	stvx	v0, r7, r6
> > +
> > +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(32, R0, R7)
> > +
> >  dont_backup_vec:
> >  	mfspr	r0, SPRN_VRSAVE
> >  	std	r0, THREAD_CKVRSAVE(r3)
> > @@ -272,9 +280,15 @@ dont_backup_vec:
> >  	addi	r7, r3, THREAD_CKFPSTATE
> >  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp
> > state */
> >  
> > +	/* Corrupting fr0 (vs0). Should restore it later. */
> >  	mffs    fr0
> >  	stfd    fr0,FPSTATE_FPSCR(r7)
> >  
> > +	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(0, R0, R7)
> > +
> >  dont_backup_fp:
> >  
> >  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-10-26  4:57       ` Cyril Bur
@ 2017-10-26 17:31         ` Breno Leitao
  0 siblings, 0 replies; 12+ messages in thread
From: Breno Leitao @ 2017-10-26 17:31 UTC (permalink / raw)
  To: Cyril Bur; +Cc: Michael Neuling, Gustavo Romero, linuxppc-dev

Hi Cyril, Mikey,

On Thu, Oct 26, 2017 at 03:57:33PM +1100, Cyril Bur wrote:
> On Wed, 2017-07-05 at 11:02 +1000, Michael Neuling wrote:
> > On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> > > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> > 
> 
> Hi Mikey,
> 
> This completely fell off my radar; we do need something merged!
> 
> For what it's worth, I like the original patch.
> 
> > tm_reclaim() should have no state live in the registers once it returns.  It
> > should all be saved in the thread struct. The above is not an issue in my book.
> > 
> 
> Yeah, this is something I agree with; however, if that is the case, then
> why have tm_recheckpoint() do partial reloads?

I think it is an optimization which unfortunately causes a lot of
bugs and makes the whole code hard to read and debug. I would vote to
change this design to something simpler.

My suggestion is to remove the partial reclaim and partial
recheckpoint altogether.

In this case, all the checkpointed registers would go to the
ckvec/ckfp/ckpt areas in the thread struct, independently of whether the
registers are bogus or not.

Later, __tm_recheckpoint() would recheckpoint all values from the
ckvec/ckfp areas. Of course we can still change those values if we need
to, e.g. to enable a facility such as FP that was disabled (and whose
registers were therefore bogus). That fixup would keep happening between
the reclaim and the recheckpoint. Something along these lines:

        /* Flush the live state and make ckpt_regs.msr valid. */
        giveup_all(container_of(thr, struct task_struct, thread));

        /* Always dump the full checkpointed state to the ck* areas,
         * independently of what was enabled in the transaction.
         */
        tm_reclaim(thr, MSR_FP | MSR_VEC, cause);

        /* For facilities that were off, the ck* contents are bogus, so
         * overwrite them with the valid state from the thread struct.
         */
        if ((thr->ckpt_regs.msr & MSR_FP) == 0)
                memcpy(&thr->ckfp_state, &thr->fp_state,
                       sizeof(struct thread_fp_state));
        if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
                memcpy(&thr->ckvr_state, &thr->vr_state,
                       sizeof(struct thread_vr_state));

        /* ... and later recheckpoint everything unconditionally. */
        tm_recheckpoint(&new->thread, MSR_FP | MSR_VEC);

> A partial reload only makes sense if we can be sure that reclaim will
> have left the state at least (partially) correct - not with (as is the
> case today) one corrupted fp or Altivec reg.

Even in this case, something can happen before the trecheckpoint and
mess up the 'hot' registers that will be placed in the checkpoint area. I
am wondering if an interrupt could happen in the meantime that calls some
function that leaves the registers dirty, such as memcpy(). (Is that possible?)

> > Having a quick look at the code, I think there's an issue but we need something
> > more like this (completely untested).
> > 
> > When we recheckpoint inside an fp unavail, we need to recheckpoint vec if it was
> > enabled.  Currently we only ever recheckpoint the FP which seems like a bug. 
> > Vice versa for the other way around.
> > 
> 
> In your example, we don't need to reload VEC if we can trust that
> reclaim left the checkpointed regs on the CPU correctly - this patch
> achieves this.

Right, it is better to have this patch than nothing, but why not
recheckpoint everything? That would make problems much easier to debug,
since we would only need to inspect the ckfp/ckvr areas before the
recheckpoint. Today we need to inspect both the checkpointed area and the
'hot' live registers, which is tricky to do. :-)

> Of course I'm more than happy to reduce complexity and not have this
> optimisation at all but then we should remove the entire parameter to
> tm_recheckpoint(). Any in between feels dangerous.

In fact, Gustavo already created a patch that does so, but we didn't
submit it because we also felt it was dangerous to propose such a big
change. Our idea is to always reclaim/recheckpoint with FP|VEC on, and
once we are confident that nothing is broken, remove the second
parameter.

In order to do it, we have the following suggestion:

1) Fix the two major issues we have right now [1]. To do so, we would
start enabling VEC|FP on every recheckpoint. This would be the initial
step toward simplifying the whole code:

This is the proposed patch:
 https://github.com/leitao/linux/commit/80a73de53061237948bca76d97fbba18a0e2aba0

2) Force VEC|FP on treclaim

3) Simplify the whole trecheckpoint() and tm_reclaim() code, removing the MSR
argument, since FP|VEC would always be on (a rough sketch follows below).
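
For illustration only, here is a rough sketch of how the reclaim/recheckpoint
path could end up looking after step 3. It mirrors the snippet above and
assumes (hypothetically) that tm_reclaim() keeps only the cause argument and
tm_recheckpoint() takes only the thread struct; it is not the actual proposed
patch:

        /* Flush the live state and make ckpt_regs.msr valid. */
        giveup_all(container_of(thr, struct task_struct, thread));

        /* No MSR argument: always dump the full checkpointed state. */
        tm_reclaim(thr, cause);

        /* Fix up the ck* areas for facilities that were off in the
         * transaction, so nothing bogus is ever recheckpointed.
         */
        if ((thr->ckpt_regs.msr & MSR_FP) == 0)
                memcpy(&thr->ckfp_state, &thr->fp_state,
                       sizeof(struct thread_fp_state));
        if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
                memcpy(&thr->ckvr_state, &thr->vr_state,
                       sizeof(struct thread_vr_state));

        /* Later, no MSR argument here either: everything is restored
         * from the ck* areas ('new' being whichever task is resumed).
         */
        tm_recheckpoint(&new->thread);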

Thank you,
Breno

[1] https://github.com/leitao/linux/commit/747c6e7ce00c971c347ca0b9c5069a4ebd715ef6

> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index d4e545d27e..d1184264e2 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
> >          * If VMX is in use, the VRs now hold checkpointed values,
> >          * so we don't want to load the VRs from the thread_struct.
> >          */
> > -       tm_recheckpoint(&current->thread, MSR_FP);
> > +       tm_recheckpoint(&current->thread, regs->msr);
> >  
> >         /* If VMX is in use, get the transactional values back */
> >         if (regs->msr & MSR_VEC) {
> > @@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
> >                  regs->nip, regs->msr);
> >         tm_reclaim_current(TM_CAUSE_FAC_UNAV);
> >         regs->msr |= MSR_VEC;
> > -       tm_recheckpoint(&current->thread, MSR_VEC);
> > +       tm_recheckpoint(&current->thread, regs->msr);
> >         current->thread.used_vr = 1;
> >  
> >         if (regs->msr & MSR_FP) {
> > 
> > 
> > > Later, we recheckpoint trusting that the live state of FP and VEC are ok
> > > depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> > > means the FP registers checkpointed when we entered TM are correct and,
> > > after a treclaim, we can trust the FP live state. Similarly for VEC regs.
> > > However, if tm_reclaim() does not return a sane state then tm_recheckpoint()
> > > will recheckpoint a corrupted state from live state back to the checkpoint
> > > area.
> > 
> > 
> > 
> > 
> > > That commit fixes the corruption by restoring vs0 and vs32 from the
> > > ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> > > respectively.
> > > 
> > > The effect of the issue described above is observed, for instance, once a
> > > VSX unavailable exception is caught in the middle of a transaction with
> > > MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> > > space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> > > 
> > > The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> > > and ckvr_state are both copied from fp_state and vr_state, respectively,
> > > and on recheckpointing both states will be restored from these thread
> > > structures and not from the live state.
> > > 
> > > The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> > > implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> > > happen in the middle of the transactional block.
> > > 
> > > Finally, that commit also fixes the MSR used to check if FP and VEC bits
> > > are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
> > > if giveup_all() is called *before* using ckpt_regs.msr for checks because
> > > check_if_tm_restore_required() in giveup_all() will copy regs->msr to
> > > ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
> > > had when it came off the processor.
> > > 
> > > No regression was observed on powerpc/tm selftests after this fix.
> > > 
> > > Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> > > Signed-off-by: Breno Leitao <leitao@debian.org>
> > > ---
> > >  arch/powerpc/kernel/process.c |  9 +++++++--
> > >  arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
> > >  2 files changed, 21 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > index 2ad725e..ac1fc51 100644
> > > --- a/arch/powerpc/kernel/process.c
> > > +++ b/arch/powerpc/kernel/process.c
> > > @@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> > >  	if (!MSR_TM_SUSPENDED(mfmsr()))
> > >  		return;
> > >  
> > > +	/* Copy regs->msr to ckpt_regs.msr making the latter valid for
> > > +	 * the checks below. check_if_tm_restore_required() in
> > > +	 * giveup_all() will take care of it. Also update fp_state
> > > +	 * and vr_state from live state if the live state is valid.
> > > +	 */
> > > +	giveup_all(container_of(thr, struct task_struct, thread));
> > > +
> > >  	/*
> > >  	 * If we are in a transaction and FP is off then we can't have
> > >  	 * used FP inside that transaction. Hence the checkpointed
> > > @@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> > >  		memcpy(&thr->ckvr_state, &thr->vr_state,
> > >  		       sizeof(struct thread_vr_state));
> > >  
> > > -	giveup_all(container_of(thr, struct task_struct, thread));
> > > -
> > >  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
> > >  }
> > >  
> > > diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> > > index 3a2d041..5dfbddb 100644
> > > --- a/arch/powerpc/kernel/tm.S
> > > +++ b/arch/powerpc/kernel/tm.S
> > > @@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
> > >  
> > >  	addi	r7, r3, THREAD_CKVRSTATE
> > >  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> > > +
> > > +	/* Corrupting v0 (vs32). Should restore it later. */
> > >  	mfvscr	v0
> > >  	li	r6, VRSTATE_VSCR
> > >  	stvx	v0, r7, r6
> > > +
> > > +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> > > +	 * recheckpoint the wrong live value.
> > > +	 */
> > > +	LXVD2X_ROT(32, R0, R7)
> > > +
> > >  dont_backup_vec:
> > >  	mfspr	r0, SPRN_VRSAVE
> > >  	std	r0, THREAD_CKVRSAVE(r3)
> > > @@ -272,9 +280,15 @@ dont_backup_vec:
> > >  	addi	r7, r3, THREAD_CKFPSTATE
> > >  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
> > >  
> > > +	/* Corrupting fr0 (vs0). Should restore it later. */
> > >  	mffs    fr0
> > >  	stfd    fr0,FPSTATE_FPSCR(r7)
> > >  
> > > +	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
> > > +	 * recheckpoint the wrong live value.
> > > +	 */
> > > +	LXVD2X_ROT(0, R0, R7)
> > > +
> > >  dont_backup_fp:
> > >  
> > >  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim
  2017-07-04 20:45   ` [PATCH] " Gustavo Romero
  2017-07-05  1:02     ` Michael Neuling
@ 2022-03-11 16:27     ` Christophe Leroy
  1 sibling, 0 replies; 12+ messages in thread
From: Christophe Leroy @ 2022-03-11 16:27 UTC (permalink / raw)
  To: Gustavo Romero, linuxppc-dev, Cyril Bur; +Cc: Breno Leitao



On 04/07/2017 at 22:45, Gustavo Romero wrote:
> Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
> Later, we recheckpoint trusting that the live state of FP and VEC are ok
> depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> means the FP registers checkpointed when we entered TM are correct and,
> after a treclaim, we can trust the FP live state. Similarly for VEC regs.
> However, if tm_reclaim() does not return a sane state then tm_recheckpoint()
> will recheckpoint a corrupted state from live state back to the checkpoint
> area.
> 
> That commit fixes the corruption by restoring vs0 and vs32 from the
> ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> respectively.
> 
> The effect of the issue described above is observed, for instance, once a
> VSX unavailable exception is caught in the middle of a transaction with
> MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> 
> The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> and ckvr_state are both copied from fp_state and vr_state, respectively,
> and on recheckpointing both states will be restored from these thread
> structures and not from the live state.
> 
> The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> happen in the middle of the transactional block.
> 
> Finally, that commit also fixes the MSR used to check if FP and VEC bits
> are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
> if giveup_all() is called *before* using ckpt_regs.msr for checks because
> check_if_tm_restore_required() in giveup_all() will copy regs->msr to
> ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
> had when it came off the processor.
> 
> No regression was observed on powerpc/tm selftests after this fix.
> 
> Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

This patch is still flagged as "New" in patchwork 
(https://patchwork.ozlabs.org/project/linuxppc-dev/patch/1499201115-22967-1-git-send-email-gromero@linux.vnet.ibm.com/)

I don't know why, but the discussion that happened on this patch doesn't 
appear in patchwork.

I see two commits touching the same area of code in the following 
months of the same year:

eb5c3f1c8647 ("powerpc: Always save/restore checkpointed regs during treclaim/trecheckpoint")
91381b9cb1c3 ("powerpc: Force reload for recheckpoint during tm {fp, vec, vsx} unavailable exception")

So I'll mark this patch as "changes requested". Please raise your hand if 
I'm wrong.

Christophe

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-03-11 16:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-30  0:44 [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Gustavo Romero
2017-06-30  0:44 ` [PATCH 2/2] powerpc/tm: test for regs sanity in VSX exception Gustavo Romero
2017-07-04  0:49   ` Cyril Bur
2017-06-30 16:41 ` [PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim Breno Leitao
2017-07-04  0:37   ` Cyril Bur
2017-07-04  0:19 ` Cyril Bur
2017-07-04 20:45   ` [PATCH] " Gustavo Romero
2017-07-05  1:02     ` Michael Neuling
2017-07-05 20:57       ` Breno Leitao
2017-10-26  4:57       ` Cyril Bur
2017-10-26 17:31         ` Breno Leitao
2022-03-11 16:27     ` Christophe Leroy
